Real-time Serving¶

These dataclasses are used in the SDK to represent API requests and responses for services in the databricks.sdk.service.serving module.

class databricks.sdk.service.serving.Ai21LabsConfig¶

ai21labs_api_key: str¶: The Databricks secret key reference for an AI21Labs API key.

as_dict() → dict¶: Serializes the Ai21LabsConfig into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) → Ai21LabsConfig¶: Deserializes the Ai21LabsConfig from a dictionary.

class databricks.sdk.service.serving.AmazonBedrockConfig¶

aws_region: str¶: The AWS region to use. Bedrock has to be enabled there.

aws_access_key_id: str¶: The Databricks secret key reference for an AWS Access Key ID with permissions to interact with Bedrock services.

aws_secret_access_key: str¶: The Databricks secret key reference for an AWS Secret Access Key paired with the access key ID, with permissions to interact with Bedrock services.

bedrock_provider: AmazonBedrockConfigBedrockProvider¶: The underlying provider in Amazon Bedrock. Supported values (case insensitive) include: Anthropic, Cohere, AI21Labs, Amazon.

as_dict() → dict¶: Serializes the AmazonBedrockConfig into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) → AmazonBedrockConfig¶: Deserializes the AmazonBedrockConfig from a dictionary.

class databricks.sdk.service.serving.AmazonBedrockConfigBedrockProvider¶

The underlying provider in Amazon Bedrock. Supported values (case insensitive) include: Anthropic, Cohere, AI21Labs, Amazon.

AI21LABS = "AI21LABS"¶

AMAZON = "AMAZON"¶

ANTHROPIC = "ANTHROPIC"¶

COHERE = "COHERE"¶

class databricks.sdk.service.serving.AnthropicConfig¶

anthropic_api_key: str¶: The Databricks secret key reference for an Anthropic API key.

as_dict() → dict¶: Serializes the AnthropicConfig into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) → AnthropicConfig¶: Deserializes the AnthropicConfig from a dictionary.

class databricks.sdk.service.serving.App¶

name: str¶: The name of the app. The name must contain only lowercase alphanumeric characters and hyphens and be between 2 and 30 characters long. It must be unique within the workspace.

active_deployment: AppDeployment | None = None¶: The active deployment of the app.

create_time: str | None = None¶: The creation time of the app. Formatted timestamp in ISO 6801.

creator: str | None = None¶: The email of the user that created the app.

description: str | None = None¶: The description of the app.

pending_deployment: AppDeployment | None = None¶: The pending deployment of the app.

status: AppStatus | None = None¶

update_time: str | None = None¶: The update time of the app. Formatted timestamp in ISO 6801.

updater: str | None = None¶: The email of the user that last updated the app.

url: str | None = None¶: The URL of the app once it is deployed.

as_dict() → dict¶: Serializes the App into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) → App¶: Deserializes the App from a dictionary.

class databricks.sdk.service.serving.AppDeployment¶

source_code_path: str¶: The source code path of the deployment.

create_time: str | None = None¶: The creation time of the deployment. Formatted timestamp in ISO 6801.

creator: str | None = None¶: The email of the user creates the deployment.

deployment_id: str | None = None¶: The unique id of the deployment.

status: AppDeploymentStatus | None = None¶: Status and status message of the deployment

update_time: str | None = None¶: The update time of the deployment. Formatted timestamp in ISO 6801.

as_dict() → dict¶: Serializes the AppDeployment into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) → AppDeployment¶: Deserializes the AppDeployment from a dictionary.

class databricks.sdk.service.serving.AppDeploymentState¶

CANCELLED = "CANCELLED"¶

FAILED = "FAILED"¶

IN_PROGRESS = "IN_PROGRESS"¶

STATE_UNSPECIFIED = "STATE_UNSPECIFIED"¶

SUCCEEDED = "SUCCEEDED"¶

class databricks.sdk.service.serving.AppDeploymentStatus¶

message: str | None = None¶: Message corresponding with the deployment state.

state: AppDeploymentState | None = None¶: State of the deployment.

as_dict() → dict¶: Serializes the AppDeploymentStatus into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) → AppDeploymentStatus¶: Deserializes the AppDeploymentStatus from a dictionary.

class databricks.sdk.service.serving.AppEnvironment¶

env: List[EnvVariable] | None = None¶

as_dict() → dict¶: Serializes the AppEnvironment into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) → AppEnvironment¶: Deserializes the AppEnvironment from a dictionary.

class databricks.sdk.service.serving.AppState¶

CREATING = "CREATING"¶

DELETED = "DELETED"¶

DELETING = "DELETING"¶

DEPLOYED = "DEPLOYED"¶

DEPLOYING = "DEPLOYING"¶

ERROR = "ERROR"¶

IDLE = "IDLE"¶

READY = "READY"¶

RUNNING = "RUNNING"¶

STARTING = "STARTING"¶

STATE_UNSPECIFIED = "STATE_UNSPECIFIED"¶

UPDATING = "UPDATING"¶

class databricks.sdk.service.serving.AppStatus¶

message: str | None = None¶: Message corresponding with the app state.

state: AppState | None = None¶: State of the app.

as_dict() → dict¶: Serializes the AppStatus into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) → AppStatus¶: Deserializes the AppStatus from a dictionary.

class databricks.sdk.service.serving.AutoCaptureConfigInput¶

catalog_name: str | None = None¶: The name of the catalog in Unity Catalog. NOTE: On update, you cannot change the catalog name if it was already set.

enabled: bool | None = None¶: If inference tables are enabled or not. NOTE: If you have already disabled payload logging once, you cannot enable again.

schema_name: str | None = None¶: The name of the schema in Unity Catalog. NOTE: On update, you cannot change the schema name if it was already set.

table_name_prefix: str | None = None¶: The prefix of the table in Unity Catalog. NOTE: On update, you cannot change the prefix name if it was already set.

as_dict() → dict¶: Serializes the AutoCaptureConfigInput into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) → AutoCaptureConfigInput¶: Deserializes the AutoCaptureConfigInput from a dictionary.

class databricks.sdk.service.serving.AutoCaptureConfigOutput¶

catalog_name: str | None = None¶: The name of the catalog in Unity Catalog.

enabled: bool | None = None¶: If inference tables are enabled or not.

schema_name: str | None = None¶: The name of the schema in Unity Catalog.

state: AutoCaptureState | None = None¶

table_name_prefix: str | None = None¶: The prefix of the table in Unity Catalog.

as_dict() → dict¶: Serializes the AutoCaptureConfigOutput into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) → AutoCaptureConfigOutput¶: Deserializes the AutoCaptureConfigOutput from a dictionary.

class databricks.sdk.service.serving.AutoCaptureState¶

payload_table: PayloadTable | None = None¶

as_dict() → dict¶: Serializes the AutoCaptureState into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) → AutoCaptureState¶: Deserializes the AutoCaptureState from a dictionary.

class databricks.sdk.service.serving.BuildLogsResponse¶

logs: str¶: The logs associated with building the served entity’s environment.

as_dict() → dict¶: Serializes the BuildLogsResponse into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) → BuildLogsResponse¶: Deserializes the BuildLogsResponse from a dictionary.

class databricks.sdk.service.serving.ChatMessage¶

content: str | None = None¶: The content of the message.

role: ChatMessageRole | None = None¶: The role of the message. One of [system, user, assistant].

as_dict() → dict¶: Serializes the ChatMessage into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) → ChatMessage¶: Deserializes the ChatMessage from a dictionary.

class databricks.sdk.service.serving.ChatMessageRole¶

The role of the message. One of [system, user, assistant].

ASSISTANT = "ASSISTANT"¶

SYSTEM = "SYSTEM"¶

USER = "USER"¶

class databricks.sdk.service.serving.CohereConfig¶

cohere_api_key: str¶: The Databricks secret key reference for a Cohere API key.

as_dict() → dict¶: Serializes the CohereConfig into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) → CohereConfig¶: Deserializes the CohereConfig from a dictionary.

class databricks.sdk.service.serving.CreateAppDeploymentRequest¶

source_code_path: str¶: The source code path of the deployment.

app_name: str | None = None¶: The name of the app.

as_dict() → dict¶: Serializes the CreateAppDeploymentRequest into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) → CreateAppDeploymentRequest¶: Deserializes the CreateAppDeploymentRequest from a dictionary.

class databricks.sdk.service.serving.CreateAppRequest¶

name: str¶: The name of the app. The name must contain only lowercase alphanumeric characters and hyphens and be between 2 and 30 characters long. It must be unique within the workspace.

description: str | None = None¶: The description of the app.

as_dict() → dict¶: Serializes the CreateAppRequest into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) → CreateAppRequest¶: Deserializes the CreateAppRequest from a dictionary.

class databricks.sdk.service.serving.CreateServingEndpoint¶

name: str¶: The name of the serving endpoint. This field is required and must be unique across a Databricks workspace. An endpoint name can consist of alphanumeric characters, dashes, and underscores.

config: EndpointCoreConfigInput¶: The core config of the serving endpoint.

rate_limits: List[RateLimit] | None = None¶: Rate limits to be applied to the serving endpoint. NOTE: only external and foundation model endpoints are supported as of now.

tags: List[EndpointTag] | None = None¶: Tags to be attached to the serving endpoint and automatically propagated to billing logs.

as_dict() → dict¶: Serializes the CreateServingEndpoint into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) → CreateServingEndpoint¶: Deserializes the CreateServingEndpoint from a dictionary.

class databricks.sdk.service.serving.DatabricksModelServingConfig¶

databricks_api_token: str¶: The Databricks secret key reference for a Databricks API token that corresponds to a user or service principal with Can Query access to the model serving endpoint pointed to by this external model.

databricks_workspace_url: str¶: The URL of the Databricks workspace containing the model serving endpoint pointed to by this external model.

as_dict() → dict¶: Serializes the DatabricksModelServingConfig into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) → DatabricksModelServingConfig¶: Deserializes the DatabricksModelServingConfig from a dictionary.

class databricks.sdk.service.serving.DataframeSplitInput¶

columns: List[Any] | None = None¶

data: List[Any] | None = None¶

index: List[int] | None = None¶

as_dict() → dict¶: Serializes the DataframeSplitInput into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) → DataframeSplitInput¶: Deserializes the DataframeSplitInput from a dictionary.

class databricks.sdk.service.serving.DeleteResponse¶

as_dict() → dict¶: Serializes the DeleteResponse into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) → DeleteResponse¶: Deserializes the DeleteResponse from a dictionary.

class databricks.sdk.service.serving.EmbeddingsV1ResponseEmbeddingElement¶

embedding: List[float] | None = None¶

index: int | None = None¶: The index of the embedding in the response.

object: EmbeddingsV1ResponseEmbeddingElementObject | None = None¶: This will always be ‘embedding’.

as_dict() → dict¶: Serializes the EmbeddingsV1ResponseEmbeddingElement into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) → EmbeddingsV1ResponseEmbeddingElement¶: Deserializes the EmbeddingsV1ResponseEmbeddingElement from a dictionary.

class databricks.sdk.service.serving.EmbeddingsV1ResponseEmbeddingElementObject¶

This will always be ‘embedding’.

EMBEDDING = "EMBEDDING"¶

class databricks.sdk.service.serving.EndpointCoreConfigInput¶

auto_capture_config: AutoCaptureConfigInput | None = None¶: Configuration for Inference Tables which automatically logs requests and responses to Unity Catalog.

name: str | None = None¶: The name of the serving endpoint to update. This field is required.

served_entities: List[ServedEntityInput] | None = None¶: A list of served entities for the endpoint to serve. A serving endpoint can have up to 15 served entities.

served_models: List[ServedModelInput] | None = None¶: (Deprecated, use served_entities instead) A list of served models for the endpoint to serve. A serving endpoint can have up to 15 served models.

traffic_config: TrafficConfig | None = None¶: The traffic config defining how invocations to the serving endpoint should be routed.

as_dict() → dict¶: Serializes the EndpointCoreConfigInput into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) → EndpointCoreConfigInput¶: Deserializes the EndpointCoreConfigInput from a dictionary.

class databricks.sdk.service.serving.EndpointCoreConfigOutput¶

auto_capture_config: AutoCaptureConfigOutput | None = None¶: Configuration for Inference Tables which automatically logs requests and responses to Unity Catalog.

config_version: int | None = None¶: The config version that the serving endpoint is currently serving.

served_entities: List[ServedEntityOutput] | None = None¶: The list of served entities under the serving endpoint config.

served_models: List[ServedModelOutput] | None = None¶: (Deprecated, use served_entities instead) The list of served models under the serving endpoint config.

traffic_config: TrafficConfig | None = None¶: The traffic configuration associated with the serving endpoint config.

as_dict() → dict¶: Serializes the EndpointCoreConfigOutput into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) → EndpointCoreConfigOutput¶: Deserializes the EndpointCoreConfigOutput from a dictionary.

class databricks.sdk.service.serving.EndpointCoreConfigSummary¶

served_entities: List[ServedEntitySpec] | None = None¶: The list of served entities under the serving endpoint config.

served_models: List[ServedModelSpec] | None = None¶: (Deprecated, use served_entities instead) The list of served models under the serving endpoint config.

as_dict() → dict¶: Serializes the EndpointCoreConfigSummary into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) → EndpointCoreConfigSummary¶: Deserializes the EndpointCoreConfigSummary from a dictionary.

class databricks.sdk.service.serving.EndpointPendingConfig¶

auto_capture_config: AutoCaptureConfigOutput | None = None¶: Configuration for Inference Tables which automatically logs requests and responses to Unity Catalog.

config_version: int | None = None¶: The config version that the serving endpoint is currently serving.

served_entities: List[ServedEntityOutput] | None = None¶: The list of served entities belonging to the last issued update to the serving endpoint.

served_models: List[ServedModelOutput] | None = None¶: (Deprecated, use served_entities instead) The list of served models belonging to the last issued update to the serving endpoint.

start_time: int | None = None¶: The timestamp when the update to the pending config started.

traffic_config: TrafficConfig | None = None¶: The traffic config defining how invocations to the serving endpoint should be routed.

as_dict() → dict¶: Serializes the EndpointPendingConfig into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) → EndpointPendingConfig¶: Deserializes the EndpointPendingConfig from a dictionary.

class databricks.sdk.service.serving.EndpointState¶

config_update: EndpointStateConfigUpdate | None = None¶: The state of an endpoint’s config update. This informs the user if the pending_config is in progress, if the update failed, or if there is no update in progress. Note that if the endpoint’s config_update state value is IN_PROGRESS, another update can not be made until the update completes or fails.

ready: EndpointStateReady | None = None¶: The state of an endpoint, indicating whether or not the endpoint is queryable. An endpoint is READY if all of the served entities in its active configuration are ready. If any of the actively served entities are in a non-ready state, the endpoint state will be NOT_READY.

as_dict() → dict¶: Serializes the EndpointState into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) → EndpointState¶: Deserializes the EndpointState from a dictionary.

class databricks.sdk.service.serving.EndpointStateConfigUpdate¶

The state of an endpoint’s config update. This informs the user if the pending_config is in progress, if the update failed, or if there is no update in progress. Note that if the endpoint’s config_update state value is IN_PROGRESS, another update can not be made until the update completes or fails.

IN_PROGRESS = "IN_PROGRESS"¶

NOT_UPDATING = "NOT_UPDATING"¶

UPDATE_FAILED = "UPDATE_FAILED"¶

class databricks.sdk.service.serving.EndpointStateReady¶

The state of an endpoint, indicating whether or not the endpoint is queryable. An endpoint is READY if all of the served entities in its active configuration are ready. If any of the actively served entities are in a non-ready state, the endpoint state will be NOT_READY.

NOT_READY = "NOT_READY"¶

READY = "READY"¶

class databricks.sdk.service.serving.EndpointTag¶

key: str¶: Key field for a serving endpoint tag.

value: str | None = None¶: Optional value field for a serving endpoint tag.

as_dict() → dict¶: Serializes the EndpointTag into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) → EndpointTag¶: Deserializes the EndpointTag from a dictionary.

class databricks.sdk.service.serving.EnvVariable¶

name: str | None = None¶

value: str | None = None¶

value_from: str | None = None¶

as_dict() → dict¶: Serializes the EnvVariable into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) → EnvVariable¶: Deserializes the EnvVariable from a dictionary.

class databricks.sdk.service.serving.ExportMetricsResponse¶

as_dict() → dict¶: Serializes the ExportMetricsResponse into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) → ExportMetricsResponse¶: Deserializes the ExportMetricsResponse from a dictionary.

class databricks.sdk.service.serving.ExternalModel¶

provider: ExternalModelProvider¶: The name of the provider for the external model. Currently, the supported providers are ‘ai21labs’, ‘anthropic’, ‘amazon-bedrock’, ‘cohere’, ‘databricks-model-serving’, ‘openai’, and ‘palm’.”,

name: str¶: The name of the external model.

task: str¶: The task type of the external model.

ai21labs_config: Ai21LabsConfig | None = None¶: AI21Labs Config. Only required if the provider is ‘ai21labs’.

amazon_bedrock_config: AmazonBedrockConfig | None = None¶: Amazon Bedrock Config. Only required if the provider is ‘amazon-bedrock’.

anthropic_config: AnthropicConfig | None = None¶: Anthropic Config. Only required if the provider is ‘anthropic’.

cohere_config: CohereConfig | None = None¶: Cohere Config. Only required if the provider is ‘cohere’.

databricks_model_serving_config: DatabricksModelServingConfig | None = None¶: Databricks Model Serving Config. Only required if the provider is ‘databricks-model-serving’.

openai_config: OpenAiConfig | None = None¶: OpenAI Config. Only required if the provider is ‘openai’.

palm_config: PaLmConfig | None = None¶: PaLM Config. Only required if the provider is ‘palm’.

as_dict() → dict¶: Serializes the ExternalModel into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) → ExternalModel¶: Deserializes the ExternalModel from a dictionary.

class databricks.sdk.service.serving.ExternalModelProvider¶

The name of the provider for the external model. Currently, the supported providers are ‘ai21labs’, ‘anthropic’, ‘amazon-bedrock’, ‘cohere’, ‘databricks-model-serving’, ‘openai’, and ‘palm’.”,

AI21LABS = "AI21LABS"¶

AMAZON_BEDROCK = "AMAZON_BEDROCK"¶

ANTHROPIC = "ANTHROPIC"¶

COHERE = "COHERE"¶

DATABRICKS_MODEL_SERVING = "DATABRICKS_MODEL_SERVING"¶

OPENAI = "OPENAI"¶

PALM = "PALM"¶

class databricks.sdk.service.serving.ExternalModelUsageElement¶

completion_tokens: int | None = None¶: The number of tokens in the chat/completions response.

prompt_tokens: int | None = None¶: The number of tokens in the prompt.

total_tokens: int | None = None¶: The total number of tokens in the prompt and response.

as_dict() → dict¶: Serializes the ExternalModelUsageElement into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) → ExternalModelUsageElement¶: Deserializes the ExternalModelUsageElement from a dictionary.

class databricks.sdk.service.serving.FoundationModel¶

description: str | None = None¶: The description of the foundation model.

display_name: str | None = None¶: The display name of the foundation model.

docs: str | None = None¶: The URL to the documentation of the foundation model.

name: str | None = None¶: The name of the foundation model.

as_dict() → dict¶: Serializes the FoundationModel into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) → FoundationModel¶: Deserializes the FoundationModel from a dictionary.

class databricks.sdk.service.serving.GetOpenApiResponse¶

The response is an OpenAPI spec in JSON format that typically includes fields like openapi, info, servers and paths, etc.

as_dict() → dict¶: Serializes the GetOpenApiResponse into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) → GetOpenApiResponse¶: Deserializes the GetOpenApiResponse from a dictionary.

class databricks.sdk.service.serving.GetServingEndpointPermissionLevelsResponse¶

permission_levels: List[ServingEndpointPermissionsDescription] | None = None¶: Specific permission levels

as_dict() → dict¶: Serializes the GetServingEndpointPermissionLevelsResponse into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) → GetServingEndpointPermissionLevelsResponse¶: Deserializes the GetServingEndpointPermissionLevelsResponse from a dictionary.

class databricks.sdk.service.serving.ListAppDeploymentsResponse¶

app_deployments: List[AppDeployment] | None = None¶: Deployment history of the app.

next_page_token: str | None = None¶: Pagination token to request the next page of apps.

as_dict() → dict¶: Serializes the ListAppDeploymentsResponse into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) → ListAppDeploymentsResponse¶: Deserializes the ListAppDeploymentsResponse from a dictionary.

class databricks.sdk.service.serving.ListAppsResponse¶

apps: List[App] | None = None¶

next_page_token: str | None = None¶: Pagination token to request the next page of apps.

as_dict() → dict¶: Serializes the ListAppsResponse into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) → ListAppsResponse¶: Deserializes the ListAppsResponse from a dictionary.

class databricks.sdk.service.serving.ListEndpointsResponse¶

endpoints: List[ServingEndpoint] | None = None¶: The list of endpoints.

as_dict() → dict¶: Serializes the ListEndpointsResponse into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) → ListEndpointsResponse¶: Deserializes the ListEndpointsResponse from a dictionary.

class databricks.sdk.service.serving.OpenAiConfig¶

openai_api_key: str¶: The Databricks secret key reference for an OpenAI or Azure OpenAI API key.

openai_api_base: str | None = None¶: This is the base URL for the OpenAI API (default: “https://api.openai.com/v1”). For Azure OpenAI, this field is required, and is the base URL for the Azure OpenAI API service provided by Azure.

openai_api_type: str | None = None¶: This is an optional field to specify the type of OpenAI API to use. For Azure OpenAI, this field is required, and adjust this parameter to represent the preferred security access validation protocol. For access token validation, use azure. For authentication using Azure Active Directory (Azure AD) use, azuread.

openai_api_version: str | None = None¶: This is an optional field to specify the OpenAI API version. For Azure OpenAI, this field is required, and is the version of the Azure OpenAI service to utilize, specified by a date.

openai_deployment_name: str | None = None¶: This field is only required for Azure OpenAI and is the name of the deployment resource for the Azure OpenAI service.

openai_organization: str | None = None¶: This is an optional field to specify the organization in OpenAI or Azure OpenAI.

as_dict() → dict¶: Serializes the OpenAiConfig into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) → OpenAiConfig¶: Deserializes the OpenAiConfig from a dictionary.

class databricks.sdk.service.serving.PaLmConfig¶

palm_api_key: str¶: The Databricks secret key reference for a PaLM API key.

as_dict() → dict¶: Serializes the PaLmConfig into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) → PaLmConfig¶: Deserializes the PaLmConfig from a dictionary.

class databricks.sdk.service.serving.PatchServingEndpointTags¶

add_tags: List[EndpointTag] | None = None¶: List of endpoint tags to add

delete_tags: List[str] | None = None¶: List of tag keys to delete

name: str | None = None¶: The name of the serving endpoint who’s tags to patch. This field is required.

as_dict() → dict¶: Serializes the PatchServingEndpointTags into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) → PatchServingEndpointTags¶: Deserializes the PatchServingEndpointTags from a dictionary.

class databricks.sdk.service.serving.PayloadTable¶

name: str | None = None¶: The name of the payload table.

status: str | None = None¶: The status of the payload table.

status_message: str | None = None¶: The status message of the payload table.

as_dict() → dict¶: Serializes the PayloadTable into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) → PayloadTable¶: Deserializes the PayloadTable from a dictionary.

class databricks.sdk.service.serving.PutResponse¶

rate_limits: List[RateLimit] | None = None¶: The list of endpoint rate limits.

as_dict() → dict¶: Serializes the PutResponse into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) → PutResponse¶: Deserializes the PutResponse from a dictionary.

class databricks.sdk.service.serving.QueryEndpointInput¶

dataframe_records: List[Any] | None = None¶: Pandas Dataframe input in the records orientation.

dataframe_split: DataframeSplitInput | None = None¶: Pandas Dataframe input in the split orientation.

extra_params: Dict[str, str] | None = None¶: The extra parameters field used ONLY for __completions, chat,__ and __embeddings external & foundation model__ serving endpoints. This is a map of strings and should only be used with other external/foundation model query fields.

input: Any | None = None¶: The input string (or array of strings) field used ONLY for __embeddings external & foundation model__ serving endpoints and is the only field (along with extra_params if needed) used by embeddings queries.

inputs: Any | None = None¶: Tensor-based input in columnar format.

instances: List[Any] | None = None¶: Tensor-based input in row format.

max_tokens: int | None = None¶: The max tokens field used ONLY for __completions__ and __chat external & foundation model__ serving endpoints. This is an integer and should only be used with other chat/completions query fields.

messages: List[ChatMessage] | None = None¶: The messages field used ONLY for __chat external & foundation model__ serving endpoints. This is a map of strings and should only be used with other chat query fields.

n: int | None = None¶: The n (number of candidates) field used ONLY for __completions__ and __chat external & foundation model__ serving endpoints. This is an integer between 1 and 5 with a default of 1 and should only be used with other chat/completions query fields.

name: str | None = None¶: The name of the serving endpoint. This field is required.

prompt: Any | None = None¶: The prompt string (or array of strings) field used ONLY for __completions external & foundation model__ serving endpoints and should only be used with other completions query fields.

stop: List[str] | None = None¶: The stop sequences field used ONLY for __completions__ and __chat external & foundation model__ serving endpoints. This is a list of strings and should only be used with other chat/completions query fields.

stream: bool | None = None¶: The stream field used ONLY for __completions__ and __chat external & foundation model__ serving endpoints. This is a boolean defaulting to false and should only be used with other chat/completions query fields.

temperature: float | None = None¶: The temperature field used ONLY for __completions__ and __chat external & foundation model__ serving endpoints. This is a float between 0.0 and 2.0 with a default of 1.0 and should only be used with other chat/completions query fields.

as_dict() → dict¶: Serializes the QueryEndpointInput into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) → QueryEndpointInput¶: Deserializes the QueryEndpointInput from a dictionary.

class databricks.sdk.service.serving.QueryEndpointResponse¶

choices: List[V1ResponseChoiceElement] | None = None¶: The list of choices returned by the __chat or completions external/foundation model__ serving endpoint.

created: int | None = None¶: The timestamp in seconds when the query was created in Unix time returned by a __completions or chat external/foundation model__ serving endpoint.

data: List[EmbeddingsV1ResponseEmbeddingElement] | None = None¶: The list of the embeddings returned by the __embeddings external/foundation model__ serving endpoint.

id: str | None = None¶: The ID of the query that may be returned by a __completions or chat external/foundation model__ serving endpoint.

model: str | None = None¶: The name of the __external/foundation model__ used for querying. This is the name of the model that was specified in the endpoint config.

object: QueryEndpointResponseObject | None = None¶: The type of object returned by the __external/foundation model__ serving endpoint, one of [text_completion, chat.completion, list (of embeddings)].

predictions: List[Any] | None = None¶: The predictions returned by the serving endpoint.

served_model_name: str | None = None¶: The name of the served model that served the request. This is useful when there are multiple models behind the same endpoint with traffic split.

usage: ExternalModelUsageElement | None = None¶: The usage object that may be returned by the __external/foundation model__ serving endpoint. This contains information about the number of tokens used in the prompt and response.

as_dict() → dict¶: Serializes the QueryEndpointResponse into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) → QueryEndpointResponse¶: Deserializes the QueryEndpointResponse from a dictionary.

class databricks.sdk.service.serving.QueryEndpointResponseObject¶

The type of object returned by the __external/foundation model__ serving endpoint, one of [text_completion, chat.completion, list (of embeddings)].

CHAT_COMPLETION = "CHAT_COMPLETION"¶

LIST = "LIST"¶

TEXT_COMPLETION = "TEXT_COMPLETION"¶

class databricks.sdk.service.serving.RateLimit¶

calls: int¶: Used to specify how many calls are allowed for a key within the renewal_period.

renewal_period: RateLimitRenewalPeriod¶: Renewal period field for a serving endpoint rate limit. Currently, only ‘minute’ is supported.

key: RateLimitKey | None = None¶: Key field for a serving endpoint rate limit. Currently, only ‘user’ and ‘endpoint’ are supported, with ‘endpoint’ being the default if not specified.

as_dict() → dict¶: Serializes the RateLimit into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) → RateLimit¶: Deserializes the RateLimit from a dictionary.

class databricks.sdk.service.serving.RateLimitKey¶

Key field for a serving endpoint rate limit. Currently, only ‘user’ and ‘endpoint’ are supported, with ‘endpoint’ being the default if not specified.

ENDPOINT = "ENDPOINT"¶

USER = "USER"¶

class databricks.sdk.service.serving.RateLimitRenewalPeriod¶

Renewal period field for a serving endpoint rate limit. Currently, only ‘minute’ is supported.

MINUTE = "MINUTE"¶

class databricks.sdk.service.serving.Route¶

served_model_name: str¶: The name of the served model this route configures traffic for.

traffic_percentage: int¶: The percentage of endpoint traffic to send to this route. It must be an integer between 0 and 100 inclusive.

as_dict() → dict¶: Serializes the Route into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) → Route¶: Deserializes the Route from a dictionary.

class databricks.sdk.service.serving.ServedEntityInput¶

entity_name: str | None = None¶: The name of the entity to be served. The entity may be a model in the Databricks Model Registry, a model in the Unity Catalog (UC), or a function of type FEATURE_SPEC in the UC. If it is a UC object, the full name of the object should be given in the form of __catalog_name__.__schema_name__.__model_name__.

entity_version: str | None = None¶: The version of the model in Databricks Model Registry to be served or empty if the entity is a FEATURE_SPEC.

environment_vars: Dict[str, str] | None = None¶: An object containing a set of optional, user-specified environment variable key-value pairs used for serving this entity. Note: this is an experimental feature and subject to change. Example entity environment variables that refer to Databricks secrets: {“OPENAI_API_KEY”: “{{secrets/my_scope/my_key}}”, “DATABRICKS_TOKEN”: “{{secrets/my_scope2/my_key2}}”}

external_model: ExternalModel | None = None¶: The external model to be served. NOTE: Only one of external_model and (entity_name, entity_version, workload_size, workload_type, and scale_to_zero_enabled) can be specified with the latter set being used for custom model serving for a Databricks registered model. When an external_model is present, the served entities list can only have one served_entity object. For an existing endpoint with external_model, it can not be updated to an endpoint without external_model. If the endpoint is created without external_model, users cannot update it to add external_model later.

instance_profile_arn: str | None = None¶: ARN of the instance profile that the served entity uses to access AWS resources.

max_provisioned_throughput: int | None = None¶: The maximum tokens per second that the endpoint can scale up to.

min_provisioned_throughput: int | None = None¶: The minimum tokens per second that the endpoint can scale down to.

name: str | None = None¶: The name of a served entity. It must be unique across an endpoint. A served entity name can consist of alphanumeric characters, dashes, and underscores. If not specified for an external model, this field defaults to external_model.name, with ‘.’ and ‘:’ replaced with ‘-’, and if not specified for other entities, it defaults to <entity-name>-<entity-version>.

scale_to_zero_enabled: bool | None = None¶: Whether the compute resources for the served entity should scale down to zero.

workload_size: str | None = None¶: The workload size of the served entity. The workload size corresponds to a range of provisioned concurrency that the compute autoscales between. A single unit of provisioned concurrency can process one request at a time. Valid workload sizes are “Small” (4 - 4 provisioned concurrency), “Medium” (8 - 16 provisioned concurrency), and “Large” (16 - 64 provisioned concurrency). If scale-to-zero is enabled, the lower bound of the provisioned concurrency for each workload size is 0.

workload_type: str | None = None¶

The workload type of the served entity. The workload type selects which type of compute to use in the endpoint. The default value for this parameter is “CPU”. For deep learning workloads, GPU acceleration is available by selecting workload types like GPU_SMALL and others. See the available [GPU types].

[GPU types]: https://docs.databricks.com/machine-learning/model-serving/create-manage-serving-endpoints.html#gpu-workload-types

as_dict() → dict¶: Serializes the ServedEntityInput into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) → ServedEntityInput¶: Deserializes the ServedEntityInput from a dictionary.

class databricks.sdk.service.serving.ServedEntityOutput¶

creation_timestamp: int | None = None¶: The creation timestamp of the served entity in Unix time.

creator: str | None = None¶: The email of the user who created the served entity.

entity_name: str | None = None¶: The name of the entity served. The entity may be a model in the Databricks Model Registry, a model in the Unity Catalog (UC), or a function of type FEATURE_SPEC in the UC. If it is a UC object, the full name of the object is given in the form of __catalog_name__.__schema_name__.__model_name__.

entity_version: str | None = None¶: The version of the served entity in Databricks Model Registry or empty if the entity is a FEATURE_SPEC.

environment_vars: Dict[str, str] | None = None¶: An object containing a set of optional, user-specified environment variable key-value pairs used for serving this entity. Note: this is an experimental feature and subject to change. Example entity environment variables that refer to Databricks secrets: {“OPENAI_API_KEY”: “{{secrets/my_scope/my_key}}”, “DATABRICKS_TOKEN”: “{{secrets/my_scope2/my_key2}}”}

external_model: ExternalModel | None = None¶: The external model that is served. NOTE: Only one of external_model, foundation_model, and (entity_name, entity_version, workload_size, workload_type, and scale_to_zero_enabled) is returned based on the endpoint type.

foundation_model: FoundationModel | None = None¶: The foundation model that is served. NOTE: Only one of foundation_model, external_model, and (entity_name, entity_version, workload_size, workload_type, and scale_to_zero_enabled) is returned based on the endpoint type.

instance_profile_arn: str | None = None¶: ARN of the instance profile that the served entity uses to access AWS resources.

max_provisioned_throughput: int | None = None¶: The maximum tokens per second that the endpoint can scale up to.

min_provisioned_throughput: int | None = None¶: The minimum tokens per second that the endpoint can scale down to.

name: str | None = None¶: The name of the served entity.

scale_to_zero_enabled: bool | None = None¶: Whether the compute resources for the served entity should scale down to zero.

state: ServedModelState | None = None¶: Information corresponding to the state of the served entity.

workload_size: str | None = None¶: The workload size of the served entity. The workload size corresponds to a range of provisioned concurrency that the compute autoscales between. A single unit of provisioned concurrency can process one request at a time. Valid workload sizes are “Small” (4 - 4 provisioned concurrency), “Medium” (8 - 16 provisioned concurrency), and “Large” (16 - 64 provisioned concurrency). If scale-to-zero is enabled, the lower bound of the provisioned concurrency for each workload size will be 0.

workload_type: str | None = None¶

The workload type of the served entity. The workload type selects which type of compute to use in the endpoint. The default value for this parameter is “CPU”. For deep learning workloads, GPU acceleration is available by selecting workload types like GPU_SMALL and others. See the available [GPU types].

[GPU types]: https://docs.databricks.com/machine-learning/model-serving/create-manage-serving-endpoints.html#gpu-workload-types

as_dict() → dict¶: Serializes the ServedEntityOutput into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) → ServedEntityOutput¶: Deserializes the ServedEntityOutput from a dictionary.

class databricks.sdk.service.serving.ServedEntitySpec¶

entity_name: str | None = None¶: The name of the entity served. The entity may be a model in the Databricks Model Registry, a model in the Unity Catalog (UC), or a function of type FEATURE_SPEC in the UC. If it is a UC object, the full name of the object is given in the form of __catalog_name__.__schema_name__.__model_name__.

entity_version: str | None = None¶: The version of the served entity in Databricks Model Registry or empty if the entity is a FEATURE_SPEC.

external_model: ExternalModel | None = None¶: The external model that is served. NOTE: Only one of external_model, foundation_model, and (entity_name, entity_version) is returned based on the endpoint type.

foundation_model: FoundationModel | None = None¶: The foundation model that is served. NOTE: Only one of foundation_model, external_model, and (entity_name, entity_version) is returned based on the endpoint type.

name: str | None = None¶: The name of the served entity.

as_dict() → dict¶: Serializes the ServedEntitySpec into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) → ServedEntitySpec¶: Deserializes the ServedEntitySpec from a dictionary.

class databricks.sdk.service.serving.ServedModelInput¶

model_name: str¶: The name of the model in Databricks Model Registry to be served or if the model resides in Unity Catalog, the full name of model, in the form of __catalog_name__.__schema_name__.__model_name__.

model_version: str¶: The version of the model in Databricks Model Registry or Unity Catalog to be served.

workload_size: ServedModelInputWorkloadSize¶: The workload size of the served model. The workload size corresponds to a range of provisioned concurrency that the compute will autoscale between. A single unit of provisioned concurrency can process one request at a time. Valid workload sizes are “Small” (4 - 4 provisioned concurrency), “Medium” (8 - 16 provisioned concurrency), and “Large” (16 - 64 provisioned concurrency). If scale-to-zero is enabled, the lower bound of the provisioned concurrency for each workload size will be 0.

scale_to_zero_enabled: bool¶: Whether the compute resources for the served model should scale down to zero.

environment_vars: Dict[str, str] | None = None¶: An object containing a set of optional, user-specified environment variable key-value pairs used for serving this model. Note: this is an experimental feature and subject to change. Example model environment variables that refer to Databricks secrets: {“OPENAI_API_KEY”: “{{secrets/my_scope/my_key}}”, “DATABRICKS_TOKEN”: “{{secrets/my_scope2/my_key2}}”}

instance_profile_arn: str | None = None¶: ARN of the instance profile that the served model will use to access AWS resources.

name: str | None = None¶: The name of a served model. It must be unique across an endpoint. If not specified, this field will default to <model-name>-<model-version>. A served model name can consist of alphanumeric characters, dashes, and underscores.

workload_type: ServedModelInputWorkloadType | None = None¶

The workload type of the served model. The workload type selects which type of compute to use in the endpoint. The default value for this parameter is “CPU”. For deep learning workloads, GPU acceleration is available by selecting workload types like GPU_SMALL and others. See the available [GPU types].

[GPU types]: https://docs.databricks.com/machine-learning/model-serving/create-manage-serving-endpoints.html#gpu-workload-types

as_dict() → dict¶: Serializes the ServedModelInput into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) → ServedModelInput¶: Deserializes the ServedModelInput from a dictionary.

class databricks.sdk.service.serving.ServedModelInputWorkloadSize¶

The workload size of the served model. The workload size corresponds to a range of provisioned concurrency that the compute will autoscale between. A single unit of provisioned concurrency can process one request at a time. Valid workload sizes are “Small” (4 - 4 provisioned concurrency), “Medium” (8 - 16 provisioned concurrency), and “Large” (16 - 64 provisioned concurrency). If scale-to-zero is enabled, the lower bound of the provisioned concurrency for each workload size will be 0.

LARGE = "LARGE"¶

MEDIUM = "MEDIUM"¶

SMALL = "SMALL"¶

class databricks.sdk.service.serving.ServedModelInputWorkloadType¶

The workload type of the served model. The workload type selects which type of compute to use in the endpoint. The default value for this parameter is “CPU”. For deep learning workloads, GPU acceleration is available by selecting workload types like GPU_SMALL and others. See the available [GPU types]. [GPU types]: https://docs.databricks.com/machine-learning/model-serving/create-manage-serving-endpoints.html#gpu-workload-types

CPU = "CPU"¶

GPU_LARGE = "GPU_LARGE"¶

GPU_MEDIUM = "GPU_MEDIUM"¶

GPU_SMALL = "GPU_SMALL"¶

MULTIGPU_MEDIUM = "MULTIGPU_MEDIUM"¶

class databricks.sdk.service.serving.ServedModelOutput¶

creation_timestamp: int | None = None¶: The creation timestamp of the served model in Unix time.

creator: str | None = None¶: The email of the user who created the served model.

environment_vars: Dict[str, str] | None = None¶: An object containing a set of optional, user-specified environment variable key-value pairs used for serving this model. Note: this is an experimental feature and subject to change. Example model environment variables that refer to Databricks secrets: {“OPENAI_API_KEY”: “{{secrets/my_scope/my_key}}”, “DATABRICKS_TOKEN”: “{{secrets/my_scope2/my_key2}}”}

instance_profile_arn: str | None = None¶: ARN of the instance profile that the served model will use to access AWS resources.

model_name: str | None = None¶: The name of the model in Databricks Model Registry or the full name of the model in Unity Catalog.

model_version: str | None = None¶: The version of the model in Databricks Model Registry or Unity Catalog to be served.

name: str | None = None¶: The name of the served model.

scale_to_zero_enabled: bool | None = None¶: Whether the compute resources for the Served Model should scale down to zero.

state: ServedModelState | None = None¶: Information corresponding to the state of the Served Model.

workload_size: str | None = None¶: The workload size of the served model. The workload size corresponds to a range of provisioned concurrency that the compute will autoscale between. A single unit of provisioned concurrency can process one request at a time. Valid workload sizes are “Small” (4 - 4 provisioned concurrency), “Medium” (8 - 16 provisioned concurrency), and “Large” (16 - 64 provisioned concurrency). If scale-to-zero is enabled, the lower bound of the provisioned concurrency for each workload size will be 0.

workload_type: str | None = None¶

The workload type of the served model. The workload type selects which type of compute to use in the endpoint. The default value for this parameter is “CPU”. For deep learning workloads, GPU acceleration is available by selecting workload types like GPU_SMALL and others. See the available [GPU types].

[GPU types]: https://docs.databricks.com/machine-learning/model-serving/create-manage-serving-endpoints.html#gpu-workload-types

as_dict() → dict¶: Serializes the ServedModelOutput into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) → ServedModelOutput¶: Deserializes the ServedModelOutput from a dictionary.

class databricks.sdk.service.serving.ServedModelSpec¶

model_name: str | None = None¶: The name of the model in Databricks Model Registry or the full name of the model in Unity Catalog.

model_version: str | None = None¶: The version of the model in Databricks Model Registry or Unity Catalog to be served.

name: str | None = None¶: The name of the served model.

as_dict() → dict¶: Serializes the ServedModelSpec into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) → ServedModelSpec¶: Deserializes the ServedModelSpec from a dictionary.

class databricks.sdk.service.serving.ServedModelState¶

deployment: ServedModelStateDeployment | None = None¶: The state of the served entity deployment. DEPLOYMENT_CREATING indicates that the served entity is not ready yet because the deployment is still being created (i.e container image is building, model server is deploying for the first time, etc.). DEPLOYMENT_RECOVERING indicates that the served entity was previously in a ready state but no longer is and is attempting to recover. DEPLOYMENT_READY indicates that the served entity is ready to receive traffic. DEPLOYMENT_FAILED indicates that there was an error trying to bring up the served entity (e.g container image build failed, the model server failed to start due to a model loading error, etc.) DEPLOYMENT_ABORTED indicates that the deployment was terminated likely due to a failure in bringing up another served entity under the same endpoint and config version.

deployment_state_message: str | None = None¶: More information about the state of the served entity, if available.

as_dict() → dict¶: Serializes the ServedModelState into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) → ServedModelState¶: Deserializes the ServedModelState from a dictionary.

class databricks.sdk.service.serving.ServedModelStateDeployment¶

The state of the served entity deployment. DEPLOYMENT_CREATING indicates that the served entity is not ready yet because the deployment is still being created (i.e container image is building, model server is deploying for the first time, etc.). DEPLOYMENT_RECOVERING indicates that the served entity was previously in a ready state but no longer is and is attempting to recover. DEPLOYMENT_READY indicates that the served entity is ready to receive traffic. DEPLOYMENT_FAILED indicates that there was an error trying to bring up the served entity (e.g container image build failed, the model server failed to start due to a model loading error, etc.) DEPLOYMENT_ABORTED indicates that the deployment was terminated likely due to a failure in bringing up another served entity under the same endpoint and config version.

ABORTED = "ABORTED"¶

CREATING = "CREATING"¶

FAILED = "FAILED"¶

READY = "READY"¶

RECOVERING = "RECOVERING"¶

class databricks.sdk.service.serving.ServerLogsResponse¶

logs: str¶: The most recent log lines of the model server processing invocation requests.

as_dict() → dict¶: Serializes the ServerLogsResponse into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) → ServerLogsResponse¶: Deserializes the ServerLogsResponse from a dictionary.

class databricks.sdk.service.serving.ServingEndpoint¶

config: EndpointCoreConfigSummary | None = None¶: The config that is currently being served by the endpoint.

creation_timestamp: int | None = None¶: The timestamp when the endpoint was created in Unix time.

creator: str | None = None¶: The email of the user who created the serving endpoint.

id: str | None = None¶: System-generated ID of the endpoint. This is used to refer to the endpoint in the Permissions API

last_updated_timestamp: int | None = None¶: The timestamp when the endpoint was last updated by a user in Unix time.

name: str | None = None¶: The name of the serving endpoint.

state: EndpointState | None = None¶: Information corresponding to the state of the serving endpoint.

tags: List[EndpointTag] | None = None¶: Tags attached to the serving endpoint.

task: str | None = None¶: The task type of the serving endpoint.

as_dict() → dict¶: Serializes the ServingEndpoint into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) → ServingEndpoint¶: Deserializes the ServingEndpoint from a dictionary.

class databricks.sdk.service.serving.ServingEndpointAccessControlRequest¶

group_name: str | None = None¶: name of the group

permission_level: ServingEndpointPermissionLevel | None = None¶: Permission level

service_principal_name: str | None = None¶: application ID of a service principal

user_name: str | None = None¶: name of the user

as_dict() → dict¶: Serializes the ServingEndpointAccessControlRequest into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) → ServingEndpointAccessControlRequest¶: Deserializes the ServingEndpointAccessControlRequest from a dictionary.

class databricks.sdk.service.serving.ServingEndpointAccessControlResponse¶

all_permissions: List[ServingEndpointPermission] | None = None¶: All permissions.

display_name: str | None = None¶: Display name of the user or service principal.

group_name: str | None = None¶: name of the group

service_principal_name: str | None = None¶: Name of the service principal.

user_name: str | None = None¶: name of the user

as_dict() → dict¶: Serializes the ServingEndpointAccessControlResponse into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) → ServingEndpointAccessControlResponse¶: Deserializes the ServingEndpointAccessControlResponse from a dictionary.

class databricks.sdk.service.serving.ServingEndpointDetailed¶

config: EndpointCoreConfigOutput | None = None¶: The config that is currently being served by the endpoint.

creation_timestamp: int | None = None¶: The timestamp when the endpoint was created in Unix time.

creator: str | None = None¶: The email of the user who created the serving endpoint.

id: str | None = None¶: System-generated ID of the endpoint. This is used to refer to the endpoint in the Permissions API

last_updated_timestamp: int | None = None¶: The timestamp when the endpoint was last updated by a user in Unix time.

name: str | None = None¶: The name of the serving endpoint.

pending_config: EndpointPendingConfig | None = None¶: The config that the endpoint is attempting to update to.

permission_level: ServingEndpointDetailedPermissionLevel | None = None¶: The permission level of the principal making the request.

state: EndpointState | None = None¶: Information corresponding to the state of the serving endpoint.

tags: List[EndpointTag] | None = None¶: Tags attached to the serving endpoint.

task: str | None = None¶: The task type of the serving endpoint.

as_dict() → dict¶: Serializes the ServingEndpointDetailed into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) → ServingEndpointDetailed¶: Deserializes the ServingEndpointDetailed from a dictionary.

class databricks.sdk.service.serving.ServingEndpointDetailedPermissionLevel¶

The permission level of the principal making the request.

CAN_MANAGE = "CAN_MANAGE"¶

CAN_QUERY = "CAN_QUERY"¶

CAN_VIEW = "CAN_VIEW"¶

class databricks.sdk.service.serving.ServingEndpointPermission¶

inherited: bool | None = None¶

inherited_from_object: List[str] | None = None¶

permission_level: ServingEndpointPermissionLevel | None = None¶: Permission level

as_dict() → dict¶: Serializes the ServingEndpointPermission into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) → ServingEndpointPermission¶: Deserializes the ServingEndpointPermission from a dictionary.

class databricks.sdk.service.serving.ServingEndpointPermissionLevel¶

Permission level

CAN_MANAGE = "CAN_MANAGE"¶

CAN_QUERY = "CAN_QUERY"¶

CAN_VIEW = "CAN_VIEW"¶

class databricks.sdk.service.serving.ServingEndpointPermissions¶

access_control_list: List[ServingEndpointAccessControlResponse] | None = None¶

object_id: str | None = None¶

object_type: str | None = None¶

as_dict() → dict¶: Serializes the ServingEndpointPermissions into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) → ServingEndpointPermissions¶: Deserializes the ServingEndpointPermissions from a dictionary.

class databricks.sdk.service.serving.ServingEndpointPermissionsDescription¶

description: str | None = None¶

permission_level: ServingEndpointPermissionLevel | None = None¶: Permission level

as_dict() → dict¶: Serializes the ServingEndpointPermissionsDescription into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) → ServingEndpointPermissionsDescription¶: Deserializes the ServingEndpointPermissionsDescription from a dictionary.

class databricks.sdk.service.serving.ServingEndpointPermissionsRequest¶

access_control_list: List[ServingEndpointAccessControlRequest] | None = None¶

serving_endpoint_id: str | None = None¶: The serving endpoint for which to get or manage permissions.

as_dict() → dict¶: Serializes the ServingEndpointPermissionsRequest into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) → ServingEndpointPermissionsRequest¶: Deserializes the ServingEndpointPermissionsRequest from a dictionary.

class databricks.sdk.service.serving.StopAppRequest¶

name: str | None = None¶: The name of the app.

class databricks.sdk.service.serving.StopAppResponse¶

as_dict() → dict¶: Serializes the StopAppResponse into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) → StopAppResponse¶: Deserializes the StopAppResponse from a dictionary.

class databricks.sdk.service.serving.TrafficConfig¶

routes: List[Route] | None = None¶: The list of routes that define traffic to each served entity.

as_dict() → dict¶: Serializes the TrafficConfig into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) → TrafficConfig¶: Deserializes the TrafficConfig from a dictionary.

class databricks.sdk.service.serving.UpdateAppRequest¶

name: str¶: The name of the app. The name must contain only lowercase alphanumeric characters and hyphens and be between 2 and 30 characters long. It must be unique within the workspace.

description: str | None = None¶: The description of the app.

as_dict() → dict¶: Serializes the UpdateAppRequest into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) → UpdateAppRequest¶: Deserializes the UpdateAppRequest from a dictionary.

class databricks.sdk.service.serving.V1ResponseChoiceElement¶

finish_reason: str | None = None¶: The finish reason returned by the endpoint.

index: int | None = None¶: The index of the choice in the __chat or completions__ response.

logprobs: int | None = None¶: The logprobs returned only by the __completions__ endpoint.

message: ChatMessage | None = None¶: The message response from the __chat__ endpoint.

text: str | None = None¶: The text response from the __completions__ endpoint.

as_dict() → dict¶: Serializes the V1ResponseChoiceElement into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) → V1ResponseChoiceElement¶: Deserializes the V1ResponseChoiceElement from a dictionary.

Navigation

Related Topics

Real-time Serving¶