Real-time Serving

These dataclasses are used in the SDK to represent API requests and responses for services in the databricks.sdk.service.serving module.

class databricks.sdk.service.serving.Ai21LabsConfig
ai21labs_api_key: str

The Databricks secret key reference for an AI21Labs API key.

as_dict() dict

Serializes the Ai21LabsConfig into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) Ai21LabsConfig

Deserializes the Ai21LabsConfig from a dictionary.

class databricks.sdk.service.serving.AmazonBedrockConfig
aws_region: str

The AWS region to use. Bedrock has to be enabled there.

aws_access_key_id: str

The Databricks secret key reference for an AWS Access Key ID with permissions to interact with Bedrock services.

aws_secret_access_key: str

The Databricks secret key reference for an AWS Secret Access Key paired with the access key ID, with permissions to interact with Bedrock services.

bedrock_provider: AmazonBedrockConfigBedrockProvider

The underlying provider in Amazon Bedrock. Supported values (case insensitive) include: Anthropic, Cohere, AI21Labs, Amazon.

as_dict() dict

Serializes the AmazonBedrockConfig into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) AmazonBedrockConfig

Deserializes the AmazonBedrockConfig from a dictionary.
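
As a sketch, the dictionary produced by as_dict() for an AmazonBedrockConfig would roughly match the fields above. The region, secret scope, and key names here are hypothetical placeholders, and {{secrets/...}} is the Databricks secret reference syntax:

```python
# Hypothetical JSON request body matching the AmazonBedrockConfig fields above.
# The secret scope ("my_scope") and secret key names are placeholders.
bedrock_config = {
    "aws_region": "us-east-1",
    "aws_access_key_id": "{{secrets/my_scope/bedrock_access_key_id}}",
    "aws_secret_access_key": "{{secrets/my_scope/bedrock_secret_access_key}}",
    # One of: Anthropic, Cohere, AI21Labs, Amazon (case insensitive).
    "bedrock_provider": "anthropic",
}
```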

class databricks.sdk.service.serving.AmazonBedrockConfigBedrockProvider

The underlying provider in Amazon Bedrock. Supported values (case insensitive) include: Anthropic, Cohere, AI21Labs, Amazon.

AI21LABS = "AI21LABS"
AMAZON = "AMAZON"
ANTHROPIC = "ANTHROPIC"
COHERE = "COHERE"
class databricks.sdk.service.serving.AnthropicConfig
anthropic_api_key: str

The Databricks secret key reference for an Anthropic API key.

as_dict() dict

Serializes the AnthropicConfig into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) AnthropicConfig

Deserializes the AnthropicConfig from a dictionary.

class databricks.sdk.service.serving.App
name: str

The name of the app. The name must contain only lowercase alphanumeric characters and hyphens and be between 2 and 30 characters long. It must be unique within the workspace.

active_deployment: AppDeployment | None = None

The active deployment of the app.

create_time: str | None = None

The creation time of the app. Formatted timestamp in ISO 8601.

creator: str | None = None

The email of the user that created the app.

description: str | None = None

The description of the app.

pending_deployment: AppDeployment | None = None

The pending deployment of the app.

status: AppStatus | None = None
update_time: str | None = None

The update time of the app. Formatted timestamp in ISO 8601.

updater: str | None = None

The email of the user that last updated the app.

url: str | None = None

The URL of the app once it is deployed.

as_dict() dict

Serializes the App into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) App

Deserializes the App from a dictionary.

class databricks.sdk.service.serving.AppDeployment
source_code_path: str

The source code path of the deployment.

create_time: str | None = None

The creation time of the deployment. Formatted timestamp in ISO 8601.

creator: str | None = None

The email of the user who created the deployment.

deployment_id: str | None = None

The unique id of the deployment.

status: AppDeploymentStatus | None = None

Status and status message of the deployment.

update_time: str | None = None

The update time of the deployment. Formatted timestamp in ISO 8601.

as_dict() dict

Serializes the AppDeployment into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) AppDeployment

Deserializes the AppDeployment from a dictionary.

class databricks.sdk.service.serving.AppDeploymentState
CANCELLED = "CANCELLED"
FAILED = "FAILED"
IN_PROGRESS = "IN_PROGRESS"
STATE_UNSPECIFIED = "STATE_UNSPECIFIED"
SUCCEEDED = "SUCCEEDED"
class databricks.sdk.service.serving.AppDeploymentStatus
message: str | None = None

Message corresponding with the deployment state.

state: AppDeploymentState | None = None

State of the deployment.

as_dict() dict

Serializes the AppDeploymentStatus into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) AppDeploymentStatus

Deserializes the AppDeploymentStatus from a dictionary.

class databricks.sdk.service.serving.AppEnvironment
env: List[EnvVariable] | None = None
as_dict() dict

Serializes the AppEnvironment into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) AppEnvironment

Deserializes the AppEnvironment from a dictionary.

class databricks.sdk.service.serving.AppState
CREATING = "CREATING"
DELETED = "DELETED"
DELETING = "DELETING"
DEPLOYED = "DEPLOYED"
DEPLOYING = "DEPLOYING"
ERROR = "ERROR"
IDLE = "IDLE"
READY = "READY"
RUNNING = "RUNNING"
STARTING = "STARTING"
STATE_UNSPECIFIED = "STATE_UNSPECIFIED"
UPDATING = "UPDATING"
class databricks.sdk.service.serving.AppStatus
message: str | None = None

Message corresponding with the app state.

state: AppState | None = None

State of the app.

as_dict() dict

Serializes the AppStatus into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) AppStatus

Deserializes the AppStatus from a dictionary.

class databricks.sdk.service.serving.AutoCaptureConfigInput
catalog_name: str | None = None

The name of the catalog in Unity Catalog. NOTE: On update, you cannot change the catalog name if it was already set.

enabled: bool | None = None

Whether inference tables are enabled. NOTE: once payload logging has been disabled, it cannot be enabled again.

schema_name: str | None = None

The name of the schema in Unity Catalog. NOTE: On update, you cannot change the schema name if it was already set.

table_name_prefix: str | None = None

The prefix of the table in Unity Catalog. NOTE: On update, you cannot change the prefix name if it was already set.

as_dict() dict

Serializes the AutoCaptureConfigInput into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) AutoCaptureConfigInput

Deserializes the AutoCaptureConfigInput from a dictionary.
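
A minimal sketch of an auto-capture (inference tables) config, using hypothetical catalog, schema, and prefix names. Per the notes above, the catalog and schema names cannot be changed once set, and payload logging cannot be re-enabled after being disabled:

```python
# Hypothetical AutoCaptureConfigInput body; "ml", "serving_logs", and
# "chat_endpoint" are placeholder names.
auto_capture_config = {
    "catalog_name": "ml",
    "schema_name": "serving_logs",
    "table_name_prefix": "chat_endpoint",
    "enabled": True,
}
```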

class databricks.sdk.service.serving.AutoCaptureConfigOutput
catalog_name: str | None = None

The name of the catalog in Unity Catalog.

enabled: bool | None = None

Whether inference tables are enabled.

schema_name: str | None = None

The name of the schema in Unity Catalog.

state: AutoCaptureState | None = None
table_name_prefix: str | None = None

The prefix of the table in Unity Catalog.

as_dict() dict

Serializes the AutoCaptureConfigOutput into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) AutoCaptureConfigOutput

Deserializes the AutoCaptureConfigOutput from a dictionary.

class databricks.sdk.service.serving.AutoCaptureState
payload_table: PayloadTable | None = None
as_dict() dict

Serializes the AutoCaptureState into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) AutoCaptureState

Deserializes the AutoCaptureState from a dictionary.

class databricks.sdk.service.serving.BuildLogsResponse
logs: str

The logs associated with building the served entity’s environment.

as_dict() dict

Serializes the BuildLogsResponse into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) BuildLogsResponse

Deserializes the BuildLogsResponse from a dictionary.

class databricks.sdk.service.serving.ChatMessage
content: str | None = None

The content of the message.

role: ChatMessageRole | None = None

The role of the message. One of [system, user, assistant].

as_dict() dict

Serializes the ChatMessage into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) ChatMessage

Deserializes the ChatMessage from a dictionary.

class databricks.sdk.service.serving.ChatMessageRole

The role of the message. One of [system, user, assistant].

ASSISTANT = "ASSISTANT"
SYSTEM = "SYSTEM"
USER = "USER"
class databricks.sdk.service.serving.CohereConfig
cohere_api_key: str

The Databricks secret key reference for a Cohere API key.

as_dict() dict

Serializes the CohereConfig into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) CohereConfig

Deserializes the CohereConfig from a dictionary.

class databricks.sdk.service.serving.CreateAppDeploymentRequest
source_code_path: str

The source code path of the deployment.

app_name: str | None = None

The name of the app.

as_dict() dict

Serializes the CreateAppDeploymentRequest into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) CreateAppDeploymentRequest

Deserializes the CreateAppDeploymentRequest from a dictionary.

class databricks.sdk.service.serving.CreateAppRequest
name: str

The name of the app. The name must contain only lowercase alphanumeric characters and hyphens and be between 2 and 30 characters long. It must be unique within the workspace.

description: str | None = None

The description of the app.

as_dict() dict

Serializes the CreateAppRequest into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) CreateAppRequest

Deserializes the CreateAppRequest from a dictionary.

class databricks.sdk.service.serving.CreateServingEndpoint
name: str

The name of the serving endpoint. This field is required and must be unique across a Databricks workspace. An endpoint name can consist of alphanumeric characters, dashes, and underscores.

config: EndpointCoreConfigInput

The core config of the serving endpoint.

rate_limits: List[RateLimit] | None = None

Rate limits to be applied to the serving endpoint. NOTE: only external and foundation model endpoints are supported as of now.

tags: List[EndpointTag] | None = None

Tags to be attached to the serving endpoint and automatically propagated to billing logs.

as_dict() dict

Serializes the CreateServingEndpoint into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) CreateServingEndpoint

Deserializes the CreateServingEndpoint from a dictionary.
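
A sketch of a CreateServingEndpoint request body. The endpoint name, entity name, and tag values are placeholders, and the served-entity fields beyond entity_name (entity_version, workload_size, scale_to_zero_enabled) are assumptions about the ServedEntityInput shape:

```python
# Hypothetical request body for creating a serving endpoint.
create_request = {
    "name": "my-endpoint",
    "config": {
        "served_entities": [
            {
                # Full UC name: catalog.schema.model (placeholder values).
                "entity_name": "main.default.my_model",
                "entity_version": "1",
                "workload_size": "Small",
                "scale_to_zero_enabled": True,
            }
        ],
    },
    # Tags are propagated to billing logs.
    "tags": [{"key": "team", "value": "ml-platform"}],
}
```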

class databricks.sdk.service.serving.DatabricksModelServingConfig
databricks_api_token: str

The Databricks secret key reference for a Databricks API token that corresponds to a user or service principal with Can Query access to the model serving endpoint pointed to by this external model.

databricks_workspace_url: str

The URL of the Databricks workspace containing the model serving endpoint pointed to by this external model.

as_dict() dict

Serializes the DatabricksModelServingConfig into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) DatabricksModelServingConfig

Deserializes the DatabricksModelServingConfig from a dictionary.

class databricks.sdk.service.serving.DataframeSplitInput
columns: List[Any] | None = None
data: List[Any] | None = None
index: List[int] | None = None
as_dict() dict

Serializes the DataframeSplitInput into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) DataframeSplitInput

Deserializes the DataframeSplitInput from a dictionary.

class databricks.sdk.service.serving.DeleteResponse
as_dict() dict

Serializes the DeleteResponse into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) DeleteResponse

Deserializes the DeleteResponse from a dictionary.

class databricks.sdk.service.serving.EmbeddingsV1ResponseEmbeddingElement
embedding: List[float] | None = None
index: int | None = None

The index of the embedding in the response.

object: EmbeddingsV1ResponseEmbeddingElementObject | None = None

This will always be ‘embedding’.

as_dict() dict

Serializes the EmbeddingsV1ResponseEmbeddingElement into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) EmbeddingsV1ResponseEmbeddingElement

Deserializes the EmbeddingsV1ResponseEmbeddingElement from a dictionary.

class databricks.sdk.service.serving.EmbeddingsV1ResponseEmbeddingElementObject

This will always be ‘embedding’.

EMBEDDING = "EMBEDDING"
class databricks.sdk.service.serving.EndpointCoreConfigInput
auto_capture_config: AutoCaptureConfigInput | None = None

Configuration for Inference Tables which automatically logs requests and responses to Unity Catalog.

name: str | None = None

The name of the serving endpoint to update. This field is required.

served_entities: List[ServedEntityInput] | None = None

A list of served entities for the endpoint to serve. A serving endpoint can have up to 15 served entities.

served_models: List[ServedModelInput] | None = None

(Deprecated, use served_entities instead) A list of served models for the endpoint to serve. A serving endpoint can have up to 15 served models.

traffic_config: TrafficConfig | None = None

The traffic config defining how invocations to the serving endpoint should be routed.

as_dict() dict

Serializes the EndpointCoreConfigInput into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) EndpointCoreConfigInput

Deserializes the EndpointCoreConfigInput from a dictionary.

class databricks.sdk.service.serving.EndpointCoreConfigOutput
auto_capture_config: AutoCaptureConfigOutput | None = None

Configuration for Inference Tables which automatically logs requests and responses to Unity Catalog.

config_version: int | None = None

The config version that the serving endpoint is currently serving.

served_entities: List[ServedEntityOutput] | None = None

The list of served entities under the serving endpoint config.

served_models: List[ServedModelOutput] | None = None

(Deprecated, use served_entities instead) The list of served models under the serving endpoint config.

traffic_config: TrafficConfig | None = None

The traffic configuration associated with the serving endpoint config.

as_dict() dict

Serializes the EndpointCoreConfigOutput into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) EndpointCoreConfigOutput

Deserializes the EndpointCoreConfigOutput from a dictionary.

class databricks.sdk.service.serving.EndpointCoreConfigSummary
served_entities: List[ServedEntitySpec] | None = None

The list of served entities under the serving endpoint config.

served_models: List[ServedModelSpec] | None = None

(Deprecated, use served_entities instead) The list of served models under the serving endpoint config.

as_dict() dict

Serializes the EndpointCoreConfigSummary into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) EndpointCoreConfigSummary

Deserializes the EndpointCoreConfigSummary from a dictionary.

class databricks.sdk.service.serving.EndpointPendingConfig
auto_capture_config: AutoCaptureConfigOutput | None = None

Configuration for Inference Tables which automatically logs requests and responses to Unity Catalog.

config_version: int | None = None

The config version that the serving endpoint is currently serving.

served_entities: List[ServedEntityOutput] | None = None

The list of served entities belonging to the last issued update to the serving endpoint.

served_models: List[ServedModelOutput] | None = None

(Deprecated, use served_entities instead) The list of served models belonging to the last issued update to the serving endpoint.

start_time: int | None = None

The timestamp when the update to the pending config started.

traffic_config: TrafficConfig | None = None

The traffic config defining how invocations to the serving endpoint should be routed.

as_dict() dict

Serializes the EndpointPendingConfig into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) EndpointPendingConfig

Deserializes the EndpointPendingConfig from a dictionary.

class databricks.sdk.service.serving.EndpointState
config_update: EndpointStateConfigUpdate | None = None

The state of an endpoint’s config update. This informs the user if the pending_config is in progress, if the update failed, or if there is no update in progress. Note that if the endpoint’s config_update state value is IN_PROGRESS, another update cannot be made until the update completes or fails.

ready: EndpointStateReady | None = None

The state of an endpoint, indicating whether or not the endpoint is queryable. An endpoint is READY if all of the served entities in its active configuration are ready. If any of the actively served entities are in a non-ready state, the endpoint state will be NOT_READY.

as_dict() dict

Serializes the EndpointState into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) EndpointState

Deserializes the EndpointState from a dictionary.

class databricks.sdk.service.serving.EndpointStateConfigUpdate

The state of an endpoint’s config update. This informs the user if the pending_config is in progress, if the update failed, or if there is no update in progress. Note that if the endpoint’s config_update state value is IN_PROGRESS, another update cannot be made until the update completes or fails.

IN_PROGRESS = "IN_PROGRESS"
NOT_UPDATING = "NOT_UPDATING"
UPDATE_FAILED = "UPDATE_FAILED"
class databricks.sdk.service.serving.EndpointStateReady

The state of an endpoint, indicating whether or not the endpoint is queryable. An endpoint is READY if all of the served entities in its active configuration are ready. If any of the actively served entities are in a non-ready state, the endpoint state will be NOT_READY.

NOT_READY = "NOT_READY"
READY = "READY"
class databricks.sdk.service.serving.EndpointTag
key: str

Key field for a serving endpoint tag.

value: str | None = None

Optional value field for a serving endpoint tag.

as_dict() dict

Serializes the EndpointTag into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) EndpointTag

Deserializes the EndpointTag from a dictionary.

class databricks.sdk.service.serving.EnvVariable
name: str | None = None
value: str | None = None
value_from: str | None = None
as_dict() dict

Serializes the EnvVariable into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) EnvVariable

Deserializes the EnvVariable from a dictionary.

class databricks.sdk.service.serving.ExportMetricsResponse
as_dict() dict

Serializes the ExportMetricsResponse into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) ExportMetricsResponse

Deserializes the ExportMetricsResponse from a dictionary.

class databricks.sdk.service.serving.ExternalModel
provider: ExternalModelProvider

The name of the provider for the external model. Currently, the supported providers are ‘ai21labs’, ‘anthropic’, ‘amazon-bedrock’, ‘cohere’, ‘databricks-model-serving’, ‘openai’, and ‘palm’.

name: str

The name of the external model.

task: str

The task type of the external model.

ai21labs_config: Ai21LabsConfig | None = None

AI21Labs Config. Only required if the provider is ‘ai21labs’.

amazon_bedrock_config: AmazonBedrockConfig | None = None

Amazon Bedrock Config. Only required if the provider is ‘amazon-bedrock’.

anthropic_config: AnthropicConfig | None = None

Anthropic Config. Only required if the provider is ‘anthropic’.

cohere_config: CohereConfig | None = None

Cohere Config. Only required if the provider is ‘cohere’.

databricks_model_serving_config: DatabricksModelServingConfig | None = None

Databricks Model Serving Config. Only required if the provider is ‘databricks-model-serving’.

openai_config: OpenAiConfig | None = None

OpenAI Config. Only required if the provider is ‘openai’.

palm_config: PaLmConfig | None = None

PaLM Config. Only required if the provider is ‘palm’.

as_dict() dict

Serializes the ExternalModel into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) ExternalModel

Deserializes the ExternalModel from a dictionary.
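
A sketch of an ExternalModel definition backed by Anthropic, nesting the AnthropicConfig documented above. The model name, task type, and secret reference are placeholder assumptions:

```python
# Hypothetical external model definition; only the config matching the chosen
# provider is supplied (here, anthropic_config for provider "anthropic").
external_model = {
    "name": "claude-3-sonnet",
    "provider": "anthropic",
    "task": "llm/v1/chat",
    "anthropic_config": {
        "anthropic_api_key": "{{secrets/my_scope/anthropic_api_key}}",
    },
}
```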

class databricks.sdk.service.serving.ExternalModelProvider

The name of the provider for the external model. Currently, the supported providers are ‘ai21labs’, ‘anthropic’, ‘amazon-bedrock’, ‘cohere’, ‘databricks-model-serving’, ‘openai’, and ‘palm’.

AI21LABS = "AI21LABS"
AMAZON_BEDROCK = "AMAZON_BEDROCK"
ANTHROPIC = "ANTHROPIC"
COHERE = "COHERE"
DATABRICKS_MODEL_SERVING = "DATABRICKS_MODEL_SERVING"
OPENAI = "OPENAI"
PALM = "PALM"
class databricks.sdk.service.serving.ExternalModelUsageElement
completion_tokens: int | None = None

The number of tokens in the chat/completions response.

prompt_tokens: int | None = None

The number of tokens in the prompt.

total_tokens: int | None = None

The total number of tokens in the prompt and response.

as_dict() dict

Serializes the ExternalModelUsageElement into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) ExternalModelUsageElement

Deserializes the ExternalModelUsageElement from a dictionary.

class databricks.sdk.service.serving.FoundationModel
description: str | None = None

The description of the foundation model.

display_name: str | None = None

The display name of the foundation model.

docs: str | None = None

The URL to the documentation of the foundation model.

name: str | None = None

The name of the foundation model.

as_dict() dict

Serializes the FoundationModel into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) FoundationModel

Deserializes the FoundationModel from a dictionary.

class databricks.sdk.service.serving.GetOpenApiResponse

The response is an OpenAPI spec in JSON format that typically includes fields like openapi, info, servers and paths, etc.

as_dict() dict

Serializes the GetOpenApiResponse into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) GetOpenApiResponse

Deserializes the GetOpenApiResponse from a dictionary.

class databricks.sdk.service.serving.GetServingEndpointPermissionLevelsResponse
permission_levels: List[ServingEndpointPermissionsDescription] | None = None

Specific permission levels

as_dict() dict

Serializes the GetServingEndpointPermissionLevelsResponse into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) GetServingEndpointPermissionLevelsResponse

Deserializes the GetServingEndpointPermissionLevelsResponse from a dictionary.

class databricks.sdk.service.serving.ListAppDeploymentsResponse
app_deployments: List[AppDeployment] | None = None

Deployment history of the app.

next_page_token: str | None = None

Pagination token to request the next page of app deployments.

as_dict() dict

Serializes the ListAppDeploymentsResponse into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) ListAppDeploymentsResponse

Deserializes the ListAppDeploymentsResponse from a dictionary.

class databricks.sdk.service.serving.ListAppsResponse
apps: List[App] | None = None
next_page_token: str | None = None

Pagination token to request the next page of apps.

as_dict() dict

Serializes the ListAppsResponse into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) ListAppsResponse

Deserializes the ListAppsResponse from a dictionary.

class databricks.sdk.service.serving.ListEndpointsResponse
endpoints: List[ServingEndpoint] | None = None

The list of endpoints.

as_dict() dict

Serializes the ListEndpointsResponse into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) ListEndpointsResponse

Deserializes the ListEndpointsResponse from a dictionary.

class databricks.sdk.service.serving.OpenAiConfig
openai_api_key: str

The Databricks secret key reference for an OpenAI or Azure OpenAI API key.

openai_api_base: str | None = None

This is the base URL for the OpenAI API (default: “https://api.openai.com/v1”). For Azure OpenAI, this field is required, and is the base URL for the Azure OpenAI API service provided by Azure.

openai_api_type: str | None = None

This is an optional field to specify the type of OpenAI API to use. For Azure OpenAI, this field is required and selects the security access validation protocol: use azure for access token validation, or azuread for authentication using Azure Active Directory (Azure AD).

openai_api_version: str | None = None

This is an optional field to specify the OpenAI API version. For Azure OpenAI, this field is required, and is the version of the Azure OpenAI service to utilize, specified by a date.

openai_deployment_name: str | None = None

This field is only required for Azure OpenAI and is the name of the deployment resource for the Azure OpenAI service.

openai_organization: str | None = None

This is an optional field to specify the organization in OpenAI or Azure OpenAI.

as_dict() dict

Serializes the OpenAiConfig into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) OpenAiConfig

Deserializes the OpenAiConfig from a dictionary.
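
A sketch of an OpenAiConfig for Azure OpenAI, where openai_api_base, openai_api_type, openai_api_version, and openai_deployment_name are all required per the field docs above. The resource URL, deployment name, API version, and secret reference are placeholders:

```python
# Hypothetical Azure OpenAI config; "my-resource", "my-gpt-deployment", and
# the secret reference are placeholders.
azure_openai_config = {
    "openai_api_key": "{{secrets/my_scope/azure_openai_api_key}}",
    "openai_api_base": "https://my-resource.openai.azure.com",
    "openai_api_type": "azure",        # or "azuread" for Azure AD auth
    "openai_api_version": "2023-05-15",
    "openai_deployment_name": "my-gpt-deployment",
}
```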

class databricks.sdk.service.serving.PaLmConfig
palm_api_key: str

The Databricks secret key reference for a PaLM API key.

as_dict() dict

Serializes the PaLmConfig into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) PaLmConfig

Deserializes the PaLmConfig from a dictionary.

class databricks.sdk.service.serving.PatchServingEndpointTags
add_tags: List[EndpointTag] | None = None

List of endpoint tags to add

delete_tags: List[str] | None = None

List of tag keys to delete

name: str | None = None

The name of the serving endpoint whose tags to patch. This field is required.

as_dict() dict

Serializes the PatchServingEndpointTags into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) PatchServingEndpointTags

Deserializes the PatchServingEndpointTags from a dictionary.

class databricks.sdk.service.serving.PayloadTable
name: str | None = None

The name of the payload table.

status: str | None = None

The status of the payload table.

status_message: str | None = None

The status message of the payload table.

as_dict() dict

Serializes the PayloadTable into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) PayloadTable

Deserializes the PayloadTable from a dictionary.

class databricks.sdk.service.serving.PutResponse
rate_limits: List[RateLimit] | None = None

The list of endpoint rate limits.

as_dict() dict

Serializes the PutResponse into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) PutResponse

Deserializes the PutResponse from a dictionary.

class databricks.sdk.service.serving.QueryEndpointInput
dataframe_records: List[Any] | None = None

Pandas Dataframe input in the records orientation.

dataframe_split: DataframeSplitInput | None = None

Pandas Dataframe input in the split orientation.

extra_params: Dict[str, str] | None = None

The extra parameters field used ONLY for __completions, chat,__ and __embeddings external & foundation model__ serving endpoints. This is a map of strings and should only be used with other external/foundation model query fields.

input: Any | None = None

The input string (or array of strings) field used ONLY for __embeddings external & foundation model__ serving endpoints and is the only field (along with extra_params if needed) used by embeddings queries.

inputs: Any | None = None

Tensor-based input in columnar format.

instances: List[Any] | None = None

Tensor-based input in row format.

max_tokens: int | None = None

The max tokens field used ONLY for __completions__ and __chat external & foundation model__ serving endpoints. This is an integer and should only be used with other chat/completions query fields.

messages: List[ChatMessage] | None = None

The messages field used ONLY for __chat external & foundation model__ serving endpoints. This is a list of chat messages and should only be used with other chat query fields.

n: int | None = None

The n (number of candidates) field used ONLY for __completions__ and __chat external & foundation model__ serving endpoints. This is an integer between 1 and 5 with a default of 1 and should only be used with other chat/completions query fields.

name: str | None = None

The name of the serving endpoint. This field is required.

prompt: Any | None = None

The prompt string (or array of strings) field used ONLY for __completions external & foundation model__ serving endpoints and should only be used with other completions query fields.

stop: List[str] | None = None

The stop sequences field used ONLY for __completions__ and __chat external & foundation model__ serving endpoints. This is a list of strings and should only be used with other chat/completions query fields.

stream: bool | None = None

The stream field used ONLY for __completions__ and __chat external & foundation model__ serving endpoints. This is a boolean defaulting to false and should only be used with other chat/completions query fields.

temperature: float | None = None

The temperature field used ONLY for __completions__ and __chat external & foundation model__ serving endpoints. This is a float between 0.0 and 2.0 with a default of 1.0 and should only be used with other chat/completions query fields.

as_dict() dict

Serializes the QueryEndpointInput into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) QueryEndpointInput

Deserializes the QueryEndpointInput from a dictionary.
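
A sketch of a chat-style query body for an external/foundation model serving endpoint, combining only chat fields as the field docs above require. The message contents are placeholders:

```python
# Hypothetical chat query body; mixes only chat/completions-compatible fields.
chat_query = {
    "messages": [
        {"role": "system", "content": "You are a concise assistant."},
        {"role": "user", "content": "Summarize what a serving endpoint is."},
    ],
    "max_tokens": 256,
    "temperature": 0.2,  # float in [0.0, 2.0], default 1.0
    "n": 1,              # integer in [1, 5], default 1
    "stream": False,     # defaults to false
}
```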

class databricks.sdk.service.serving.QueryEndpointResponse
choices: List[V1ResponseChoiceElement] | None = None

The list of choices returned by the __chat or completions external/foundation model__ serving endpoint.

created: int | None = None

The timestamp in seconds when the query was created in Unix time returned by a __completions or chat external/foundation model__ serving endpoint.

data: List[EmbeddingsV1ResponseEmbeddingElement] | None = None

The list of the embeddings returned by the __embeddings external/foundation model__ serving endpoint.

id: str | None = None

The ID of the query that may be returned by a __completions or chat external/foundation model__ serving endpoint.

model: str | None = None

The name of the __external/foundation model__ used for querying. This is the name of the model that was specified in the endpoint config.

object: QueryEndpointResponseObject | None = None

The type of object returned by the __external/foundation model__ serving endpoint, one of [text_completion, chat.completion, list (of embeddings)].

predictions: List[Any] | None = None

The predictions returned by the serving endpoint.

served_model_name: str | None = None

The name of the served model that served the request. This is useful when there are multiple models behind the same endpoint with traffic split.

usage: ExternalModelUsageElement | None = None

The usage object that may be returned by the __external/foundation model__ serving endpoint. This contains information about the number of tokens used in the prompt and response.

as_dict() dict

Serializes the QueryEndpointResponse into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) QueryEndpointResponse

Deserializes the QueryEndpointResponse from a dictionary.

class databricks.sdk.service.serving.QueryEndpointResponseObject

The type of object returned by the __external/foundation model__ serving endpoint, one of [text_completion, chat.completion, list (of embeddings)].

CHAT_COMPLETION = "CHAT_COMPLETION"
LIST = "LIST"
TEXT_COMPLETION = "TEXT_COMPLETION"
class databricks.sdk.service.serving.RateLimit
calls: int

Used to specify how many calls are allowed for a key within the renewal_period.

renewal_period: RateLimitRenewalPeriod

Renewal period field for a serving endpoint rate limit. Currently, only ‘minute’ is supported.

key: RateLimitKey | None = None

Key field for a serving endpoint rate limit. Currently, only ‘user’ and ‘endpoint’ are supported, with ‘endpoint’ being the default if not specified.

as_dict() dict

Serializes the RateLimit into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) RateLimit

Deserializes the RateLimit from a dictionary.
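The semantics are a fixed window: at most calls requests per key within each renewal period. A toy counter illustrating that behavior (this helper is not part of the SDK; the server enforces the real limit):

```python
from collections import defaultdict

# Toy fixed-window counter mirroring RateLimit semantics: at most
# `calls` requests per key within one renewal period (here, one minute).
class WindowCounter:
    def __init__(self, calls: int):
        self.calls = calls
        self.seen = defaultdict(int)

    def allow(self, key: str, minute: int) -> bool:
        self.seen[(key, minute)] += 1  # count this request in its window
        return self.seen[(key, minute)] <= self.calls
```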

class databricks.sdk.service.serving.RateLimitKey

Key field for a serving endpoint rate limit. Currently, only ‘user’ and ‘endpoint’ are supported, with ‘endpoint’ being the default if not specified.

ENDPOINT = "ENDPOINT"
USER = "USER"
class databricks.sdk.service.serving.RateLimitRenewalPeriod

Renewal period field for a serving endpoint rate limit. Currently, only ‘minute’ is supported.

MINUTE = "MINUTE"
class databricks.sdk.service.serving.Route
served_model_name: str

The name of the served model this route configures traffic for.

traffic_percentage: int

The percentage of endpoint traffic to send to this route. It must be an integer between 0 and 100 inclusive.

as_dict() dict

Serializes the Route into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) Route

Deserializes the Route from a dictionary.
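Each route's percentage must be an integer between 0 and 100, and the percentages across an endpoint's routes are expected to total 100. A quick client-side validation sketch over plain dicts with these field names (not an SDK function):

```python
def validate_routes(routes: list) -> None:
    """Check route percentages: each an int in [0, 100], totalling 100."""
    total = 0
    for r in routes:
        pct = r["traffic_percentage"]
        if not isinstance(pct, int) or not 0 <= pct <= 100:
            raise ValueError(f"bad percentage for {r['served_model_name']}: {pct}")
        total += pct
    if total != 100:
        raise ValueError(f"route percentages total {total}, expected 100")

validate_routes([
    {"served_model_name": "model-a-1", "traffic_percentage": 80},
    {"served_model_name": "model-b-2", "traffic_percentage": 20},
])
```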

class databricks.sdk.service.serving.ServedEntityInput
entity_name: str | None = None

The name of the entity to be served. The entity may be a model in the Databricks Model Registry, a model in the Unity Catalog (UC), or a function of type FEATURE_SPEC in the UC. If it is a UC object, the full name of the object should be given in the form of __catalog_name__.__schema_name__.__model_name__.

entity_version: str | None = None

The version of the model in Databricks Model Registry to be served or empty if the entity is a FEATURE_SPEC.

environment_vars: Dict[str, str] | None = None

An object containing a set of optional, user-specified environment variable key-value pairs used for serving this entity. Note: this is an experimental feature and subject to change. Example entity environment variables that refer to Databricks secrets: {“OPENAI_API_KEY”: “{{secrets/my_scope/my_key}}”, “DATABRICKS_TOKEN”: “{{secrets/my_scope2/my_key2}}”}

external_model: ExternalModel | None = None

The external model to be served. NOTE: Only one of external_model and (entity_name, entity_version, workload_size, workload_type, and scale_to_zero_enabled) can be specified, with the latter set being used for custom model serving for a Databricks registered model. When an external_model is present, the served entities list can only have one served_entity object. An existing endpoint with external_model cannot be updated to an endpoint without external_model. If the endpoint is created without external_model, users cannot update it to add external_model later.

instance_profile_arn: str | None = None

ARN of the instance profile that the served entity uses to access AWS resources.

max_provisioned_throughput: int | None = None

The maximum tokens per second that the endpoint can scale up to.

min_provisioned_throughput: int | None = None

The minimum tokens per second that the endpoint can scale down to.

name: str | None = None

The name of a served entity. It must be unique across an endpoint. A served entity name can consist of alphanumeric characters, dashes, and underscores. If not specified for an external model, this field defaults to external_model.name, with ‘.’ and ‘:’ replaced with ‘-’, and if not specified for other entities, it defaults to <entity-name>-<entity-version>.

scale_to_zero_enabled: bool | None = None

Whether the compute resources for the served entity should scale down to zero.

workload_size: str | None = None

The workload size of the served entity. The workload size corresponds to a range of provisioned concurrency that the compute autoscales between. A single unit of provisioned concurrency can process one request at a time. Valid workload sizes are “Small” (4 - 4 provisioned concurrency), “Medium” (8 - 16 provisioned concurrency), and “Large” (16 - 64 provisioned concurrency). If scale-to-zero is enabled, the lower bound of the provisioned concurrency for each workload size is 0.

workload_type: str | None = None

The workload type of the served entity. The workload type selects which type of compute to use in the endpoint. The default value for this parameter is “CPU”. For deep learning workloads, GPU acceleration is available by selecting workload types like GPU_SMALL and others. See the available [GPU types].

[GPU types]: https://docs.databricks.com/machine-learning/model-serving/create-manage-serving-endpoints.html#gpu-workload-types

as_dict() dict

Serializes the ServedEntityInput into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) ServedEntityInput

Deserializes the ServedEntityInput from a dictionary.
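The defaulting rule for name can be sketched as a small helper (hypothetical, not part of the SDK): for an external model the default is external_model.name with '.' and ':' replaced by '-'; otherwise it is <entity-name>-<entity-version>.

```python
def default_served_entity_name(entity_name=None, entity_version=None,
                               external_model_name=None) -> str:
    """Mirror the documented default for ServedEntityInput.name."""
    if external_model_name is not None:
        # For external models, '.' and ':' are replaced with '-'.
        return external_model_name.replace(".", "-").replace(":", "-")
    return f"{entity_name}-{entity_version}"
```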

class databricks.sdk.service.serving.ServedEntityOutput
creation_timestamp: int | None = None

The creation timestamp of the served entity in Unix time.

creator: str | None = None

The email of the user who created the served entity.

entity_name: str | None = None

The name of the entity served. The entity may be a model in the Databricks Model Registry, a model in the Unity Catalog (UC), or a function of type FEATURE_SPEC in the UC. If it is a UC object, the full name of the object is given in the form of __catalog_name__.__schema_name__.__model_name__.

entity_version: str | None = None

The version of the served entity in Databricks Model Registry or empty if the entity is a FEATURE_SPEC.

environment_vars: Dict[str, str] | None = None

An object containing a set of optional, user-specified environment variable key-value pairs used for serving this entity. Note: this is an experimental feature and subject to change. Example entity environment variables that refer to Databricks secrets: {“OPENAI_API_KEY”: “{{secrets/my_scope/my_key}}”, “DATABRICKS_TOKEN”: “{{secrets/my_scope2/my_key2}}”}

external_model: ExternalModel | None = None

The external model that is served. NOTE: Only one of external_model, foundation_model, and (entity_name, entity_version, workload_size, workload_type, and scale_to_zero_enabled) is returned based on the endpoint type.

foundation_model: FoundationModel | None = None

The foundation model that is served. NOTE: Only one of foundation_model, external_model, and (entity_name, entity_version, workload_size, workload_type, and scale_to_zero_enabled) is returned based on the endpoint type.

instance_profile_arn: str | None = None

ARN of the instance profile that the served entity uses to access AWS resources.

max_provisioned_throughput: int | None = None

The maximum tokens per second that the endpoint can scale up to.

min_provisioned_throughput: int | None = None

The minimum tokens per second that the endpoint can scale down to.

name: str | None = None

The name of the served entity.

scale_to_zero_enabled: bool | None = None

Whether the compute resources for the served entity should scale down to zero.

state: ServedModelState | None = None

Information corresponding to the state of the served entity.

workload_size: str | None = None

The workload size of the served entity. The workload size corresponds to a range of provisioned concurrency that the compute autoscales between. A single unit of provisioned concurrency can process one request at a time. Valid workload sizes are “Small” (4 - 4 provisioned concurrency), “Medium” (8 - 16 provisioned concurrency), and “Large” (16 - 64 provisioned concurrency). If scale-to-zero is enabled, the lower bound of the provisioned concurrency for each workload size will be 0.

workload_type: str | None = None

The workload type of the served entity. The workload type selects which type of compute to use in the endpoint. The default value for this parameter is “CPU”. For deep learning workloads, GPU acceleration is available by selecting workload types like GPU_SMALL and others. See the available [GPU types].

[GPU types]: https://docs.databricks.com/machine-learning/model-serving/create-manage-serving-endpoints.html#gpu-workload-types

as_dict() dict

Serializes the ServedEntityOutput into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) ServedEntityOutput

Deserializes the ServedEntityOutput from a dictionary.

class databricks.sdk.service.serving.ServedEntitySpec
entity_name: str | None = None

The name of the entity served. The entity may be a model in the Databricks Model Registry, a model in the Unity Catalog (UC), or a function of type FEATURE_SPEC in the UC. If it is a UC object, the full name of the object is given in the form of __catalog_name__.__schema_name__.__model_name__.

entity_version: str | None = None

The version of the served entity in Databricks Model Registry or empty if the entity is a FEATURE_SPEC.

external_model: ExternalModel | None = None

The external model that is served. NOTE: Only one of external_model, foundation_model, and (entity_name, entity_version) is returned based on the endpoint type.

foundation_model: FoundationModel | None = None

The foundation model that is served. NOTE: Only one of foundation_model, external_model, and (entity_name, entity_version) is returned based on the endpoint type.

name: str | None = None

The name of the served entity.

as_dict() dict

Serializes the ServedEntitySpec into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) ServedEntitySpec

Deserializes the ServedEntitySpec from a dictionary.

class databricks.sdk.service.serving.ServedModelInput
model_name: str

The name of the model in Databricks Model Registry to be served or, if the model resides in Unity Catalog, the full name of the model, in the form of __catalog_name__.__schema_name__.__model_name__.

model_version: str

The version of the model in Databricks Model Registry or Unity Catalog to be served.

workload_size: ServedModelInputWorkloadSize

The workload size of the served model. The workload size corresponds to a range of provisioned concurrency that the compute will autoscale between. A single unit of provisioned concurrency can process one request at a time. Valid workload sizes are “Small” (4 - 4 provisioned concurrency), “Medium” (8 - 16 provisioned concurrency), and “Large” (16 - 64 provisioned concurrency). If scale-to-zero is enabled, the lower bound of the provisioned concurrency for each workload size will be 0.

scale_to_zero_enabled: bool

Whether the compute resources for the served model should scale down to zero.

environment_vars: Dict[str, str] | None = None

An object containing a set of optional, user-specified environment variable key-value pairs used for serving this model. Note: this is an experimental feature and subject to change. Example model environment variables that refer to Databricks secrets: {“OPENAI_API_KEY”: “{{secrets/my_scope/my_key}}”, “DATABRICKS_TOKEN”: “{{secrets/my_scope2/my_key2}}”}

instance_profile_arn: str | None = None

ARN of the instance profile that the served model will use to access AWS resources.

name: str | None = None

The name of a served model. It must be unique across an endpoint. If not specified, this field will default to <model-name>-<model-version>. A served model name can consist of alphanumeric characters, dashes, and underscores.

workload_type: ServedModelInputWorkloadType | None = None

The workload type of the served model. The workload type selects which type of compute to use in the endpoint. The default value for this parameter is “CPU”. For deep learning workloads, GPU acceleration is available by selecting workload types like GPU_SMALL and others. See the available [GPU types].

[GPU types]: https://docs.databricks.com/machine-learning/model-serving/create-manage-serving-endpoints.html#gpu-workload-types

as_dict() dict

Serializes the ServedModelInput into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) ServedModelInput

Deserializes the ServedModelInput from a dictionary.
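The documented constraint on name (only alphanumeric characters, dashes, and underscores) can be checked client-side before submitting a config. A sketch of such a check (this helper is illustrative, not part of the SDK):

```python
import re

# Documented constraint: a served model name may contain only
# alphanumeric characters, dashes, and underscores.
_NAME_RE = re.compile(r"^[A-Za-z0-9_-]+$")

def valid_served_model_name(name: str) -> bool:
    return bool(_NAME_RE.match(name))
```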

class databricks.sdk.service.serving.ServedModelInputWorkloadSize

The workload size of the served model. The workload size corresponds to a range of provisioned concurrency that the compute will autoscale between. A single unit of provisioned concurrency can process one request at a time. Valid workload sizes are “Small” (4 - 4 provisioned concurrency), “Medium” (8 - 16 provisioned concurrency), and “Large” (16 - 64 provisioned concurrency). If scale-to-zero is enabled, the lower bound of the provisioned concurrency for each workload size will be 0.

LARGE = "LARGE"
MEDIUM = "MEDIUM"
SMALL = "SMALL"
class databricks.sdk.service.serving.ServedModelInputWorkloadType

The workload type of the served model. The workload type selects which type of compute to use in the endpoint. The default value for this parameter is “CPU”. For deep learning workloads, GPU acceleration is available by selecting workload types like GPU_SMALL and others. See the available [GPU types]. [GPU types]: https://docs.databricks.com/machine-learning/model-serving/create-manage-serving-endpoints.html#gpu-workload-types

CPU = "CPU"
GPU_LARGE = "GPU_LARGE"
GPU_MEDIUM = "GPU_MEDIUM"
GPU_SMALL = "GPU_SMALL"
MULTIGPU_MEDIUM = "MULTIGPU_MEDIUM"
class databricks.sdk.service.serving.ServedModelOutput
creation_timestamp: int | None = None

The creation timestamp of the served model in Unix time.

creator: str | None = None

The email of the user who created the served model.

environment_vars: Dict[str, str] | None = None

An object containing a set of optional, user-specified environment variable key-value pairs used for serving this model. Note: this is an experimental feature and subject to change. Example model environment variables that refer to Databricks secrets: {“OPENAI_API_KEY”: “{{secrets/my_scope/my_key}}”, “DATABRICKS_TOKEN”: “{{secrets/my_scope2/my_key2}}”}

instance_profile_arn: str | None = None

ARN of the instance profile that the served model will use to access AWS resources.

model_name: str | None = None

The name of the model in Databricks Model Registry or the full name of the model in Unity Catalog.

model_version: str | None = None

The version of the model in Databricks Model Registry or Unity Catalog to be served.

name: str | None = None

The name of the served model.

scale_to_zero_enabled: bool | None = None

Whether the compute resources for the served model should scale down to zero.

state: ServedModelState | None = None

Information corresponding to the state of the served model.

workload_size: str | None = None

The workload size of the served model. The workload size corresponds to a range of provisioned concurrency that the compute will autoscale between. A single unit of provisioned concurrency can process one request at a time. Valid workload sizes are “Small” (4 - 4 provisioned concurrency), “Medium” (8 - 16 provisioned concurrency), and “Large” (16 - 64 provisioned concurrency). If scale-to-zero is enabled, the lower bound of the provisioned concurrency for each workload size will be 0.

workload_type: str | None = None

The workload type of the served model. The workload type selects which type of compute to use in the endpoint. The default value for this parameter is “CPU”. For deep learning workloads, GPU acceleration is available by selecting workload types like GPU_SMALL and others. See the available [GPU types].

[GPU types]: https://docs.databricks.com/machine-learning/model-serving/create-manage-serving-endpoints.html#gpu-workload-types

as_dict() dict

Serializes the ServedModelOutput into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) ServedModelOutput

Deserializes the ServedModelOutput from a dictionary.

class databricks.sdk.service.serving.ServedModelSpec
model_name: str | None = None

The name of the model in Databricks Model Registry or the full name of the model in Unity Catalog.

model_version: str | None = None

The version of the model in Databricks Model Registry or Unity Catalog to be served.

name: str | None = None

The name of the served model.

as_dict() dict

Serializes the ServedModelSpec into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) ServedModelSpec

Deserializes the ServedModelSpec from a dictionary.

class databricks.sdk.service.serving.ServedModelState
deployment: ServedModelStateDeployment | None = None

The state of the served entity deployment. DEPLOYMENT_CREATING indicates that the served entity is not ready yet because the deployment is still being created (e.g., the container image is building, or the model server is deploying for the first time). DEPLOYMENT_RECOVERING indicates that the served entity was previously in a ready state but no longer is, and is attempting to recover. DEPLOYMENT_READY indicates that the served entity is ready to receive traffic. DEPLOYMENT_FAILED indicates that there was an error trying to bring up the served entity (e.g., the container image build failed, or the model server failed to start due to a model loading error). DEPLOYMENT_ABORTED indicates that the deployment was terminated, likely due to a failure in bringing up another served entity under the same endpoint and config version.

deployment_state_message: str | None = None

More information about the state of the served entity, if available.

as_dict() dict

Serializes the ServedModelState into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) ServedModelState

Deserializes the ServedModelState from a dictionary.

class databricks.sdk.service.serving.ServedModelStateDeployment

The state of the served entity deployment. DEPLOYMENT_CREATING indicates that the served entity is not ready yet because the deployment is still being created (e.g., the container image is building, or the model server is deploying for the first time). DEPLOYMENT_RECOVERING indicates that the served entity was previously in a ready state but no longer is, and is attempting to recover. DEPLOYMENT_READY indicates that the served entity is ready to receive traffic. DEPLOYMENT_FAILED indicates that there was an error trying to bring up the served entity (e.g., the container image build failed, or the model server failed to start due to a model loading error). DEPLOYMENT_ABORTED indicates that the deployment was terminated, likely due to a failure in bringing up another served entity under the same endpoint and config version.

ABORTED = "ABORTED"
CREATING = "CREATING"
FAILED = "FAILED"
READY = "READY"
RECOVERING = "RECOVERING"
class databricks.sdk.service.serving.ServerLogsResponse
logs: str

The most recent log lines of the model server processing invocation requests.

as_dict() dict

Serializes the ServerLogsResponse into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) ServerLogsResponse

Deserializes the ServerLogsResponse from a dictionary.

class databricks.sdk.service.serving.ServingEndpoint
config: EndpointCoreConfigSummary | None = None

The config that is currently being served by the endpoint.

creation_timestamp: int | None = None

The timestamp when the endpoint was created in Unix time.

creator: str | None = None

The email of the user who created the serving endpoint.

id: str | None = None

System-generated ID of the endpoint. This is used to refer to the endpoint in the Permissions API.

last_updated_timestamp: int | None = None

The timestamp when the endpoint was last updated by a user in Unix time.

name: str | None = None

The name of the serving endpoint.

state: EndpointState | None = None

Information corresponding to the state of the serving endpoint.

tags: List[EndpointTag] | None = None

Tags attached to the serving endpoint.

task: str | None = None

The task type of the serving endpoint.

as_dict() dict

Serializes the ServingEndpoint into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) ServingEndpoint

Deserializes the ServingEndpoint from a dictionary.

class databricks.sdk.service.serving.ServingEndpointAccessControlRequest
group_name: str | None = None

Name of the group.

permission_level: ServingEndpointPermissionLevel | None = None

Permission level

service_principal_name: str | None = None

Application ID of a service principal.

user_name: str | None = None

Name of the user.

as_dict() dict

Serializes the ServingEndpointAccessControlRequest into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) ServingEndpointAccessControlRequest

Deserializes the ServingEndpointAccessControlRequest from a dictionary.
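Each access-control entry typically names exactly one principal (a group, a service principal, or a user) plus a permission level. A sketch of assembling such entries as plain dictionaries mirroring these field names (the principal values and the one-principal check are this example's assumptions):

```python
def acl_entry(permission_level: str, *, user_name=None,
              group_name=None, service_principal_name=None) -> dict:
    """Build one access-control entry; exactly one principal must be set."""
    principals = {"user_name": user_name, "group_name": group_name,
                  "service_principal_name": service_principal_name}
    set_fields = {k: v for k, v in principals.items() if v is not None}
    if len(set_fields) != 1:
        raise ValueError("specify exactly one principal")
    return {"permission_level": permission_level, **set_fields}
```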

class databricks.sdk.service.serving.ServingEndpointAccessControlResponse
all_permissions: List[ServingEndpointPermission] | None = None

All permissions.

display_name: str | None = None

Display name of the user or service principal.

group_name: str | None = None

Name of the group.

service_principal_name: str | None = None

Name of the service principal.

user_name: str | None = None

Name of the user.

as_dict() dict

Serializes the ServingEndpointAccessControlResponse into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) ServingEndpointAccessControlResponse

Deserializes the ServingEndpointAccessControlResponse from a dictionary.

class databricks.sdk.service.serving.ServingEndpointDetailed
config: EndpointCoreConfigOutput | None = None

The config that is currently being served by the endpoint.

creation_timestamp: int | None = None

The timestamp when the endpoint was created in Unix time.

creator: str | None = None

The email of the user who created the serving endpoint.

id: str | None = None

System-generated ID of the endpoint. This is used to refer to the endpoint in the Permissions API.

last_updated_timestamp: int | None = None

The timestamp when the endpoint was last updated by a user in Unix time.

name: str | None = None

The name of the serving endpoint.

pending_config: EndpointPendingConfig | None = None

The config that the endpoint is attempting to update to.

permission_level: ServingEndpointDetailedPermissionLevel | None = None

The permission level of the principal making the request.

state: EndpointState | None = None

Information corresponding to the state of the serving endpoint.

tags: List[EndpointTag] | None = None

Tags attached to the serving endpoint.

task: str | None = None

The task type of the serving endpoint.

as_dict() dict

Serializes the ServingEndpointDetailed into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) ServingEndpointDetailed

Deserializes the ServingEndpointDetailed from a dictionary.

class databricks.sdk.service.serving.ServingEndpointDetailedPermissionLevel

The permission level of the principal making the request.

CAN_MANAGE = "CAN_MANAGE"
CAN_QUERY = "CAN_QUERY"
CAN_VIEW = "CAN_VIEW"
class databricks.sdk.service.serving.ServingEndpointPermission
inherited: bool | None = None
inherited_from_object: List[str] | None = None
permission_level: ServingEndpointPermissionLevel | None = None

Permission level

as_dict() dict

Serializes the ServingEndpointPermission into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) ServingEndpointPermission

Deserializes the ServingEndpointPermission from a dictionary.

class databricks.sdk.service.serving.ServingEndpointPermissionLevel

Permission level

CAN_MANAGE = "CAN_MANAGE"
CAN_QUERY = "CAN_QUERY"
CAN_VIEW = "CAN_VIEW"
class databricks.sdk.service.serving.ServingEndpointPermissions
access_control_list: List[ServingEndpointAccessControlResponse] | None = None
object_id: str | None = None
object_type: str | None = None
as_dict() dict

Serializes the ServingEndpointPermissions into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) ServingEndpointPermissions

Deserializes the ServingEndpointPermissions from a dictionary.

class databricks.sdk.service.serving.ServingEndpointPermissionsDescription
description: str | None = None
permission_level: ServingEndpointPermissionLevel | None = None

Permission level

as_dict() dict

Serializes the ServingEndpointPermissionsDescription into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) ServingEndpointPermissionsDescription

Deserializes the ServingEndpointPermissionsDescription from a dictionary.

class databricks.sdk.service.serving.ServingEndpointPermissionsRequest
access_control_list: List[ServingEndpointAccessControlRequest] | None = None
serving_endpoint_id: str | None = None

The serving endpoint for which to get or manage permissions.

as_dict() dict

Serializes the ServingEndpointPermissionsRequest into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) ServingEndpointPermissionsRequest

Deserializes the ServingEndpointPermissionsRequest from a dictionary.

class databricks.sdk.service.serving.StopAppRequest
name: str | None = None

The name of the app.

class databricks.sdk.service.serving.StopAppResponse
as_dict() dict

Serializes the StopAppResponse into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) StopAppResponse

Deserializes the StopAppResponse from a dictionary.

class databricks.sdk.service.serving.TrafficConfig
routes: List[Route] | None = None

The list of routes that define traffic to each served entity.

as_dict() dict

Serializes the TrafficConfig into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) TrafficConfig

Deserializes the TrafficConfig from a dictionary.

class databricks.sdk.service.serving.UpdateAppRequest
name: str

The name of the app. The name must contain only lowercase alphanumeric characters and hyphens and be between 2 and 30 characters long. It must be unique within the workspace.

description: str | None = None

The description of the app.

as_dict() dict

Serializes the UpdateAppRequest into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) UpdateAppRequest

Deserializes the UpdateAppRequest from a dictionary.

class databricks.sdk.service.serving.V1ResponseChoiceElement
finish_reason: str | None = None

The finish reason returned by the endpoint.

index: int | None = None

The index of the choice in the __chat or completions__ response.

logprobs: int | None = None

The logprobs returned only by the __completions__ endpoint.

message: ChatMessage | None = None

The message response from the __chat__ endpoint.

text: str | None = None

The text response from the __completions__ endpoint.

as_dict() dict

Serializes the V1ResponseChoiceElement into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) V1ResponseChoiceElement

Deserializes the V1ResponseChoiceElement from a dictionary.
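Since chat endpoints populate message and completions endpoints populate text, extracting the generated text from a choice means checking which field is present. A sketch over plain dicts with these field names (the sample choices are made up):

```python
def choice_text(choice: dict) -> str:
    """Return the generated text from a chat or completions choice."""
    msg = choice.get("message")
    if msg is not None:            # chat endpoints return a message object
        return msg.get("content", "")
    return choice.get("text", "")  # completions endpoints return plain text
```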