Compute

These dataclasses are used in the SDK to represent API requests and responses for services in the databricks.sdk.service.compute module.

class databricks.sdk.service.compute.AddInstanceProfile
instance_profile_arn: str

The AWS ARN of the instance profile to register with Databricks. This field is required.

iam_role_arn: str | None = None

The AWS IAM role ARN of the role associated with the instance profile. This field is required if your role name and instance profile name do not match and you want to use the instance profile with [Databricks SQL Serverless].

Otherwise, this field is optional.

[Databricks SQL Serverless]: https://docs.databricks.com/sql/admin/serverless.html

is_meta_instance_profile: bool | None = None

Boolean flag indicating whether the instance profile should only be used in credential passthrough scenarios. If true, it means the instance profile contains a meta IAM role that could assume a wide range of roles; therefore it should always be used with authorization. This field is optional; the default value is false.

skip_validation: bool | None = None

By default, Databricks validates that it has sufficient permissions to launch instances with the instance profile. This validation uses AWS dry-run mode for the RunInstances API. If validation fails with an error message that does not indicate an IAM-related permission issue (e.g. “Your requested instance type is not supported in your requested availability zone”), you can pass this flag to skip the validation and forcibly add the instance profile.

as_dict() dict

Serializes the AddInstanceProfile into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) AddInstanceProfile

Deserializes the AddInstanceProfile from a dictionary.
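
As a usage sketch, the instance profiles API on WorkspaceClient accepts the same fields as this dataclass; the ARN below is a placeholder and skip_validation is set only to illustrate the flag.

    from databricks.sdk import WorkspaceClient
    from databricks.sdk.service import compute

    w = WorkspaceClient()

    # Placeholder ARN; replace with a real instance profile in your AWS account.
    arn = "arn:aws:iam::123456789012:instance-profile/my-databricks-profile"

    # Register the instance profile, skipping the AWS dry-run validation.
    w.instance_profiles.add(instance_profile_arn=arn, skip_validation=True)

    # The dataclass itself round-trips through plain dictionaries.
    req = compute.AddInstanceProfile(instance_profile_arn=arn, is_meta_instance_profile=False)
    assert compute.AddInstanceProfile.from_dict(req.as_dict()).instance_profile_arn == arn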

class databricks.sdk.service.compute.AddResponse
as_dict() dict

Serializes the AddResponse into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) AddResponse

Deserializes the AddResponse from a dictionary.

class databricks.sdk.service.compute.Adlsgen2Info
destination: str

abfss destination, e.g. abfss://<container-name>@<storage-account-name>.dfs.core.windows.net/<directory-name>.

as_dict() dict

Serializes the Adlsgen2Info into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) Adlsgen2Info

Deserializes the Adlsgen2Info from a dictionary.
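
As a sketch, an Adlsgen2Info can be used wherever an abfss destination is expected, for example as part of an init script location (this assumes the abfss field of InitScriptInfo; the container, storage account, and path are placeholders).

    from databricks.sdk.service import compute

    # Placeholder abfss URI; substitute your own container, storage account and path.
    dest = compute.Adlsgen2Info(
        destination="abfss://init-scripts@mystorageaccount.dfs.core.windows.net/cluster-init"
    )

    # InitScriptInfo accepts an abfss destination alongside dbfs, s3, workspace, etc.
    init_script = compute.InitScriptInfo(abfss=dest)
    print(init_script.as_dict())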

class databricks.sdk.service.compute.AutoScale
max_workers: int | None = None

The maximum number of workers to which the cluster can scale up when overloaded. Note that max_workers must be strictly greater than min_workers.

min_workers: int | None = None

The minimum number of workers to which the cluster can scale down when underutilized. It is also the initial number of workers the cluster will have after creation.

as_dict() dict

Serializes the AutoScale into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) AutoScale

Deserializes the AutoScale from a dictionary.
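
A minimal sketch of an autoscaling range; the bounds below are arbitrary example values.

    from databricks.sdk.service import compute

    # Cluster starts with 2 workers and can scale up to 8 under load.
    autoscale = compute.AutoScale(min_workers=2, max_workers=8)

    # Serializes to a plain dict with both bounds set.
    print(autoscale.as_dict())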

class databricks.sdk.service.compute.AwsAttributes
availability: AwsAvailability | None = None

Availability type used for all subsequent nodes past the first_on_demand ones.

Note: If first_on_demand is zero, this availability type will be used for the entire cluster.

ebs_volume_count: int | None = None

The number of volumes launched for each instance. Users can choose up to 10 volumes. This feature is only enabled for supported node types. Legacy node types cannot specify custom EBS volumes. For node types with no instance store, at least one EBS volume needs to be specified; otherwise, cluster creation will fail.

These EBS volumes will be mounted at /ebs0, /ebs1, etc. Instance store volumes will be mounted at /local_disk0, /local_disk1, etc.

If EBS volumes are attached, Databricks will configure Spark to use only the EBS volumes for scratch storage because heterogeneously sized scratch devices can lead to inefficient disk utilization. If no EBS volumes are attached, Databricks will configure Spark to use instance store volumes.

Please note that if EBS volumes are specified, then the Spark configuration spark.local.dir will be overridden.

ebs_volume_iops: int | None = None

If using gp3 volumes, what IOPS to use for the disk. If this is not set, the maximum performance of a gp2 volume with the same volume size will be used.

ebs_volume_size: int | None = None

The size of each EBS volume (in GiB) launched for each instance. For general purpose SSD, this value must be within the range 100 - 4096. For throughput optimized HDD, this value must be within the range 500 - 4096.

ebs_volume_throughput: int | None = None

If using gp3 volumes, what throughput to use for the disk. If this is not set, the maximum performance of a gp2 volume with the same volume size will be used.

ebs_volume_type: EbsVolumeType | None = None

The type of EBS volumes that will be launched with this cluster.

first_on_demand: int | None = None

The first first_on_demand nodes of the cluster will be placed on on-demand instances. If this value is greater than 0, the cluster driver node in particular will be placed on an on-demand instance. If this value is greater than or equal to the current cluster size, all nodes will be placed on on-demand instances. If this value is less than the current cluster size, first_on_demand nodes will be placed on on-demand instances and the remainder will be placed on availability instances. Note that this value does not affect cluster size and cannot currently be mutated over the lifetime of a cluster.

instance_profile_arn: str | None = None

Nodes for this cluster will only be placed on AWS instances with this instance profile. If omitted, nodes will be placed on instances without an IAM instance profile. The instance profile must have previously been added to the Databricks environment by an account administrator.

This feature may only be available to certain customer plans.

If this field is omitted, we will pull in the default from the conf if it exists.

spot_bid_price_percent: int | None = None

The bid price for AWS spot instances, as a percentage of the corresponding instance type’s on-demand price. For example, if this field is set to 50, and the cluster needs a new r3.xlarge spot instance, then the bid price is half of the price of on-demand r3.xlarge instances. Similarly, if this field is set to 200, the bid price is twice the price of on-demand r3.xlarge instances. If not specified, the default value is 100. When spot instances are requested for this cluster, only spot instances whose bid price percentage matches this field will be considered. Note that, for safety, we enforce this field to be no more than 10000.

The default value and documentation here should be kept consistent with CommonConf.defaultSpotBidPricePercent and CommonConf.maxSpotBidPricePercent.

zone_id: str | None = None

Identifier for the availability zone/datacenter in which the cluster resides. This string will be of a form like “us-west-2a”. The provided availability zone must be in the same region as the Databricks deployment. For example, “us-west-2a” is not a valid zone id if the Databricks deployment resides in the “us-east-1” region. This is an optional field at cluster creation, and if not specified, a default zone will be used. If the zone specified is “auto”, Databricks will try to place the cluster in a zone with high availability and will retry placement in a different AZ if there is not enough capacity. The list of available zones as well as the default value can be found by using the List Zones method.

as_dict() dict

Serializes the AwsAttributes into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) AwsAttributes

Deserializes the AwsAttributes from a dictionary.
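
The sketch below builds an AwsAttributes value with a spot-with-fallback configuration; the specific zone, bid percentage, and volume sizing are placeholder values.

    from databricks.sdk.service import compute

    aws_attrs = compute.AwsAttributes(
        availability=compute.AwsAvailability.SPOT_WITH_FALLBACK,
        first_on_demand=1,            # keep the driver on an on-demand instance
        zone_id="auto",               # let Databricks pick a zone with capacity
        spot_bid_price_percent=100,
        ebs_volume_type=compute.EbsVolumeType.GENERAL_PURPOSE_SSD,
        ebs_volume_count=1,
        ebs_volume_size=100,
    )
    print(aws_attrs.as_dict())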

class databricks.sdk.service.compute.AwsAvailability

Availability type used for all subsequent nodes past the first_on_demand ones. Note: If first_on_demand is zero, this availability type will be used for the entire cluster.

ON_DEMAND = "ON_DEMAND"
SPOT = "SPOT"
SPOT_WITH_FALLBACK = "SPOT_WITH_FALLBACK"
class databricks.sdk.service.compute.AzureAttributes
availability: AzureAvailability | None = None

Availability type used for all subsequent nodes past the first_on_demand ones. Note: If first_on_demand is zero (which only happens on pool clusters), this availability type will be used for the entire cluster.

first_on_demand: int | None = None

The first first_on_demand nodes of the cluster will be placed on on-demand instances. This value should be greater than 0, to make sure the cluster driver node is placed on an on-demand instance. If this value is greater than or equal to the current cluster size, all nodes will be placed on on-demand instances. If this value is less than the current cluster size, first_on_demand nodes will be placed on on-demand instances and the remainder will be placed on availability instances. Note that this value does not affect cluster size and cannot currently be mutated over the lifetime of a cluster.

log_analytics_info: LogAnalyticsInfo | None = None

Defines values necessary to configure and run the Azure Log Analytics agent.

spot_bid_max_price: float | None = None

The max bid price to be used for Azure spot instances. The max price for the bid cannot be higher than the on-demand price of the instance. If not specified, the default value is -1, which specifies that the instance cannot be evicted on the basis of price, and only on the basis of availability. Further, the value should be > 0 or -1.

as_dict() dict

Serializes the AzureAttributes into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) AzureAttributes

Deserializes the AzureAttributes from a dictionary.
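
A sketch of an Azure spot-with-fallback configuration; the values shown are examples, not recommendations.

    from databricks.sdk.service import compute

    azure_attrs = compute.AzureAttributes(
        availability=compute.AzureAvailability.SPOT_WITH_FALLBACK_AZURE,
        first_on_demand=1,        # keep the driver on an on-demand instance
        spot_bid_max_price=-1,    # never evict on price, only on availability
    )
    print(azure_attrs.as_dict())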

class databricks.sdk.service.compute.AzureAvailability

Availability type used for all subsequent nodes past the first_on_demand ones. Note: If first_on_demand is zero (which only happens on pool clusters), this availability type will be used for the entire cluster.

ON_DEMAND_AZURE = "ON_DEMAND_AZURE"
SPOT_AZURE = "SPOT_AZURE"
SPOT_WITH_FALLBACK_AZURE = "SPOT_WITH_FALLBACK_AZURE"
class databricks.sdk.service.compute.CancelCommand
cluster_id: str | None = None
command_id: str | None = None
context_id: str | None = None
as_dict() dict

Serializes the CancelCommand into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) CancelCommand

Deserializes the CancelCommand from a dictionary.

class databricks.sdk.service.compute.CancelResponse
as_dict() dict

Serializes the CancelResponse into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) CancelResponse

Deserializes the CancelResponse from a dictionary.

class databricks.sdk.service.compute.ChangeClusterOwner
cluster_id: str

The cluster whose owner should be changed.

owner_username: str

New owner of the cluster_id after this RPC.

as_dict() dict

Serializes the ChangeClusterOwner into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) ChangeClusterOwner

Deserializes the ChangeClusterOwner from a dictionary.
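
This request type backs the change-owner call on the clusters API; a minimal sketch, assuming the clusters.change_owner method and using placeholder cluster ID and username.

    from databricks.sdk import WorkspaceClient

    w = WorkspaceClient()

    # Reassign ownership of an existing cluster to another workspace user.
    w.clusters.change_owner(
        cluster_id="0123-456789-abcdefgh",        # placeholder cluster ID
        owner_username="new.owner@example.com",   # placeholder username
    )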

class databricks.sdk.service.compute.ChangeClusterOwnerResponse
as_dict() dict

Serializes the ChangeClusterOwnerResponse into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) ChangeClusterOwnerResponse

Deserializes the ChangeClusterOwnerResponse from a dictionary.

class databricks.sdk.service.compute.ClientsTypes
jobs: bool | None = None

With jobs set, the cluster can be used for jobs

notebooks: bool | None = None

With notebooks set, this cluster can be used for notebooks

as_dict() dict

Serializes the ClientsTypes into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) ClientsTypes

Deserializes the ClientsTypes from a dictionary.
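
ClientsTypes is typically used through WorkloadType when creating or editing a cluster; a minimal sketch:

    from databricks.sdk.service import compute

    # Restrict the cluster to jobs workloads only (no interactive notebooks).
    workload = compute.WorkloadType(
        clients=compute.ClientsTypes(jobs=True, notebooks=False)
    )
    print(workload.as_dict())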

class databricks.sdk.service.compute.CloneCluster
source_cluster_id: str

The cluster that is being cloned.

as_dict() dict

Serializes the CloneCluster into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) CloneCluster

Deserializes the CloneCluster from a dictionary.

class databricks.sdk.service.compute.CloudProviderNodeInfo
status: List[CloudProviderNodeStatus] | None = None
as_dict() dict

Serializes the CloudProviderNodeInfo into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) CloudProviderNodeInfo

Deserializes the CloudProviderNodeInfo from a dictionary.

class databricks.sdk.service.compute.CloudProviderNodeStatus
NOT_AVAILABLE_IN_REGION = "NOT_AVAILABLE_IN_REGION"
NOT_ENABLED_ON_SUBSCRIPTION = "NOT_ENABLED_ON_SUBSCRIPTION"
class databricks.sdk.service.compute.ClusterAccessControlRequest
group_name: str | None = None

name of the group

permission_level: ClusterPermissionLevel | None = None

Permission level

service_principal_name: str | None = None

application ID of a service principal

user_name: str | None = None

name of the user

as_dict() dict

Serializes the ClusterAccessControlRequest into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) ClusterAccessControlRequest

Deserializes the ClusterAccessControlRequest from a dictionary.

class databricks.sdk.service.compute.ClusterAccessControlResponse
all_permissions: List[ClusterPermission] | None = None

All permissions.

display_name: str | None = None

Display name of the user or service principal.

group_name: str | None = None

name of the group

service_principal_name: str | None = None

Name of the service principal.

user_name: str | None = None

name of the user

as_dict() dict

Serializes the ClusterAccessControlResponse into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) ClusterAccessControlResponse

Deserializes the ClusterAccessControlResponse from a dictionary.

class databricks.sdk.service.compute.ClusterAttributes
spark_version: str

The Spark version of the cluster, e.g. 3.3.x-scala2.11. A list of available Spark versions can be retrieved by using the :method:clusters/sparkVersions API call.

autotermination_minutes: int | None = None

Automatically terminates the cluster after it is inactive for this time in minutes. If not set, this cluster will not be automatically terminated. If specified, the threshold must be between 10 and 10000 minutes. Users can also set this value to 0 to explicitly disable automatic termination.

aws_attributes: AwsAttributes | None = None

Attributes related to clusters running on Amazon Web Services. If not specified at cluster creation, a set of default values will be used.

azure_attributes: AzureAttributes | None = None

Attributes related to clusters running on Microsoft Azure. If not specified at cluster creation, a set of default values will be used.

cluster_log_conf: ClusterLogConf | None = None

The configuration for delivering spark logs to a long-term storage destination. Two kinds of destinations (dbfs and s3) are supported. Only one destination can be specified for one cluster. If the conf is given, the logs will be delivered to the destination every 5 mins. The destination of driver logs is $destination/$clusterId/driver, while the destination of executor logs is $destination/$clusterId/executor.

cluster_name: str | None = None

Cluster name requested by the user. This doesn’t have to be unique. If not specified at creation, the cluster name will be an empty string.

cluster_source: ClusterSource | None = None

Determines whether the cluster was created by a user through the UI, created by the Databricks Jobs Scheduler, or through an API request. This is the same as cluster_creator, but read only.

custom_tags: Dict[str, str] | None = None

Additional tags for cluster resources. Databricks will tag all cluster resources (e.g., AWS instances and EBS volumes) with these tags in addition to default_tags. Notes:

  • Currently, Databricks allows at most 45 custom tags

  • Clusters can only reuse cloud resources if the resources’ tags are a subset of the cluster tags

data_security_mode: DataSecurityMode | None = None

Data security mode decides what data governance model to use when accessing data from a cluster.

  • NONE: No security isolation for multiple users sharing the cluster. Data governance features are not available in this mode.

  • SINGLE_USER: A secure cluster that can only be exclusively used by a single user specified in single_user_name. Most programming languages, cluster features and data governance features are available in this mode.

  • USER_ISOLATION: A secure cluster that can be shared by multiple users. Cluster users are fully isolated so that they cannot see each other’s data and credentials. Most data governance features are supported in this mode. But programming languages and cluster features might be limited.

  • LEGACY_TABLE_ACL: This mode is for users migrating from legacy Table ACL clusters.

  • LEGACY_PASSTHROUGH: This mode is for users migrating from legacy Passthrough on high concurrency clusters.

  • LEGACY_SINGLE_USER: This mode is for users migrating from legacy Passthrough on standard clusters.

docker_image: DockerImage | None = None
driver_instance_pool_id: str | None = None

The optional ID of the instance pool to which the cluster driver belongs. The cluster uses the instance pool with id (instance_pool_id) if the driver pool is not assigned.

driver_node_type_id: str | None = None

The node type of the Spark driver. Note that this field is optional; if unset, the driver node type will be set as the same value as node_type_id defined above.

enable_elastic_disk: bool | None = None

Autoscaling Local Storage: when enabled, this cluster will dynamically acquire additional disk space when its Spark workers are running low on disk space. This feature requires specific AWS permissions to function correctly - refer to the User Guide for more details.

enable_local_disk_encryption: bool | None = None

Whether to enable LUKS on cluster VMs’ local disks

gcp_attributes: GcpAttributes | None = None

Attributes related to clusters running on Google Cloud Platform. If not specified at cluster creation, a set of default values will be used.

init_scripts: List[InitScriptInfo] | None = None

The configuration for storing init scripts. Any number of destinations can be specified. The scripts are executed sequentially in the order provided. If cluster_log_conf is specified, init script logs are sent to <destination>/<cluster-ID>/init_scripts.

instance_pool_id: str | None = None

The optional ID of the instance pool to which the cluster belongs.

node_type_id: str | None = None

This field encodes, through a single value, the resources available to each of the Spark nodes in this cluster. For example, the Spark nodes can be provisioned and optimized for memory or compute intensive workloads. A list of available node types can be retrieved by using the :method:clusters/listNodeTypes API call.

policy_id: str | None = None

The ID of the cluster policy used to create the cluster if applicable.

runtime_engine: RuntimeEngine | None = None

Determines which runtime engine to use, e.g. Standard vs. Photon. If unspecified, the runtime engine is inferred from spark_version.

single_user_name: str | None = None

Single user name if data_security_mode is SINGLE_USER

spark_conf: Dict[str, str] | None = None

An object containing a set of optional, user-specified Spark configuration key-value pairs. Users can also pass in a string of extra JVM options to the driver and the executors via spark.driver.extraJavaOptions and spark.executor.extraJavaOptions respectively.

spark_env_vars: Dict[str, str] | None = None

An object containing a set of optional, user-specified environment variable key-value pairs. Please note that a key-value pair of the form (X,Y) will be exported as is (i.e., export X=’Y’) while launching the driver and workers.

In order to specify an additional set of SPARK_DAEMON_JAVA_OPTS, we recommend appending them to $SPARK_DAEMON_JAVA_OPTS as shown in the example below. This ensures that all default Databricks-managed environment variables are included as well.

Example Spark environment variables: {“SPARK_WORKER_MEMORY”: “28000m”, “SPARK_LOCAL_DIRS”: “/local_disk0”} or {“SPARK_DAEMON_JAVA_OPTS”: “$SPARK_DAEMON_JAVA_OPTS -Dspark.shuffle.service.enabled=true”}

ssh_public_keys: List[str] | None = None

SSH public key contents that will be added to each Spark node in this cluster. The corresponding private keys can be used to login with the user name ubuntu on port 2200. Up to 10 keys can be specified.

workload_type: WorkloadType | None = None
as_dict() dict

Serializes the ClusterAttributes into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) ClusterAttributes

Deserializes the ClusterAttributes from a dictionary.

class databricks.sdk.service.compute.ClusterDetails
autoscale: AutoScale | None = None

Parameters needed in order to automatically scale clusters up and down based on load. Note: autoscaling works best with DB runtime versions 3.0 or later.

autotermination_minutes: int | None = None

Automatically terminates the cluster after it is inactive for this time in minutes. If not set, this cluster will not be automatically terminated. If specified, the threshold must be between 10 and 10000 minutes. Users can also set this value to 0 to explicitly disable automatic termination.

aws_attributes: AwsAttributes | None = None

Attributes related to clusters running on Amazon Web Services. If not specified at cluster creation, a set of default values will be used.

azure_attributes: AzureAttributes | None = None

Attributes related to clusters running on Microsoft Azure. If not specified at cluster creation, a set of default values will be used.

cluster_cores: float | None = None

Number of CPU cores available for this cluster. Note that this can be fractional, e.g. 7.5 cores, since certain node types are configured to share cores between Spark nodes on the same instance.

cluster_id: str | None = None

Canonical identifier for the cluster. This id is retained during cluster restarts and resizes, while each new cluster has a globally unique id.

cluster_log_conf: ClusterLogConf | None = None

The configuration for delivering spark logs to a long-term storage destination. Two kinds of destinations (dbfs and s3) are supported. Only one destination can be specified for one cluster. If the conf is given, the logs will be delivered to the destination every 5 mins. The destination of driver logs is $destination/$clusterId/driver, while the destination of executor logs is $destination/$clusterId/executor.

cluster_log_status: LogSyncStatus | None = None

Cluster log delivery status.

cluster_memory_mb: int | None = None

Total amount of cluster memory, in megabytes

cluster_name: str | None = None

Cluster name requested by the user. This doesn’t have to be unique. If not specified at creation, the cluster name will be an empty string.

cluster_source: ClusterSource | None = None

Determines whether the cluster was created by a user through the UI, created by the Databricks Jobs Scheduler, or through an API request. This is the same as cluster_creator, but read only.

creator_user_name: str | None = None

Creator user name. The field won’t be included in the response if the user has already been deleted.

custom_tags: Dict[str, str] | None = None

Additional tags for cluster resources. Databricks will tag all cluster resources (e.g., AWS instances and EBS volumes) with these tags in addition to default_tags. Notes:

  • Currently, Databricks allows at most 45 custom tags

  • Clusters can only reuse cloud resources if the resources’ tags are a subset of the cluster tags

data_security_mode: DataSecurityMode | None = None

Data security mode decides what data governance model to use when accessing data from a cluster.

  • NONE: No security isolation for multiple users sharing the cluster. Data governance features are not available in this mode.

  • SINGLE_USER: A secure cluster that can only be exclusively used by a single user specified in single_user_name. Most programming languages, cluster features and data governance features are available in this mode.

  • USER_ISOLATION: A secure cluster that can be shared by multiple users. Cluster users are fully isolated so that they cannot see each other’s data and credentials. Most data governance features are supported in this mode. But programming languages and cluster features might be limited.

  • LEGACY_TABLE_ACL: This mode is for users migrating from legacy Table ACL clusters.

  • LEGACY_PASSTHROUGH: This mode is for users migrating from legacy Passthrough on high concurrency clusters.

  • LEGACY_SINGLE_USER: This mode is for users migrating from legacy Passthrough on standard clusters.

default_tags: Dict[str, str] | None = None

Tags that are added by Databricks regardless of any custom_tags, including:

  • Vendor: Databricks

  • Creator: <username_of_creator>

  • ClusterName: <name_of_cluster>

  • ClusterId: <id_of_cluster>

  • Name: <Databricks internal use>

docker_image: DockerImage | None = None
driver: SparkNode | None = None

Node on which the Spark driver resides. The driver node contains the Spark master and the <Databricks> application that manages the per-notebook Spark REPLs.

driver_instance_pool_id: str | None = None

The optional ID of the instance pool to which the cluster driver belongs. The cluster uses the instance pool with id (instance_pool_id) if the driver pool is not assigned.

driver_node_type_id: str | None = None

The node type of the Spark driver. Note that this field is optional; if unset, the driver node type will be set as the same value as node_type_id defined above.

enable_elastic_disk: bool | None = None

Autoscaling Local Storage: when enabled, this cluster will dynamically acquire additional disk space when its Spark workers are running low on disk space. This feature requires specific AWS permissions to function correctly - refer to the User Guide for more details.

enable_local_disk_encryption: bool | None = None

Whether to enable LUKS on cluster VMs’ local disks

executors: List[SparkNode] | None = None

Nodes on which the Spark executors reside.

gcp_attributes: GcpAttributes | None = None

Attributes related to clusters running on Google Cloud Platform. If not specified at cluster creation, a set of default values will be used.

init_scripts: List[InitScriptInfo] | None = None

The configuration for storing init scripts. Any number of destinations can be specified. The scripts are executed sequentially in the order provided. If cluster_log_conf is specified, init script logs are sent to <destination>/<cluster-ID>/init_scripts.

instance_pool_id: str | None = None

The optional ID of the instance pool to which the cluster belongs.

jdbc_port: int | None = None

Port on which the Spark JDBC server is listening in the driver node. No service will be listening on this port in executor nodes.

last_restarted_time: int | None = None

the timestamp that the cluster was started/restarted

last_state_loss_time: int | None = None

Time when the cluster driver last lost its state (due to a restart or driver failure).

node_type_id: str | None = None

This field encodes, through a single value, the resources available to each of the Spark nodes in this cluster. For example, the Spark nodes can be provisioned and optimized for memory or compute intensive workloads. A list of available node types can be retrieved by using the :method:clusters/listNodeTypes API call.

num_workers: int | None = None

Number of worker nodes that this cluster should have. A cluster has one Spark Driver and num_workers Executors for a total of num_workers + 1 Spark nodes.

Note: When reading the properties of a cluster, this field reflects the desired number of workers rather than the actual current number of workers. For instance, if a cluster is resized from 5 to 10 workers, this field will immediately be updated to reflect the target size of 10 workers, whereas the workers listed in spark_info will gradually increase from 5 to 10 as the new nodes are provisioned.

policy_id: str | None = None

The ID of the cluster policy used to create the cluster if applicable.

runtime_engine: RuntimeEngine | None = None

Determines which runtime engine to use, e.g. Standard vs. Photon. If unspecified, the runtime engine is inferred from spark_version.

single_user_name: str | None = None

Single user name if data_security_mode is SINGLE_USER

spark_conf: Dict[str, str] | None = None

An object containing a set of optional, user-specified Spark configuration key-value pairs. Users can also pass in a string of extra JVM options to the driver and the executors via spark.driver.extraJavaOptions and spark.executor.extraJavaOptions respectively.

spark_context_id: int | None = None

A canonical SparkContext identifier. This value does change when the Spark driver restarts. The pair (cluster_id, spark_context_id) is a globally unique identifier over all Spark contexts.

spark_env_vars: Dict[str, str] | None = None

An object containing a set of optional, user-specified environment variable key-value pairs. Please note that a key-value pair of the form (X,Y) will be exported as is (i.e., export X=’Y’) while launching the driver and workers.

In order to specify an additional set of SPARK_DAEMON_JAVA_OPTS, we recommend appending them to $SPARK_DAEMON_JAVA_OPTS as shown in the example below. This ensures that all default Databricks-managed environment variables are included as well.

Example Spark environment variables: {“SPARK_WORKER_MEMORY”: “28000m”, “SPARK_LOCAL_DIRS”: “/local_disk0”} or {“SPARK_DAEMON_JAVA_OPTS”: “$SPARK_DAEMON_JAVA_OPTS -Dspark.shuffle.service.enabled=true”}

spark_version: str | None = None

The Spark version of the cluster, e.g. 3.3.x-scala2.11. A list of available Spark versions can be retrieved by using the :method:clusters/sparkVersions API call.

spec: CreateCluster | None = None

spec contains a snapshot of the field values that were used to create or edit this cluster. The contents of spec can be used in the body of a create cluster request. This field might not be populated for older clusters. Note: not included in the response of the ListClusters API.

ssh_public_keys: List[str] | None = None

SSH public key contents that will be added to each Spark node in this cluster. The corresponding private keys can be used to login with the user name ubuntu on port 2200. Up to 10 keys can be specified.

start_time: int | None = None

Time (in epoch milliseconds) when the cluster creation request was received (when the cluster entered a PENDING state).

state: State | None = None

Current state of the cluster.

state_message: str | None = None

A message associated with the most recent state transition (e.g., the reason why the cluster entered a TERMINATED state).

terminated_time: int | None = None

Time (in epoch milliseconds) when the cluster was terminated, if applicable.

termination_reason: TerminationReason | None = None

Information about why the cluster was terminated. This field only appears when the cluster is in a TERMINATING or TERMINATED state.

workload_type: WorkloadType | None = None
as_dict() dict

Serializes the ClusterDetails into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) ClusterDetails

Deserializes the ClusterDetails from a dictionary.
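
ClusterDetails is what the clusters API returns when a cluster is described; a sketch of reading a few fields (the cluster ID is a placeholder):

    from databricks.sdk import WorkspaceClient

    w = WorkspaceClient()

    # Fetch the current state of an existing cluster (placeholder ID).
    details = w.clusters.get(cluster_id="0123-456789-abcdefgh")

    print(details.cluster_name, details.state, details.spark_version)
    if details.autoscale:
        print("scales between", details.autoscale.min_workers,
              "and", details.autoscale.max_workers, "workers")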

class databricks.sdk.service.compute.ClusterEvent
cluster_id: str

Canonical identifier for the cluster.

data_plane_event_details: DataPlaneEventDetails | None = None

<needs content added>

details: EventDetails | None = None

<needs content added>

timestamp: int | None = None

The timestamp when the event occurred, stored as the number of milliseconds since the Unix epoch. If not provided, this will be assigned by the Timeline service.

type: EventType | None = None
as_dict() dict

Serializes the ClusterEvent into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) ClusterEvent

Deserializes the ClusterEvent from a dictionary.
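
Cluster events are usually consumed through the paginated events call on the clusters API rather than constructed directly; a sketch that prints recent events (the cluster ID is a placeholder):

    from databricks.sdk import WorkspaceClient

    w = WorkspaceClient()

    # Iterate over the event stream for a cluster; pagination is handled by the SDK.
    for event in w.clusters.events(cluster_id="0123-456789-abcdefgh"):
        print(event.timestamp, event.type, event.details)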

class databricks.sdk.service.compute.ClusterLibraryStatuses
cluster_id: str | None = None

Unique identifier for the cluster.

library_statuses: List[LibraryFullStatus] | None = None

Status of all libraries on the cluster.

as_dict() dict

Serializes the ClusterLibraryStatuses into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) ClusterLibraryStatuses

Deserializes the ClusterLibraryStatuses from a dictionary.

class databricks.sdk.service.compute.ClusterLogConf
dbfs: DbfsStorageInfo | None = None

destination needs to be provided, e.g. { “dbfs” : { “destination” : “dbfs:/home/cluster_log” } }

s3: S3StorageInfo | None = None

destination and either the region or endpoint need to be provided, e.g. { “s3”: { “destination” : “s3://cluster_log_bucket/prefix”, “region” : “us-west-2” } }. The cluster IAM role is used to access S3; please make sure the cluster IAM role in instance_profile_arn has permission to write data to the S3 destination.

as_dict() dict

Serializes the ClusterLogConf into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) ClusterLogConf

Deserializes the ClusterLogConf from a dictionary.
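
A sketch of both supported destinations; exactly one of them would be set on a real cluster, and the paths, bucket, and region are placeholders.

    from databricks.sdk.service import compute

    # Deliver logs to DBFS...
    dbfs_logs = compute.ClusterLogConf(
        dbfs=compute.DbfsStorageInfo(destination="dbfs:/home/cluster_log")
    )

    # ...or to S3 (the cluster's instance profile must be able to write here).
    s3_logs = compute.ClusterLogConf(
        s3=compute.S3StorageInfo(
            destination="s3://cluster_log_bucket/prefix",
            region="us-west-2",
        )
    )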

class databricks.sdk.service.compute.ClusterPermission
inherited: bool | None = None
inherited_from_object: List[str] | None = None
permission_level: ClusterPermissionLevel | None = None

Permission level

as_dict() dict

Serializes the ClusterPermission into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) ClusterPermission

Deserializes the ClusterPermission from a dictionary.

class databricks.sdk.service.compute.ClusterPermissionLevel

Permission level

CAN_ATTACH_TO = "CAN_ATTACH_TO"
CAN_MANAGE = "CAN_MANAGE"
CAN_RESTART = "CAN_RESTART"
class databricks.sdk.service.compute.ClusterPermissions
access_control_list: List[ClusterAccessControlResponse] | None = None
object_id: str | None = None
object_type: str | None = None
as_dict() dict

Serializes the ClusterPermissions into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) ClusterPermissions

Deserializes the ClusterPermissions from a dictionary.

class databricks.sdk.service.compute.ClusterPermissionsDescription
description: str | None = None
permission_level: ClusterPermissionLevel | None = None

Permission level

as_dict() dict

Serializes the ClusterPermissionsDescription into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) ClusterPermissionsDescription

Deserializes the ClusterPermissionsDescription from a dictionary.

class databricks.sdk.service.compute.ClusterPermissionsRequest
access_control_list: List[ClusterAccessControlRequest] | None = None
cluster_id: str | None = None

The cluster for which to get or manage permissions.

as_dict() dict

Serializes the ClusterPermissionsRequest into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) ClusterPermissionsRequest

Deserializes the ClusterPermissionsRequest from a dictionary.
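
In recent SDK versions cluster permissions are managed through generated methods such as clusters.set_permissions and clusters.update_permissions, which take the fields of this request type; a sketch under that assumption, with placeholder IDs and group name.

    from databricks.sdk import WorkspaceClient
    from databricks.sdk.service import compute

    w = WorkspaceClient()

    # Grant a group restart rights on a cluster (placeholder cluster ID and group name).
    w.clusters.set_permissions(
        cluster_id="0123-456789-abcdefgh",
        access_control_list=[
            compute.ClusterAccessControlRequest(
                group_name="data-engineers",
                permission_level=compute.ClusterPermissionLevel.CAN_RESTART,
            )
        ],
    )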

class databricks.sdk.service.compute.ClusterPolicyAccessControlRequest
group_name: str | None = None

name of the group

permission_level: ClusterPolicyPermissionLevel | None = None

Permission level

service_principal_name: str | None = None

application ID of a service principal

user_name: str | None = None

name of the user

as_dict() dict

Serializes the ClusterPolicyAccessControlRequest into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) ClusterPolicyAccessControlRequest

Deserializes the ClusterPolicyAccessControlRequest from a dictionary.

class databricks.sdk.service.compute.ClusterPolicyAccessControlResponse
all_permissions: List[ClusterPolicyPermission] | None = None

All permissions.

display_name: str | None = None

Display name of the user or service principal.

group_name: str | None = None

name of the group

service_principal_name: str | None = None

Name of the service principal.

user_name: str | None = None

name of the user

as_dict() dict

Serializes the ClusterPolicyAccessControlResponse into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) ClusterPolicyAccessControlResponse

Deserializes the ClusterPolicyAccessControlResponse from a dictionary.

class databricks.sdk.service.compute.ClusterPolicyPermission
inherited: bool | None = None
inherited_from_object: List[str] | None = None
permission_level: ClusterPolicyPermissionLevel | None = None

Permission level

as_dict() dict

Serializes the ClusterPolicyPermission into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) ClusterPolicyPermission

Deserializes the ClusterPolicyPermission from a dictionary.

class databricks.sdk.service.compute.ClusterPolicyPermissionLevel

Permission level

CAN_USE = "CAN_USE"
class databricks.sdk.service.compute.ClusterPolicyPermissions
access_control_list: List[ClusterPolicyAccessControlResponse] | None = None
object_id: str | None = None
object_type: str | None = None
as_dict() dict

Serializes the ClusterPolicyPermissions into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) ClusterPolicyPermissions

Deserializes the ClusterPolicyPermissions from a dictionary.

class databricks.sdk.service.compute.ClusterPolicyPermissionsDescription
description: str | None = None
permission_level: ClusterPolicyPermissionLevel | None = None

Permission level

as_dict() dict

Serializes the ClusterPolicyPermissionsDescription into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) ClusterPolicyPermissionsDescription

Deserializes the ClusterPolicyPermissionsDescription from a dictionary.

class databricks.sdk.service.compute.ClusterPolicyPermissionsRequest
access_control_list: List[ClusterPolicyAccessControlRequest] | None = None
cluster_policy_id: str | None = None

The cluster policy for which to get or manage permissions.

as_dict() dict

Serializes the ClusterPolicyPermissionsRequest into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) ClusterPolicyPermissionsRequest

Deserializes the ClusterPolicyPermissionsRequest from a dictionary.

class databricks.sdk.service.compute.ClusterSize
autoscale: AutoScale | None = None

Parameters needed in order to automatically scale clusters up and down based on load. Note: autoscaling works best with DB runtime versions 3.0 or later.

num_workers: int | None = None

Number of worker nodes that this cluster should have. A cluster has one Spark Driver and num_workers Executors for a total of num_workers + 1 Spark nodes.

Note: When reading the properties of a cluster, this field reflects the desired number of workers rather than the actual current number of workers. For instance, if a cluster is resized from 5 to 10 workers, this field will immediately be updated to reflect the target size of 10 workers, whereas the workers listed in spark_info will gradually increase from 5 to 10 as the new nodes are provisioned.

as_dict() dict

Serializes the ClusterSize into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) ClusterSize

Deserializes the ClusterSize from a dictionary.

class databricks.sdk.service.compute.ClusterSource

Determines whether the cluster was created by a user through the UI, created by the Databricks Jobs Scheduler, or through an API request. This is the same as cluster_creator, but read only.

API = "API"
JOB = "JOB"
MODELS = "MODELS"
PIPELINE = "PIPELINE"
PIPELINE_MAINTENANCE = "PIPELINE_MAINTENANCE"
SQL = "SQL"
UI = "UI"
class databricks.sdk.service.compute.ClusterSpec
apply_policy_default_values: bool | None = None
autoscale: AutoScale | None = None

Parameters needed in order to automatically scale clusters up and down based on load. Note: autoscaling works best with DB runtime versions 3.0 or later.

autotermination_minutes: int | None = None

Automatically terminates the cluster after it is inactive for this time in minutes. If not set, this cluster will not be automatically terminated. If specified, the threshold must be between 10 and 10000 minutes. Users can also set this value to 0 to explicitly disable automatic termination.

aws_attributes: AwsAttributes | None = None

Attributes related to clusters running on Amazon Web Services. If not specified at cluster creation, a set of default values will be used.

azure_attributes: AzureAttributes | None = None

Attributes related to clusters running on Microsoft Azure. If not specified at cluster creation, a set of default values will be used.

clone_from: CloneCluster | None = None

When specified, this clones libraries from a source cluster during the creation of a new cluster.

cluster_log_conf: ClusterLogConf | None = None

The configuration for delivering spark logs to a long-term storage destination. Two kinds of destinations (dbfs and s3) are supported. Only one destination can be specified for one cluster. If the conf is given, the logs will be delivered to the destination every 5 mins. The destination of driver logs is $destination/$clusterId/driver, while the destination of executor logs is $destination/$clusterId/executor.

cluster_name: str | None = None

Cluster name requested by the user. This doesn’t have to be unique. If not specified at creation, the cluster name will be an empty string.

cluster_source: ClusterSource | None = None

Determines whether the cluster was created by a user through the UI, created by the Databricks Jobs Scheduler, or through an API request. This is the same as cluster_creator, but read only.

custom_tags: Dict[str, str] | None = None

Additional tags for cluster resources. Databricks will tag all cluster resources (e.g., AWS instances and EBS volumes) with these tags in addition to default_tags. Notes:

  • Currently, Databricks allows at most 45 custom tags

  • Clusters can only reuse cloud resources if the resources’ tags are a subset of the cluster tags

data_security_mode: DataSecurityMode | None = None

Data security mode decides what data governance model to use when accessing data from a cluster.

  • NONE: No security isolation for multiple users sharing the cluster. Data governance features are not available in this mode.

  • SINGLE_USER: A secure cluster that can only be exclusively used by a single user specified in single_user_name. Most programming languages, cluster features and data governance features are available in this mode.

  • USER_ISOLATION: A secure cluster that can be shared by multiple users. Cluster users are fully isolated so that they cannot see each other’s data and credentials. Most data governance features are supported in this mode. But programming languages and cluster features might be limited.

  • LEGACY_TABLE_ACL: This mode is for users migrating from legacy Table ACL clusters.

  • LEGACY_PASSTHROUGH: This mode is for users migrating from legacy Passthrough on high concurrency clusters.

  • LEGACY_SINGLE_USER: This mode is for users migrating from legacy Passthrough on standard clusters.

docker_image: DockerImage | None = None
driver_instance_pool_id: str | None = None

The optional ID of the instance pool to which the cluster driver belongs. The cluster uses the instance pool with id (instance_pool_id) if the driver pool is not assigned.

driver_node_type_id: str | None = None

The node type of the Spark driver. Note that this field is optional; if unset, the driver node type will be set as the same value as node_type_id defined above.

enable_elastic_disk: bool | None = None

Autoscaling Local Storage: when enabled, this cluster will dynamically acquire additional disk space when its Spark workers are running low on disk space. This feature requires specific AWS permissions to function correctly - refer to the User Guide for more details.

enable_local_disk_encryption: bool | None = None

Whether to enable LUKS on cluster VMs’ local disks

gcp_attributes: GcpAttributes | None = None

Attributes related to clusters running on Google Cloud Platform. If not specified at cluster creation, a set of default values will be used.

init_scripts: List[InitScriptInfo] | None = None

The configuration for storing init scripts. Any number of destinations can be specified. The scripts are executed sequentially in the order provided. If cluster_log_conf is specified, init script logs are sent to <destination>/<cluster-ID>/init_scripts.

instance_pool_id: str | None = None

The optional ID of the instance pool to which the cluster belongs.

node_type_id: str | None = None

This field encodes, through a single value, the resources available to each of the Spark nodes in this cluster. For example, the Spark nodes can be provisioned and optimized for memory or compute intensive workloads. A list of available node types can be retrieved by using the :method:clusters/listNodeTypes API call.

num_workers: int | None = None

Number of worker nodes that this cluster should have. A cluster has one Spark Driver and num_workers Executors for a total of num_workers + 1 Spark nodes.

Note: When reading the properties of a cluster, this field reflects the desired number of workers rather than the actual current number of workers. For instance, if a cluster is resized from 5 to 10 workers, this field will immediately be updated to reflect the target size of 10 workers, whereas the workers listed in spark_info will gradually increase from 5 to 10 as the new nodes are provisioned.

policy_id: str | None = None

The ID of the cluster policy used to create the cluster if applicable.

runtime_engine: RuntimeEngine | None = None

Determines which runtime engine to use, e.g. Standard vs. Photon. If unspecified, the runtime engine is inferred from spark_version.

single_user_name: str | None = None

Single user name if data_security_mode is SINGLE_USER

spark_conf: Dict[str, str] | None = None

An object containing a set of optional, user-specified Spark configuration key-value pairs. Users can also pass in a string of extra JVM options to the driver and the executors via spark.driver.extraJavaOptions and spark.executor.extraJavaOptions respectively.

spark_env_vars: Dict[str, str] | None = None

An object containing a set of optional, user-specified environment variable key-value pairs. Please note that a key-value pair of the form (X,Y) will be exported as is (i.e., export X=’Y’) while launching the driver and workers.

In order to specify an additional set of SPARK_DAEMON_JAVA_OPTS, we recommend appending them to $SPARK_DAEMON_JAVA_OPTS as shown in the example below. This ensures that all default Databricks-managed environment variables are included as well.

Example Spark environment variables: {“SPARK_WORKER_MEMORY”: “28000m”, “SPARK_LOCAL_DIRS”: “/local_disk0”} or {“SPARK_DAEMON_JAVA_OPTS”: “$SPARK_DAEMON_JAVA_OPTS -Dspark.shuffle.service.enabled=true”}

spark_version: str | None = None

The Spark version of the cluster, e.g. 3.3.x-scala2.11. A list of available Spark versions can be retrieved by using the :method:clusters/sparkVersions API call.

ssh_public_keys: List[str] | None = None

SSH public key contents that will be added to each Spark node in this cluster. The corresponding private keys can be used to login with the user name ubuntu on port 2200. Up to 10 keys can be specified.

workload_type: WorkloadType | None = None
as_dict() dict

Serializes the ClusterSpec into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) ClusterSpec

Deserializes the ClusterSpec from a dictionary.

class databricks.sdk.service.compute.ClusterStatusResponse
cluster_id: str | None = None

Unique identifier for the cluster.

library_statuses: List[LibraryFullStatus] | None = None

Status of all libraries on the cluster.

as_dict() dict

Serializes the ClusterStatusResponse into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) ClusterStatusResponse

Deserializes the ClusterStatusResponse from a dictionary.

class databricks.sdk.service.compute.Command
cluster_id: str | None = None

Running cluster id

command: str | None = None

Executable code

context_id: str | None = None

Running context id

language: Language | None = None
as_dict() dict

Serializes the Command into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) Command

Deserializes the Command from a dictionary.

class databricks.sdk.service.compute.CommandStatus
CANCELLED = "CANCELLED"
CANCELLING = "CANCELLING"
ERROR = "ERROR"
FINISHED = "FINISHED"
QUEUED = "QUEUED"
RUNNING = "RUNNING"
class databricks.sdk.service.compute.CommandStatusResponse
id: str | None = None
results: Results | None = None
status: CommandStatus | None = None
as_dict() dict

Serializes the CommandStatusResponse into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) CommandStatusResponse

Deserializes the CommandStatusResponse from a dictionary.

class databricks.sdk.service.compute.ContextStatus
ERROR = "ERROR"
PENDING = "PENDING"
RUNNING = "RUNNING"
class databricks.sdk.service.compute.ContextStatusResponse
id: str | None = None
status: ContextStatus | None = None
as_dict() dict

Serializes the ContextStatusResponse into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) ContextStatusResponse

Deserializes the ContextStatusResponse from a dictionary.

class databricks.sdk.service.compute.CreateCluster
spark_version: str

The Spark version of the cluster, e.g. 3.3.x-scala2.11. A list of available Spark versions can be retrieved by using the :method:clusters/sparkVersions API call.

apply_policy_default_values: bool | None = None
autoscale: AutoScale | None = None

Parameters needed in order to automatically scale clusters up and down based on load. Note: autoscaling works best with DB runtime versions 3.0 or later.

autotermination_minutes: int | None = None

Automatically terminates the cluster after it is inactive for this time in minutes. If not set, this cluster will not be automatically terminated. If specified, the threshold must be between 10 and 10000 minutes. Users can also set this value to 0 to explicitly disable automatic termination.

aws_attributes: AwsAttributes | None = None

Attributes related to clusters running on Amazon Web Services. If not specified at cluster creation, a set of default values will be used.

azure_attributes: AzureAttributes | None = None

Attributes related to clusters running on Microsoft Azure. If not specified at cluster creation, a set of default values will be used.

clone_from: CloneCluster | None = None

When specified, this clones libraries from a source cluster during the creation of a new cluster.

cluster_log_conf: ClusterLogConf | None = None

The configuration for delivering spark logs to a long-term storage destination. Two kinds of destinations (dbfs and s3) are supported. Only one destination can be specified for one cluster. If the conf is given, the logs will be delivered to the destination every 5 mins. The destination of driver logs is $destination/$clusterId/driver, while the destination of executor logs is $destination/$clusterId/executor.

cluster_name: str | None = None

Cluster name requested by the user. This doesn’t have to be unique. If not specified at creation, the cluster name will be an empty string.

cluster_source: ClusterSource | None = None

Determines whether the cluster was created by a user through the UI, created by the Databricks Jobs Scheduler, or through an API request. This is the same as cluster_creator, but read only.

custom_tags: Dict[str, str] | None = None

Additional tags for cluster resources. Databricks will tag all cluster resources (e.g., AWS instances and EBS volumes) with these tags in addition to default_tags. Notes:

  • Currently, Databricks allows at most 45 custom tags

  • Clusters can only reuse cloud resources if the resources’ tags are a subset of the cluster tags

data_security_mode: DataSecurityMode | None = None

Data security mode decides what data governance model to use when accessing data from a cluster.

  • NONE: No security isolation for multiple users sharing the cluster. Data governance features are not available in this mode.

  • SINGLE_USER: A secure cluster that can only be exclusively used by a single user specified in single_user_name. Most programming languages, cluster features and data governance features are available in this mode.

  • USER_ISOLATION: A secure cluster that can be shared by multiple users. Cluster users are fully isolated so that they cannot see each other’s data and credentials. Most data governance features are supported in this mode. But programming languages and cluster features might be limited.

  • LEGACY_TABLE_ACL: This mode is for users migrating from legacy Table ACL clusters.

  • LEGACY_PASSTHROUGH: This mode is for users migrating from legacy Passthrough on high concurrency clusters.

  • LEGACY_SINGLE_USER: This mode is for users migrating from legacy Passthrough on standard clusters.

docker_image: DockerImage | None = None
driver_instance_pool_id: str | None = None

The optional ID of the instance pool to which the cluster driver belongs. The cluster uses the instance pool with id (instance_pool_id) if the driver pool is not assigned.

driver_node_type_id: str | None = None

The node type of the Spark driver. Note that this field is optional; if unset, the driver node type will be set as the same value as node_type_id defined above.

enable_elastic_disk: bool | None = None

Autoscaling Local Storage: when enabled, this cluster will dynamically acquire additional disk space when its Spark workers are running low on disk space. This feature requires specific AWS permissions to function correctly - refer to the User Guide for more details.

enable_local_disk_encryption: bool | None = None

Whether to enable LUKS on cluster VMs’ local disks

gcp_attributes: GcpAttributes | None = None

Attributes related to clusters running on Google Cloud Platform. If not specified at cluster creation, a set of default values will be used.

init_scripts: List[InitScriptInfo] | None = None

The configuration for storing init scripts. Any number of destinations can be specified. The scripts are executed sequentially in the order provided. If cluster_log_conf is specified, init script logs are sent to <destination>/<cluster-ID>/init_scripts.

instance_pool_id: str | None = None

The optional ID of the instance pool to which the cluster belongs.

node_type_id: str | None = None

This field encodes, through a single value, the resources available to each of the Spark nodes in this cluster. For example, the Spark nodes can be provisioned and optimized for memory or compute intensive workloads. A list of available node types can be retrieved by using the :method:clusters/listNodeTypes API call.

num_workers: int | None = None

Number of worker nodes that this cluster should have. A cluster has one Spark Driver and num_workers Executors for a total of num_workers + 1 Spark nodes.

Note: When reading the properties of a cluster, this field reflects the desired number of workers rather than the actual current number of workers. For instance, if a cluster is resized from 5 to 10 workers, this field will immediately be updated to reflect the target size of 10 workers, whereas the workers listed in spark_info will gradually increase from 5 to 10 as the new nodes are provisioned.

policy_id: str | None = None

The ID of the cluster policy used to create the cluster if applicable.

runtime_engine: RuntimeEngine | None = None

Determines which runtime engine to use, e.g. Standard vs. Photon. If unspecified, the runtime engine is inferred from spark_version.

single_user_name: str | None = None

Single user name if data_security_mode is SINGLE_USER

spark_conf: Dict[str, str] | None = None

An object containing a set of optional, user-specified Spark configuration key-value pairs. Users can also pass in a string of extra JVM options to the driver and the executors via spark.driver.extraJavaOptions and spark.executor.extraJavaOptions respectively.

spark_env_vars: Dict[str, str] | None = None

An object containing a set of optional, user-specified environment variable key-value pairs. Please note that a key-value pair of the form (X,Y) will be exported as is (i.e., export X=’Y’) while launching the driver and workers.

In order to specify an additional set of SPARK_DAEMON_JAVA_OPTS, we recommend appending them to $SPARK_DAEMON_JAVA_OPTS as shown in the example below. This ensures that all default Databricks-managed environment variables are included as well.

Example Spark environment variables: {“SPARK_WORKER_MEMORY”: “28000m”, “SPARK_LOCAL_DIRS”: “/local_disk0”} or {“SPARK_DAEMON_JAVA_OPTS”: “$SPARK_DAEMON_JAVA_OPTS -Dspark.shuffle.service.enabled=true”}

ssh_public_keys: List[str] | None = None

SSH public key contents that will be added to each Spark node in this cluster. The corresponding private keys can be used to login with the user name ubuntu on port 2200. Up to 10 keys can be specified.

workload_type: WorkloadType | None = None
as_dict() dict

Serializes the CreateCluster into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) CreateCluster

Deserializes the CreateCluster from a dictionary.
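
As a quick illustration of how these request dataclasses are typically assembled, the sketch below builds a CreateCluster payload and round-trips it through as_dict() and from_dict(). The Spark version, node type, and other values are placeholders, not recommendations; in the workspace client the same fields are usually passed as keyword arguments to w.clusters.create.

    from databricks.sdk.service import compute

    # Build a cluster-creation request; field values are illustrative placeholders.
    spec = compute.CreateCluster(
        cluster_name="example-cluster",
        spark_version="13.3.x-scala2.12",   # pick a real value via :method:clusters/sparkVersions
        node_type_id="i3.xlarge",           # pick a real value via :method:clusters/listNodeTypes
        autoscale=compute.AutoScale(min_workers=1, max_workers=4),
        autotermination_minutes=60,
        spark_conf={"spark.speculation": "true"},
        custom_tags={"team": "data-eng"},
    )

    body = spec.as_dict()                              # JSON-ready request body
    restored = compute.CreateCluster.from_dict(body)   # reconstructs the dataclass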

class databricks.sdk.service.compute.CreateClusterResponse
cluster_id: str | None = None
as_dict() dict

Serializes the CreateClusterResponse into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) CreateClusterResponse

Deserializes the CreateClusterResponse from a dictionary.

class databricks.sdk.service.compute.CreateContext
cluster_id: str | None = None

Running cluster id

language: Language | None = None
as_dict() dict

Serializes the CreateContext into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) CreateContext

Deserializes the CreateContext from a dictionary.
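
For context, a minimal sketch of creating a Python execution context on a running cluster through the workspace client's command execution API (assuming the create_and_wait waiter is available in your SDK version; the cluster ID is a placeholder):

    from databricks.sdk import WorkspaceClient
    from databricks.sdk.service import compute

    w = WorkspaceClient()

    # Create an execution context for Python commands on an already-running cluster.
    ctx = w.command_execution.create_and_wait(
        cluster_id="0123-456789-abcdefgh",   # placeholder cluster ID
        language=compute.Language.PYTHON,
    )
    print(ctx.id)                            # context ID to use for subsequent commands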

class databricks.sdk.service.compute.CreateInstancePool
instance_pool_name: str

Pool name requested by the user. Pool name must be unique. Length must be between 1 and 100 characters.

node_type_id: str

This field encodes, through a single value, the resources available to each of the Spark nodes in this cluster. For example, the Spark nodes can be provisioned and optimized for memory or compute intensive workloads. A list of available node types can be retrieved by using the :method:clusters/listNodeTypes API call.

aws_attributes: InstancePoolAwsAttributes | None = None

Attributes related to instance pools running on Amazon Web Services. If not specified at pool creation, a set of default values will be used.

azure_attributes: InstancePoolAzureAttributes | None = None

Attributes related to instance pools running on Azure. If not specified at pool creation, a set of default values will be used.

custom_tags: Dict[str, str] | None = None

Additional tags for pool resources. Databricks will tag all pool resources (e.g., AWS instances and EBS volumes) with these tags in addition to default_tags. Notes:

  • Currently, Databricks allows at most 45 custom tags

disk_spec: DiskSpec | None = None

Defines the specification of the disks that will be attached to all spark containers.

enable_elastic_disk: bool | None = None

Autoscaling Local Storage: when enabled, the instances in this pool will dynamically acquire additional disk space when their Spark workers are running low on disk space. In AWS, this feature requires specific AWS permissions to function correctly - refer to the User Guide for more details.

gcp_attributes: InstancePoolGcpAttributes | None = None

Attributes related to instance pools running on Google Cloud Platform. If not specified at pool creation, a set of default values will be used.

idle_instance_autotermination_minutes: int | None = None

Automatically terminates the extra instances in the pool cache after they are inactive for this time in minutes if the min_idle_instances requirement is already met. If not set, the extra pool instances will be automatically terminated after a default timeout. If specified, the threshold must be between 0 and 10000 minutes. Users can also set this value to 0 to instantly remove idle instances from the cache, as long as the minimum cache size is still satisfied.

max_capacity: int | None = None

Maximum number of outstanding instances to keep in the pool, including both instances used by clusters and idle instances. Clusters that require further instance provisioning will fail during upsize requests.

min_idle_instances: int | None = None

Minimum number of idle instances to keep in the instance pool

preloaded_docker_images: List[DockerImage] | None = None

Custom Docker Image BYOC

preloaded_spark_versions: List[str] | None = None

A list containing at most one preloaded Spark image version for the pool. Pool-backed clusters started with the preloaded Spark version will start faster. A list of available Spark versions can be retrieved by using the :method:clusters/sparkVersions API call.

as_dict() dict

Serializes the CreateInstancePool into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) CreateInstancePool

Deserializes the CreateInstancePool from a dictionary.
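
A minimal sketch of an instance pool request built from this dataclass; the pool name and node type are placeholders, and the same fields can be passed as keyword arguments to the workspace client's instance_pools.create method.

    from databricks.sdk.service import compute

    pool_req = compute.CreateInstancePool(
        instance_pool_name="shared-i3-pool",   # must be unique, 1-100 characters
        node_type_id="i3.xlarge",              # placeholder node type
        min_idle_instances=1,
        max_capacity=10,
        idle_instance_autotermination_minutes=30,
        custom_tags={"team": "data-eng"},
    )

    body = pool_req.as_dict()   # JSON request body for the Instance Pools API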

class databricks.sdk.service.compute.CreateInstancePoolResponse
instance_pool_id: str | None = None

The ID of the created instance pool.

as_dict() dict

Serializes the CreateInstancePoolResponse into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) CreateInstancePoolResponse

Deserializes the CreateInstancePoolResponse from a dictionary.

class databricks.sdk.service.compute.CreatePolicy
name: str

Cluster Policy name requested by the user. This has to be unique. Length must be between 1 and 100 characters.

definition: str | None = None

Policy definition document expressed in [Databricks Cluster Policy Definition Language].

[Databricks Cluster Policy Definition Language]: https://docs.databricks.com/administration-guide/clusters/policy-definition.html

description: str | None = None

Additional human-readable description of the cluster policy.

libraries: List[Library] | None = None

A list of libraries to be installed on the next cluster restart that uses this policy. The maximum number of libraries is 500.

max_clusters_per_user: int | None = None

Max number of clusters per user that can be active using this policy. If not present, there is no max limit.

policy_family_definition_overrides: str | None = None

Policy definition JSON document expressed in [Databricks Policy Definition Language]. The JSON document must be passed as a string and cannot be embedded in the requests.

You can use this to customize the policy definition inherited from the policy family. Policy rules specified here are merged into the inherited policy definition.

[Databricks Policy Definition Language]: https://docs.databricks.com/administration-guide/clusters/policy-definition.html

policy_family_id: str | None = None

ID of the policy family. The cluster policy’s policy definition inherits the policy family’s policy definition.

Cannot be used with definition. Use policy_family_definition_overrides instead to customize the policy definition.

as_dict() dict

Serializes the CreatePolicy into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) CreatePolicy

Deserializes the CreatePolicy from a dictionary.
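
Because definition must be a JSON document passed as a string, a common pattern is to serialize a small dict with json.dumps. The sketch below is illustrative only; the rule keys and types follow the Cluster Policy Definition Language linked above.

    import json

    from databricks.sdk.service import compute

    # Illustrative policy rules; see the policy definition language docs for the full syntax.
    definition = json.dumps({
        "autotermination_minutes": {"type": "fixed", "value": 60, "hidden": True},
        "node_type_id": {"type": "allowlist", "values": ["i3.xlarge", "i3.2xlarge"]},
    })

    policy_req = compute.CreatePolicy(
        name="cost-guardrails",
        definition=definition,        # JSON document as a string, not a nested object
        max_clusters_per_user=2,
    )

    body = policy_req.as_dict()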

class databricks.sdk.service.compute.CreatePolicyResponse
policy_id: str | None = None

Canonical unique identifier for the cluster policy.

as_dict() dict

Serializes the CreatePolicyResponse into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) CreatePolicyResponse

Deserializes the CreatePolicyResponse from a dictionary.

class databricks.sdk.service.compute.CreateResponse
script_id: str | None = None

The global init script ID.

as_dict() dict

Serializes the CreateResponse into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) CreateResponse

Deserializes the CreateResponse from a dictionary.

class databricks.sdk.service.compute.Created
id: str | None = None
as_dict() dict

Serializes the Created into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) Created

Deserializes the Created from a dictionary.

class databricks.sdk.service.compute.DataPlaneEventDetails
event_type: DataPlaneEventDetailsEventType | None = None

<needs content added>

executor_failures: int | None = None

<needs content added>

host_id: str | None = None

<needs content added>

timestamp: int | None = None

<needs content added>

as_dict() dict

Serializes the DataPlaneEventDetails into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) DataPlaneEventDetails

Deserializes the DataPlaneEventDetails from a dictionary.

class databricks.sdk.service.compute.DataPlaneEventDetailsEventType

<needs content added>

NODE_BLACKLISTED = "NODE_BLACKLISTED"
NODE_EXCLUDED_DECOMMISSIONED = "NODE_EXCLUDED_DECOMMISSIONED"
class databricks.sdk.service.compute.DataSecurityMode

Data security mode decides what data governance model to use when accessing data from a cluster.

  • NONE: No security isolation for multiple users sharing the cluster. Data governance features are not available in this mode.

  • SINGLE_USER: A secure cluster that can only be exclusively used by a single user specified in single_user_name. Most programming languages, cluster features and data governance features are available in this mode.

  • USER_ISOLATION: A secure cluster that can be shared by multiple users. Cluster users are fully isolated so that they cannot see each other’s data and credentials. Most data governance features are supported in this mode. But programming languages and cluster features might be limited.

  • LEGACY_TABLE_ACL: This mode is for users migrating from legacy Table ACL clusters.

  • LEGACY_PASSTHROUGH: This mode is for users migrating from legacy Passthrough on high concurrency clusters.

  • LEGACY_SINGLE_USER: This mode is for users migrating from legacy Passthrough on standard clusters.

LEGACY_PASSTHROUGH = "LEGACY_PASSTHROUGH"
LEGACY_SINGLE_USER = "LEGACY_SINGLE_USER"
LEGACY_TABLE_ACL = "LEGACY_TABLE_ACL"
NONE = "NONE"
SINGLE_USER = "SINGLE_USER"
USER_ISOLATION = "USER_ISOLATION"
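
For example, SINGLE_USER mode is paired with single_user_name when building a cluster request (the values below are placeholders):

    from databricks.sdk.service import compute

    spec = compute.CreateCluster(
        spark_version="13.3.x-scala2.12",                        # placeholder
        node_type_id="i3.xlarge",                                # placeholder
        num_workers=2,
        data_security_mode=compute.DataSecurityMode.SINGLE_USER,
        single_user_name="someone@example.com",                  # the one user allowed on the cluster
    )
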
class databricks.sdk.service.compute.DbfsStorageInfo
destination: str

dbfs destination, e.g. dbfs:/my/path

as_dict() dict

Serializes the DbfsStorageInfo into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) DbfsStorageInfo

Deserializes the DbfsStorageInfo from a dictionary.

class databricks.sdk.service.compute.DeleteCluster
cluster_id: str

The cluster to be terminated.

as_dict() dict

Serializes the DeleteCluster into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) DeleteCluster

Deserializes the DeleteCluster from a dictionary.

class databricks.sdk.service.compute.DeleteClusterResponse
as_dict() dict

Serializes the DeleteClusterResponse into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) DeleteClusterResponse

Deserializes the DeleteClusterResponse from a dictionary.

class databricks.sdk.service.compute.DeleteInstancePool
instance_pool_id: str

The instance pool to be terminated.

as_dict() dict

Serializes the DeleteInstancePool into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) DeleteInstancePool

Deserializes the DeleteInstancePool from a dictionary.

class databricks.sdk.service.compute.DeleteInstancePoolResponse
as_dict() dict

Serializes the DeleteInstancePoolResponse into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) DeleteInstancePoolResponse

Deserializes the DeleteInstancePoolResponse from a dictionary.

class databricks.sdk.service.compute.DeletePolicy
policy_id: str

The ID of the policy to delete.

as_dict() dict

Serializes the DeletePolicy into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) DeletePolicy

Deserializes the DeletePolicy from a dictionary.

class databricks.sdk.service.compute.DeletePolicyResponse
as_dict() dict

Serializes the DeletePolicyResponse into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) DeletePolicyResponse

Deserializes the DeletePolicyResponse from a dictionary.

class databricks.sdk.service.compute.DeleteResponse
as_dict() dict

Serializes the DeleteResponse into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) DeleteResponse

Deserializes the DeleteResponse from a dictionary.

class databricks.sdk.service.compute.DestroyContext
cluster_id: str
context_id: str
as_dict() dict

Serializes the DestroyContext into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) DestroyContext

Deserializes the DestroyContext from a dictionary.

class databricks.sdk.service.compute.DestroyResponse
as_dict() dict

Serializes the DestroyResponse into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) DestroyResponse

Deserializes the DestroyResponse from a dictionary.

class databricks.sdk.service.compute.DiskSpec
disk_count: int | None = None

The number of disks launched for each instance:

  • This feature is only enabled for supported node types.

  • Users can choose up to the limit of the disks supported by the node type.

  • For node types with no OS disk, at least one disk must be specified; otherwise, cluster creation will fail.

If disks are attached, Databricks will configure Spark to use only the disks for scratch storage, because heterogeneously sized scratch devices can lead to inefficient disk utilization. If no disks are attached, Databricks will configure Spark to use instance store disks.

Note: If disks are specified, then the Spark configuration spark.local.dir will be overridden.

Disks will be mounted at:

  • For AWS: /ebs0, /ebs1, etc.

  • For Azure: /remote_volume0, /remote_volume1, etc.

disk_iops: int | None = None
disk_size: int | None = None

The size of each disk (in GiB) launched for each instance. Values must fall into the supported range for a particular instance type.

For AWS:

  • General Purpose SSD: 100 - 4096 GiB

  • Throughput Optimized HDD: 500 - 4096 GiB

For Azure:

  • Premium LRS (SSD): 1 - 1023 GiB

  • Standard LRS (HDD): 1 - 1023 GiB

disk_throughput: int | None = None
disk_type: DiskType | None = None

The type of disks that will be launched with this cluster.

as_dict() dict

Serializes the DiskSpec into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) DiskSpec

Deserializes the DiskSpec from a dictionary.

class databricks.sdk.service.compute.DiskType
azure_disk_volume_type: DiskTypeAzureDiskVolumeType | None = None
ebs_volume_type: DiskTypeEbsVolumeType | None = None
as_dict() dict

Serializes the DiskType into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) DiskType

Deserializes the DiskType from a dictionary.

class databricks.sdk.service.compute.DiskTypeAzureDiskVolumeType
PREMIUM_LRS = "PREMIUM_LRS"
STANDARD_LRS = "STANDARD_LRS"
class databricks.sdk.service.compute.DiskTypeEbsVolumeType
GENERAL_PURPOSE_SSD = "GENERAL_PURPOSE_SSD"
THROUGHPUT_OPTIMIZED_HDD = "THROUGHPUT_OPTIMIZED_HDD"
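
A short sketch tying DiskSpec, DiskType, and the volume-type enums together, e.g. for an instance pool's disk_spec field (the values are placeholders and must fall within the ranges listed above):

    from databricks.sdk.service import compute

    disk_spec = compute.DiskSpec(
        disk_count=1,
        disk_size=100,   # GiB; within the General Purpose SSD range on AWS
        disk_type=compute.DiskType(
            ebs_volume_type=compute.DiskTypeEbsVolumeType.GENERAL_PURPOSE_SSD,
        ),
    )
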
class databricks.sdk.service.compute.DockerBasicAuth
password: str | None = None

Password of the user

username: str | None = None

Name of the user

as_dict() dict

Serializes the DockerBasicAuth into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) DockerBasicAuth

Deserializes the DockerBasicAuth from a dictionary.

class databricks.sdk.service.compute.DockerImage
basic_auth: DockerBasicAuth | None = None
url: str | None = None

URL of the docker image.

as_dict() dict

Serializes the DockerImage into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) DockerImage

Deserializes the DockerImage from a dictionary.

class databricks.sdk.service.compute.EbsVolumeType

The type of EBS volumes that will be launched with this cluster.

GENERAL_PURPOSE_SSD = "GENERAL_PURPOSE_SSD"
THROUGHPUT_OPTIMIZED_HDD = "THROUGHPUT_OPTIMIZED_HDD"
class databricks.sdk.service.compute.EditCluster
cluster_id: str

ID of the cluster

spark_version: str

The Spark version of the cluster, e.g. 3.3.x-scala2.11. A list of available Spark versions can be retrieved by using the :method:clusters/sparkVersions API call.

apply_policy_default_values: bool | None = None
autoscale: AutoScale | None = None

Parameters needed in order to automatically scale clusters up and down based on load. Note: autoscaling works best with DB runtime versions 3.0 or later.

autotermination_minutes: int | None = None

Automatically terminates the cluster after it is inactive for this time in minutes. If not set, this cluster will not be automatically terminated. If specified, the threshold must be between 10 and 10000 minutes. Users can also set this value to 0 to explicitly disable automatic termination.

aws_attributes: AwsAttributes | None = None

Attributes related to clusters running on Amazon Web Services. If not specified at cluster creation, a set of default values will be used.

azure_attributes: AzureAttributes | None = None

Attributes related to clusters running on Microsoft Azure. If not specified at cluster creation, a set of default values will be used.

clone_from: CloneCluster | None = None

When specified, this clones libraries from a source cluster during the creation of a new cluster.

cluster_log_conf: ClusterLogConf | None = None

The configuration for delivering spark logs to a long-term storage destination. Two kinds of destinations (dbfs and s3) are supported. Only one destination can be specified for one cluster. If the conf is given, the logs will be delivered to the destination every 5 mins. The destination of driver logs is $destination/$clusterId/driver, while the destination of executor logs is $destination/$clusterId/executor.

cluster_name: str | None = None

Cluster name requested by the user. This doesn’t have to be unique. If not specified at creation, the cluster name will be an empty string.

cluster_source: ClusterSource | None = None

Determines whether the cluster was created by a user through the UI, created by the Databricks Jobs Scheduler, or through an API request. This is the same as cluster_creator, but read only.

custom_tags: Dict[str, str] | None = None

Additional tags for cluster resources. Databricks will tag all cluster resources (e.g., AWS instances and EBS volumes) with these tags in addition to default_tags. Notes:

  • Currently, Databricks allows at most 45 custom tags

  • Clusters can only reuse cloud resources if the resources’ tags are a subset of the cluster tags

data_security_mode: DataSecurityMode | None = None

Data security mode decides what data governance model to use when accessing data from a cluster.

  • NONE: No security isolation for multiple users sharing the cluster. Data governance features are not available in this mode.

  • SINGLE_USER: A secure cluster that can only be exclusively used by a single user specified in single_user_name. Most programming languages, cluster features and data governance features are available in this mode.

  • USER_ISOLATION: A secure cluster that can be shared by multiple users. Cluster users are fully isolated so that they cannot see each other’s data and credentials. Most data governance features are supported in this mode. But programming languages and cluster features might be limited.

  • LEGACY_TABLE_ACL: This mode is for users migrating from legacy Table ACL clusters.

  • LEGACY_PASSTHROUGH: This mode is for users migrating from legacy Passthrough on high concurrency clusters.

  • LEGACY_SINGLE_USER: This mode is for users migrating from legacy Passthrough on standard clusters.

docker_image: DockerImage | None = None
driver_instance_pool_id: str | None = None

The optional ID of the instance pool to which the cluster’s driver belongs. If a driver pool is not assigned, the driver uses the instance pool specified by instance_pool_id.

driver_node_type_id: str | None = None

The node type of the Spark driver. Note that this field is optional; if unset, the driver node type will be set as the same value as node_type_id defined above.

enable_elastic_disk: bool | None = None

Autoscaling Local Storage: when enabled, this cluster will dynamically acquire additional disk space when its Spark workers are running low on disk space. This feature requires specific AWS permissions to function correctly - refer to the User Guide for more details.

enable_local_disk_encryption: bool | None = None

Whether to enable LUKS on cluster VMs’ local disks

gcp_attributes: GcpAttributes | None = None

Attributes related to clusters running on Google Cloud Platform. If not specified at cluster creation, a set of default values will be used.

init_scripts: List[InitScriptInfo] | None = None

The configuration for storing init scripts. Any number of destinations can be specified. The scripts are executed sequentially in the order provided. If cluster_log_conf is specified, init script logs are sent to <destination>/<cluster-ID>/init_scripts.

instance_pool_id: str | None = None

The optional ID of the instance pool to which the cluster belongs.

node_type_id: str | None = None

This field encodes, through a single value, the resources available to each of the Spark nodes in this cluster. For example, the Spark nodes can be provisioned and optimized for memory or compute intensive workloads. A list of available node types can be retrieved by using the :method:clusters/listNodeTypes API call.

num_workers: int | None = None

Number of worker nodes that this cluster should have. A cluster has one Spark Driver and num_workers Executors for a total of num_workers + 1 Spark nodes.

Note: When reading the properties of a cluster, this field reflects the desired number of workers rather than the actual current number of workers. For instance, if a cluster is resized from 5 to 10 workers, this field will immediately be updated to reflect the target size of 10 workers, whereas the workers listed in spark_info will gradually increase from 5 to 10 as the new nodes are provisioned.

policy_id: str | None = None

The ID of the cluster policy used to create the cluster if applicable.

runtime_engine: RuntimeEngine | None = None

Determines which runtime engine to use, e.g. Standard vs. Photon. If unspecified, the runtime engine is inferred from spark_version.

single_user_name: str | None = None

Single user name if data_security_mode is SINGLE_USER

spark_conf: Dict[str, str] | None = None

An object containing a set of optional, user-specified Spark configuration key-value pairs. Users can also pass in a string of extra JVM options to the driver and the executors via spark.driver.extraJavaOptions and spark.executor.extraJavaOptions respectively.

spark_env_vars: Dict[str, str] | None = None

An object containing a set of optional, user-specified environment variable key-value pairs. Please note that a key-value pair of the form (X, Y) will be exported as is (i.e., export X=’Y’) while launching the driver and workers.

In order to specify an additional set of SPARK_DAEMON_JAVA_OPTS, we recommend appending them to $SPARK_DAEMON_JAVA_OPTS as shown in the example below. This ensures that all default Databricks-managed environment variables are included as well.

Example Spark environment variables: {“SPARK_WORKER_MEMORY”: “28000m”, “SPARK_LOCAL_DIRS”: “/local_disk0”} or {“SPARK_DAEMON_JAVA_OPTS”: “$SPARK_DAEMON_JAVA_OPTS -Dspark.shuffle.service.enabled=true”}

ssh_public_keys: List[str] | None = None

SSH public key contents that will be added to each Spark node in this cluster. The corresponding private keys can be used to login with the user name ubuntu on port 2200. Up to 10 keys can be specified.

workload_type: WorkloadType | None = None
as_dict() dict

Serializes the EditCluster into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) EditCluster

Deserializes the EditCluster from a dictionary.
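
A hedged sketch of editing an existing cluster through the workspace client (assuming the edit_and_wait waiter is available in your SDK version). The required cluster_id and spark_version fields must always be supplied; the other values are placeholders.

    from databricks.sdk import WorkspaceClient
    from databricks.sdk.service import compute

    w = WorkspaceClient()

    # Resize an existing cluster and adjust its auto-termination timeout.
    w.clusters.edit_and_wait(
        cluster_id="0123-456789-abcdefgh",   # placeholder cluster ID
        spark_version="13.3.x-scala2.12",    # placeholder Spark version
        node_type_id="i3.xlarge",
        autoscale=compute.AutoScale(min_workers=1, max_workers=8),
        autotermination_minutes=120,
    )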

class databricks.sdk.service.compute.EditClusterResponse
as_dict() dict

Serializes the EditClusterResponse into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) EditClusterResponse

Deserializes the EditClusterResponse from a dictionary.

class databricks.sdk.service.compute.EditInstancePool
instance_pool_id: str

Instance pool ID

instance_pool_name: str

Pool name requested by the user. Pool name must be unique. Length must be between 1 and 100 characters.

node_type_id: str

This field encodes, through a single value, the resources available to each of the Spark nodes in this cluster. For example, the Spark nodes can be provisioned and optimized for memory or compute intensive workloads. A list of available node types can be retrieved by using the :method:clusters/listNodeTypes API call.

custom_tags: Dict[str, str] | None = None

Additional tags for pool resources. Databricks will tag all pool resources (e.g., AWS instances and EBS volumes) with these tags in addition to default_tags. Notes:

  • Currently, Databricks allows at most 45 custom tags

idle_instance_autotermination_minutes: int | None = None

Automatically terminates the extra instances in the pool cache after they are inactive for this time in minutes if the min_idle_instances requirement is already met. If not set, the extra pool instances will be automatically terminated after a default timeout. If specified, the threshold must be between 0 and 10000 minutes. Users can also set this value to 0 to instantly remove idle instances from the cache, as long as the minimum cache size is still satisfied.

max_capacity: int | None = None

Maximum number of outstanding instances to keep in the pool, including both instances used by clusters and idle instances. Clusters that require further instance provisioning will fail during upsize requests.

min_idle_instances: int | None = None

Minimum number of idle instances to keep in the instance pool

as_dict() dict

Serializes the EditInstancePool into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) EditInstancePool

Deserializes the EditInstancePool from a dictionary.

class databricks.sdk.service.compute.EditInstancePoolResponse
as_dict() dict

Serializes the EditInstancePoolResponse into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) EditInstancePoolResponse

Deserializes the EditInstancePoolResponse from a dictionary.

class databricks.sdk.service.compute.EditPolicy
policy_id: str

The ID of the policy to update.

name: str

Cluster Policy name requested by the user. This has to be unique. Length must be between 1 and 100 characters.

definition: str | None = None

Policy definition document expressed in [Databricks Cluster Policy Definition Language].

[Databricks Cluster Policy Definition Language]: https://docs.databricks.com/administration-guide/clusters/policy-definition.html

description: str | None = None

Additional human-readable description of the cluster policy.

libraries: List[Library] | None = None

A list of libraries to be installed on the next cluster restart that uses this policy. The maximum number of libraries is 500.

max_clusters_per_user: int | None = None

Max number of clusters per user that can be active using this policy. If not present, there is no max limit.

policy_family_definition_overrides: str | None = None

Policy definition JSON document expressed in [Databricks Policy Definition Language]. The JSON document must be passed as a string and cannot be embedded in the requests.

You can use this to customize the policy definition inherited from the policy family. Policy rules specified here are merged into the inherited policy definition.

[Databricks Policy Definition Language]: https://docs.databricks.com/administration-guide/clusters/policy-definition.html

policy_family_id: str | None = None

ID of the policy family. The cluster policy’s policy definition inherits the policy family’s policy definition.

Cannot be used with definition. Use policy_family_definition_overrides instead to customize the policy definition.

as_dict() dict

Serializes the EditPolicy into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) EditPolicy

Deserializes the EditPolicy from a dictionary.

class databricks.sdk.service.compute.EditPolicyResponse
as_dict() dict

Serializes the EditPolicyResponse into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) EditPolicyResponse

Deserializes the EditPolicyResponse from a dictionary.

class databricks.sdk.service.compute.EditResponse
as_dict() dict

Serializes the EditResponse into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) EditResponse

Deserializes the EditResponse from a dictionary.

class databricks.sdk.service.compute.Environment

The environment entity used to preserve the serverless environment side panel and the jobs’ environment for non-notebook tasks. In this minimal environment spec, only pip dependencies are supported.

client: str

Client version used by the environment. The client is the user-facing environment of the runtime. Each client comes with a specific set of pre-installed libraries. The version is a string, consisting of the major client version.

dependencies: List[str] | None = None

List of pip dependencies, as supported by the version of pip in this environment. Each dependency is a pip requirement file line (see https://pip.pypa.io/en/stable/reference/requirements-file-format/). Allowed dependencies include a <requirement specifier>, an <archive url/path>, a <local project path> (WSFS or Volumes in Databricks), or a <vcs project url>. E.g. dependencies: [“foo==0.0.1”, “-r /Workspace/test/requirements.txt”]

as_dict() dict

Serializes the Environment into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) Environment

Deserializes the Environment from a dictionary.
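
A minimal sketch of an Environment spec using the dependency formats listed above (the client version string is illustrative):

    from databricks.sdk.service import compute

    env = compute.Environment(
        client="1",                                  # major client version, as a string
        dependencies=[
            "foo==0.0.1",                            # requirement specifier
            "-r /Workspace/test/requirements.txt",   # requirements file in the workspace
        ],
    )

    body = env.as_dict()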

class databricks.sdk.service.compute.EventDetails
attributes: ClusterAttributes | None = None
  • For created clusters, the attributes of the cluster.

  • For edited clusters, the new attributes of the cluster.

cause: EventDetailsCause | None = None

The cause of a change in target size.

cluster_size: ClusterSize | None = None

The actual cluster size that was set in the cluster creation or edit.

current_num_vcpus: int | None = None

The current number of vCPUs in the cluster.

current_num_workers: int | None = None

The current number of nodes in the cluster.

did_not_expand_reason: str | None = None

<needs content added>

disk_size: int | None = None

Current disk size in bytes

driver_state_message: str | None = None

More details about the change in driver’s state

enable_termination_for_node_blocklisted: bool | None = None

Whether or not a blocklisted node should be terminated. For ClusterEventType NODE_BLACKLISTED.

free_space: int | None = None

<needs content added>

init_scripts: InitScriptEventDetails | None = None

List of global and cluster init scripts associated with this cluster event.

instance_id: str | None = None

Instance Id where the event originated from

job_run_name: str | None = None

Unique identifier of the specific job run associated with this cluster event. For clusters created for jobs, this will be the same as the cluster name.

previous_attributes: ClusterAttributes | None = None

The cluster attributes before a cluster was edited.

previous_cluster_size: ClusterSize | None = None

The size of the cluster before an edit or resize.

previous_disk_size: int | None = None

Previous disk size in bytes

reason: TerminationReason | None = None

A termination reason:

  • On a TERMINATED event, this is the reason of the termination.

  • On a RESIZE_COMPLETE event, this indicates the reason that we failed to acquire some nodes.

target_num_vcpus: int | None = None

The targeted number of vCPUs in the cluster.

target_num_workers: int | None = None

The targeted number of nodes in the cluster.

user: str | None = None

The user that caused the event to occur. (Empty if it was done by the control plane.)

as_dict() dict

Serializes the EventDetails into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) EventDetails

Deserializes the EventDetails from a dictionary.

class databricks.sdk.service.compute.EventDetailsCause

The cause of a change in target size.

AUTORECOVERY = "AUTORECOVERY"
AUTOSCALE = "AUTOSCALE"
REPLACE_BAD_NODES = "REPLACE_BAD_NODES"
USER_REQUEST = "USER_REQUEST"
class databricks.sdk.service.compute.EventType
AUTOSCALING_STATS_REPORT = "AUTOSCALING_STATS_REPORT"
CREATING = "CREATING"
DBFS_DOWN = "DBFS_DOWN"
DID_NOT_EXPAND_DISK = "DID_NOT_EXPAND_DISK"
DRIVER_HEALTHY = "DRIVER_HEALTHY"
DRIVER_NOT_RESPONDING = "DRIVER_NOT_RESPONDING"
DRIVER_UNAVAILABLE = "DRIVER_UNAVAILABLE"
EDITED = "EDITED"
EXPANDED_DISK = "EXPANDED_DISK"
FAILED_TO_EXPAND_DISK = "FAILED_TO_EXPAND_DISK"
INIT_SCRIPTS_FINISHED = "INIT_SCRIPTS_FINISHED"
INIT_SCRIPTS_STARTED = "INIT_SCRIPTS_STARTED"
METASTORE_DOWN = "METASTORE_DOWN"
NODES_LOST = "NODES_LOST"
NODE_BLACKLISTED = "NODE_BLACKLISTED"
NODE_EXCLUDED_DECOMMISSIONED = "NODE_EXCLUDED_DECOMMISSIONED"
PINNED = "PINNED"
RESIZING = "RESIZING"
RESTARTING = "RESTARTING"
RUNNING = "RUNNING"
SPARK_EXCEPTION = "SPARK_EXCEPTION"
STARTING = "STARTING"
TERMINATING = "TERMINATING"
UNPINNED = "UNPINNED"
UPSIZE_COMPLETED = "UPSIZE_COMPLETED"
class databricks.sdk.service.compute.GcpAttributes
availability: GcpAvailability | None = None

This field determines whether the instance pool will contain preemptible VMs, on-demand VMs, or preemptible VMs with a fallback to on-demand VMs if the former is unavailable.

boot_disk_size: int | None = None

boot disk size in GB

google_service_account: str | None = None

If provided, the cluster will impersonate the google service account when accessing gcloud services (like GCS). The google service account must have previously been added to the Databricks environment by an account administrator.

local_ssd_count: int | None = None

If provided, each node (workers and driver) in the cluster will have this number of local SSDs attached. Each local SSD is 375GB in size. Refer to [GCP documentation] for the supported number of local SSDs for each instance type.

[GCP documentation]: https://cloud.google.com/compute/docs/disks/local-ssd#choose_number_local_ssds

use_preemptible_executors: bool | None = None

This field determines whether the spark executors will be scheduled to run on preemptible VMs (when set to true) versus standard compute engine VMs (when set to false; default). Note: Soon to be deprecated, use the availability field instead.

zone_id: str | None = None

Identifier for the availability zone in which the cluster resides. This can be one of the following:

  • “HA” => High availability, spread nodes across availability zones for a Databricks deployment region [default]

  • “AUTO” => Databricks picks an availability zone to schedule the cluster on.

  • A GCP availability zone => Pick one of the available zones for (machine type + region) from https://cloud.google.com/compute/docs/regions-zones.

as_dict() dict

Serializes the GcpAttributes into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) GcpAttributes

Deserializes the GcpAttributes from a dictionary.

class databricks.sdk.service.compute.GcpAvailability

This field determines whether the instance pool will contain preemptible VMs, on-demand VMs, or preemptible VMs with a fallback to on-demand VMs if the former is unavailable.

ON_DEMAND_GCP = "ON_DEMAND_GCP"
PREEMPTIBLE_GCP = "PREEMPTIBLE_GCP"
PREEMPTIBLE_WITH_FALLBACK_GCP = "PREEMPTIBLE_WITH_FALLBACK_GCP"
class databricks.sdk.service.compute.GcsStorageInfo
destination: str

GCS destination/URI, e.g. gs://my-bucket/some-prefix

as_dict() dict

Serializes the GcsStorageInfo into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) GcsStorageInfo

Deserializes the GcsStorageInfo from a dictionary.

class databricks.sdk.service.compute.GetClusterPermissionLevelsResponse
permission_levels: List[ClusterPermissionsDescription] | None = None

Specific permission levels

as_dict() dict

Serializes the GetClusterPermissionLevelsResponse into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) GetClusterPermissionLevelsResponse

Deserializes the GetClusterPermissionLevelsResponse from a dictionary.

class databricks.sdk.service.compute.GetClusterPolicyPermissionLevelsResponse
permission_levels: List[ClusterPolicyPermissionsDescription] | None = None

Specific permission levels

as_dict() dict

Serializes the GetClusterPolicyPermissionLevelsResponse into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) GetClusterPolicyPermissionLevelsResponse

Deserializes the GetClusterPolicyPermissionLevelsResponse from a dictionary.

class databricks.sdk.service.compute.GetEvents
cluster_id: str

The ID of the cluster to retrieve events about.

end_time: int | None = None

The end time in epoch milliseconds. If empty, returns events up to the current time.

event_types: List[EventType] | None = None

An optional set of event types to filter on. If empty, all event types are returned.

limit: int | None = None

The maximum number of events to include in a page of events. Defaults to 50, and maximum allowed value is 500.

offset: int | None = None

The offset in the result set. Defaults to 0 (no offset). When an offset is specified and the results are requested in descending order, the end_time field is required.

order: GetEventsOrder | None = None

The order to list events in; either “ASC” or “DESC”. Defaults to “DESC”.

start_time: int | None = None

The start time in epoch milliseconds. If empty, returns events starting from the beginning of time.

as_dict() dict

Serializes the GetEvents into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) GetEvents

Deserializes the GetEvents from a dictionary.

class databricks.sdk.service.compute.GetEventsOrder

The order to list events in; either “ASC” or “DESC”. Defaults to “DESC”.

ASC = "ASC"
DESC = "DESC"
class databricks.sdk.service.compute.GetEventsResponse
events: List[ClusterEvent] | None = None

<content needs to be added>

next_page: GetEvents | None = None

The parameters required to retrieve the next page of events. Omitted if there are no more events to read.

total_count: int | None = None

The total number of events filtered by the start_time, end_time, and event_types.

as_dict() dict

Serializes the GetEventsResponse into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) GetEventsResponse

Deserializes the GetEventsResponse from a dictionary.
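
In practice the next_page bookkeeping is rarely handled by hand: the workspace client's clusters.events method accepts the same filters and yields ClusterEvent objects across pages. A hedged sketch (the cluster ID is a placeholder):

    from databricks.sdk import WorkspaceClient
    from databricks.sdk.service import compute

    w = WorkspaceClient()

    # Iterate over recent lifecycle events for one cluster, newest first.
    for event in w.clusters.events(
        cluster_id="0123-456789-abcdefgh",   # placeholder cluster ID
        event_types=[compute.EventType.RUNNING, compute.EventType.TERMINATING],
        order=compute.GetEventsOrder.DESC,
        limit=50,
    ):
        print(event.timestamp, event.type)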

class databricks.sdk.service.compute.GetInstancePool
instance_pool_id: str

Canonical unique identifier for the pool.

aws_attributes: InstancePoolAwsAttributes | None = None

Attributes related to instance pools running on Amazon Web Services. If not specified at pool creation, a set of default values will be used.

azure_attributes: InstancePoolAzureAttributes | None = None

Attributes related to instance pools running on Azure. If not specified at pool creation, a set of default values will be used.

custom_tags: Dict[str, str] | None = None

Additional tags for pool resources. Databricks will tag all pool resources (e.g., AWS instances and EBS volumes) with these tags in addition to default_tags. Notes:

  • Currently, Databricks allows at most 45 custom tags

default_tags: Dict[str, str] | None = None

Tags that are added by Databricks regardless of any custom_tags, including:

  • Vendor: Databricks

  • InstancePoolCreator: <user_id_of_creator>

  • InstancePoolName: <name_of_pool>

  • InstancePoolId: <id_of_pool>

disk_spec: DiskSpec | None = None

Defines the specification of the disks that will be attached to all spark containers.

enable_elastic_disk: bool | None = None

Autoscaling Local Storage: when enabled, the instances in this pool will dynamically acquire additional disk space when their Spark workers are running low on disk space. In AWS, this feature requires specific AWS permissions to function correctly - refer to the User Guide for more details.

gcp_attributes: InstancePoolGcpAttributes | None = None

Attributes related to instance pools running on Google Cloud Platform. If not specified at pool creation, a set of default values will be used.

idle_instance_autotermination_minutes: int | None = None

Automatically terminates the extra instances in the pool cache after they are inactive for this time in minutes if the min_idle_instances requirement is already met. If not set, the extra pool instances will be automatically terminated after a default timeout. If specified, the threshold must be between 0 and 10000 minutes. Users can also set this value to 0 to instantly remove idle instances from the cache, as long as the minimum cache size is still satisfied.

instance_pool_name: str | None = None

Pool name requested by the user. Pool name must be unique. Length must be between 1 and 100 characters.

max_capacity: int | None = None

Maximum number of outstanding instances to keep in the pool, including both instances used by clusters and idle instances. Clusters that require further instance provisioning will fail during upsize requests.

min_idle_instances: int | None = None

Minimum number of idle instances to keep in the instance pool

node_type_id: str | None = None

This field encodes, through a single value, the resources available to each of the Spark nodes in this cluster. For example, the Spark nodes can be provisioned and optimized for memory or compute intensive workloads. A list of available node types can be retrieved by using the :method:clusters/listNodeTypes API call.

preloaded_docker_images: List[DockerImage] | None = None

Custom Docker Image BYOC

preloaded_spark_versions: List[str] | None = None

A list containing at most one preloaded Spark image version for the pool. Pool-backed clusters started with the preloaded Spark version will start faster. A list of available Spark versions can be retrieved by using the :method:clusters/sparkVersions API call.

state: InstancePoolState | None = None

Current state of the instance pool.

stats: InstancePoolStats | None = None

Usage statistics about the instance pool.

status: InstancePoolStatus | None = None

Status of failed pending instances in the pool.

as_dict() dict

Serializes the GetInstancePool into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) GetInstancePool

Deserializes the GetInstancePool from a dictionary.

class databricks.sdk.service.compute.GetInstancePoolPermissionLevelsResponse
permission_levels: List[InstancePoolPermissionsDescription] | None = None

Specific permission levels

as_dict() dict

Serializes the GetInstancePoolPermissionLevelsResponse into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) GetInstancePoolPermissionLevelsResponse

Deserializes the GetInstancePoolPermissionLevelsResponse from a dictionary.

class databricks.sdk.service.compute.GetSparkVersionsResponse
versions: List[SparkVersion] | None = None

All the available Spark versions.

as_dict() dict

Serializes the GetSparkVersionsResponse into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) GetSparkVersionsResponse

Deserializes the GetSparkVersionsResponse from a dictionary.

class databricks.sdk.service.compute.GlobalInitScriptCreateRequest
name: str

The name of the script

script: str

The Base64-encoded content of the script.

enabled: bool | None = None

Specifies whether the script is enabled. The script runs only if enabled.

position: int | None = None

The position of a global init script, where 0 represents the first script to run, 1 is the second script to run, in ascending order.

If you omit the numeric position for a new global init script, it defaults to the last position. It will run after all current scripts. Setting any value greater than the position of the last script is equivalent to the last position. Example: Take three existing scripts with positions 0, 1, and 2. Any position of (3) or greater puts the script in the last position. If an explicit position value conflicts with an existing script value, your request succeeds, but the original script at that position and all later scripts have their positions incremented by 1.

as_dict() dict

Serializes the GlobalInitScriptCreateRequest into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) GlobalInitScriptCreateRequest

Deserializes the GlobalInitScriptCreateRequest from a dictionary.
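
Since script carries Base64-encoded content, the payload is usually built by encoding the shell script first. A hedged sketch using the workspace client's global_init_scripts API (the script body is illustrative):

    import base64

    from databricks.sdk import WorkspaceClient

    w = WorkspaceClient()

    script_text = "#!/bin/bash\necho 'hello from a global init script'\n"

    created = w.global_init_scripts.create(
        name="example-init",
        script=base64.b64encode(script_text.encode("utf-8")).decode("utf-8"),
        enabled=True,
        position=0,   # run before other global init scripts
    )
    print(created.script_id)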

class databricks.sdk.service.compute.GlobalInitScriptDetails
created_at: int | None = None

Time when the script was created, represented as a Unix timestamp in milliseconds.

created_by: str | None = None

The username of the user who created the script.

enabled: bool | None = None

Specifies whether the script is enabled. The script runs only if enabled.

name: str | None = None

The name of the script

position: int | None = None

The position of a script, where 0 represents the first script to run, 1 is the second script to run, in ascending order.

script_id: str | None = None

The global init script ID.

updated_at: int | None = None

Time when the script was updated, represented as a Unix timestamp in milliseconds.

updated_by: str | None = None

The username of the user who last updated the script

as_dict() dict

Serializes the GlobalInitScriptDetails into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) GlobalInitScriptDetails

Deserializes the GlobalInitScriptDetails from a dictionary.

class databricks.sdk.service.compute.GlobalInitScriptDetailsWithContent
created_at: int | None = None

Time when the script was created, represented as a Unix timestamp in milliseconds.

created_by: str | None = None

The username of the user who created the script.

enabled: bool | None = None

Specifies whether the script is enabled. The script runs only if enabled.

name: str | None = None

The name of the script

position: int | None = None

The position of a script, where 0 represents the first script to run, 1 is the second script to run, in ascending order.

script: str | None = None

The Base64-encoded content of the script.

script_id: str | None = None

The global init script ID.

updated_at: int | None = None

Time when the script was updated, represented as a Unix timestamp in milliseconds.

updated_by: str | None = None

The username of the user who last updated the script

as_dict() dict

Serializes the GlobalInitScriptDetailsWithContent into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) GlobalInitScriptDetailsWithContent

Deserializes the GlobalInitScriptDetailsWithContent from a dictionary.

class databricks.sdk.service.compute.GlobalInitScriptUpdateRequest
name: str

The name of the script

script: str

The Base64-encoded content of the script.

enabled: bool | None = None

Specifies whether the script is enabled. The script runs only if enabled.

position: int | None = None

The position of a script, where 0 represents the first script to run, 1 is the second script to run, in ascending order. To move the script to run first, set its position to 0.

To move the script to the end, set its position to any value greater than or equal to the position of the last script. For example, with three existing scripts at positions 0, 1, and 2, any position value of 2 or greater puts the script in the last position (2).

If an explicit position value conflicts with an existing script, your request succeeds, but the original script at that position and all later scripts have their positions incremented by 1.

script_id: str | None = None

The ID of the global init script.

as_dict() dict

Serializes the GlobalInitScriptUpdateRequest into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) GlobalInitScriptUpdateRequest

Deserializes the GlobalInitScriptUpdateRequest from a dictionary.

class databricks.sdk.service.compute.InitScriptEventDetails
cluster: List[InitScriptInfoAndExecutionDetails] | None = None

The cluster scoped init scripts associated with this cluster event

global_: List[InitScriptInfoAndExecutionDetails] | None = None

The global init scripts associated with this cluster event

reported_for_node: str | None = None

The private IP address of the node where the init scripts were run.

as_dict() dict

Serializes the InitScriptEventDetails into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) InitScriptEventDetails

Deserializes the InitScriptEventDetails from a dictionary.

class databricks.sdk.service.compute.InitScriptExecutionDetails
error_message: str | None = None

Additional details regarding errors.

execution_duration_seconds: int | None = None

The duration of the script execution in seconds.

status: InitScriptExecutionDetailsStatus | None = None

The current status of the script

as_dict() dict

Serializes the InitScriptExecutionDetails into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) InitScriptExecutionDetails

Deserializes the InitScriptExecutionDetails from a dictionary.

class databricks.sdk.service.compute.InitScriptExecutionDetailsStatus

The current status of the script

FAILED_EXECUTION = "FAILED_EXECUTION"
FAILED_FETCH = "FAILED_FETCH"
NOT_EXECUTED = "NOT_EXECUTED"
SKIPPED = "SKIPPED"
SUCCEEDED = "SUCCEEDED"
UNKNOWN = "UNKNOWN"
class databricks.sdk.service.compute.InitScriptInfo
abfss: Adlsgen2Info | None = None

destination needs to be provided. e.g. { “abfss” : { “destination” : “abfss://<container-name>@<storage-account-name>.dfs.core.windows.net/<directory-name>” } }

dbfs: DbfsStorageInfo | None = None

destination needs to be provided. e.g. { “dbfs” : { “destination” : “dbfs:/home/cluster_log” } }

file: LocalFileInfo | None = None

destination needs to be provided. e.g. { “file” : { “destination” : “file:/my/local/file.sh” } }

gcs: GcsStorageInfo | None = None

destination needs to be provided. e.g. { “gcs”: { “destination”: “gs://my-bucket/file.sh” } }

s3: S3StorageInfo | None = None

destination and either the region or endpoint need to be provided. e.g. { “s3”: { “destination” : “s3://cluster_log_bucket/prefix”, “region” : “us-west-2” } } The cluster IAM role is used to access S3; please make sure the cluster IAM role specified in instance_profile_arn has permission to write data to the S3 destination.

volumes: VolumesStorageInfo | None = None

destination needs to be provided. e.g. { “volumes” : { “destination” : “/Volumes/my-init.sh” } }

workspace: WorkspaceStorageInfo | None = None

destination needs to be provided. e.g. { “workspace” : { “destination” : “/Users/user1@databricks.com/my-init.sh” } }

as_dict() dict

Serializes the InitScriptInfo into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) InitScriptInfo

Deserializes the InitScriptInfo from a dictionary.
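
Typically only one destination field is set per InitScriptInfo entry. A sketch of an init_scripts list combining a workspace file and a volumes path (both paths are placeholders):

    from databricks.sdk.service import compute

    init_scripts = [
        compute.InitScriptInfo(
            workspace=compute.WorkspaceStorageInfo(
                destination="/Users/user1@databricks.com/my-init.sh",   # placeholder workspace path
            ),
        ),
        compute.InitScriptInfo(
            volumes=compute.VolumesStorageInfo(
                destination="/Volumes/my-init.sh",                      # placeholder volumes path
            ),
        ),
    ]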

class databricks.sdk.service.compute.InitScriptInfoAndExecutionDetails
execution_details: InitScriptExecutionDetails | None = None

Details about the script

script: InitScriptInfo | None = None

The script

as_dict() dict

Serializes the InitScriptInfoAndExecutionDetails into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) InitScriptInfoAndExecutionDetails

Deserializes the InitScriptInfoAndExecutionDetails from a dictionary.

class databricks.sdk.service.compute.InstallLibraries
cluster_id: str

Unique identifier for the cluster on which to install these libraries.

libraries: List[Library]

The libraries to install.

as_dict() dict

Serializes the InstallLibraries into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) InstallLibraries

Deserializes the InstallLibraries from a dictionary.
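
A hedged sketch of an install request whose libraries list contains a single PyPI package (assuming the Library and PythonPyPiLibrary dataclasses from this module; the cluster ID and package pin are placeholders). The same arguments can be passed to the workspace client's libraries.install method.

    from databricks.sdk.service import compute

    req = compute.InstallLibraries(
        cluster_id="0123-456789-abcdefgh",   # placeholder cluster ID
        libraries=[
            # Assumes the PythonPyPiLibrary dataclass for PyPI packages.
            compute.Library(pypi=compute.PythonPyPiLibrary(package="requests==2.31.0")),
        ],
    )

    body = req.as_dict()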

class databricks.sdk.service.compute.InstallLibrariesResponse
as_dict() dict

Serializes the InstallLibrariesResponse into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) InstallLibrariesResponse

Deserializes the InstallLibrariesResponse from a dictionary.

class databricks.sdk.service.compute.InstancePoolAccessControlRequest
group_name: str | None = None

name of the group

permission_level: InstancePoolPermissionLevel | None = None

Permission level

service_principal_name: str | None = None

application ID of a service principal

user_name: str | None = None

name of the user

as_dict() dict

Serializes the InstancePoolAccessControlRequest into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) InstancePoolAccessControlRequest

Deserializes the InstancePoolAccessControlRequest from a dictionary.

class databricks.sdk.service.compute.InstancePoolAccessControlResponse
all_permissions: List[InstancePoolPermission] | None = None

All permissions.

display_name: str | None = None

Display name of the user or service principal.

group_name: str | None = None

name of the group

service_principal_name: str | None = None

Name of the service principal.

user_name: str | None = None

name of the user

as_dict() dict

Serializes the InstancePoolAccessControlResponse into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) InstancePoolAccessControlResponse

Deserializes the InstancePoolAccessControlResponse from a dictionary.

class databricks.sdk.service.compute.InstancePoolAndStats
aws_attributes: InstancePoolAwsAttributes | None = None

Attributes related to instance pools running on Amazon Web Services. If not specified at pool creation, a set of default values will be used.

azure_attributes: InstancePoolAzureAttributes | None = None

Attributes related to instance pools running on Azure. If not specified at pool creation, a set of default values will be used.

custom_tags: Dict[str, str] | None = None

Additional tags for pool resources. Databricks will tag all pool resources (e.g., AWS instances and EBS volumes) with these tags in addition to default_tags. Notes:

  • Currently, Databricks allows at most 45 custom tags

default_tags: Dict[str, str] | None = None

Tags that are added by Databricks regardless of any custom_tags, including:

  • Vendor: Databricks

  • InstancePoolCreator: <user_id_of_creator>

  • InstancePoolName: <name_of_pool>

  • InstancePoolId: <id_of_pool>

disk_spec: DiskSpec | None = None

Defines the specification of the disks that will be attached to all spark containers.

enable_elastic_disk: bool | None = None

Autoscaling Local Storage: when enabled, the instances in this pool will dynamically acquire additional disk space when their Spark workers are running low on disk space. In AWS, this feature requires specific AWS permissions to function correctly - refer to the User Guide for more details.

gcp_attributes: InstancePoolGcpAttributes | None = None

Attributes related to instance pools running on Google Cloud Platform. If not specified at pool creation, a set of default values will be used.

idle_instance_autotermination_minutes: int | None = None

Automatically terminates the extra instances in the pool cache after they are inactive for this time in minutes if the min_idle_instances requirement is already met. If not set, the extra pool instances will be automatically terminated after a default timeout. If specified, the threshold must be between 0 and 10000 minutes. Users can also set this value to 0 to instantly remove idle instances from the cache, as long as the minimum cache size is still satisfied.

instance_pool_id: str | None = None

Canonical unique identifier for the pool.

instance_pool_name: str | None = None

Pool name requested by the user. Pool name must be unique. Length must be between 1 and 100 characters.

max_capacity: int | None = None

Maximum number of outstanding instances to keep in the pool, including both instances used by clusters and idle instances. Clusters that require further instance provisioning will fail during upsize requests.

min_idle_instances: int | None = None

Minimum number of idle instances to keep in the instance pool

node_type_id: str | None = None

This field encodes, through a single value, the resources available to each of the Spark nodes in this cluster. For example, the Spark nodes can be provisioned and optimized for memory or compute intensive workloads. A list of available node types can be retrieved by using the :method:clusters/listNodeTypes API call.

preloaded_docker_images: List[DockerImage] | None = None

Custom Docker Image BYOC

preloaded_spark_versions: List[str] | None = None

A list containing at most one preloaded Spark image version for the pool. Pool-backed clusters started with the preloaded Spark version will start faster. A list of available Spark versions can be retrieved by using the :method:clusters/sparkVersions API call.

state: InstancePoolState | None = None

Current state of the instance pool.

stats: InstancePoolStats | None = None

Usage statistics about the instance pool.

status: InstancePoolStatus | None = None

Status of failed pending instances in the pool.

as_dict() dict

Serializes the InstancePoolAndStats into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) InstancePoolAndStats

Deserializes the InstancePoolAndStats from a dictionary.

class databricks.sdk.service.compute.InstancePoolAwsAttributes
availability: InstancePoolAwsAttributesAvailability | None = None

Availability type used for the spot nodes.

The default value is defined by InstancePoolConf.instancePoolDefaultAwsAvailability

spot_bid_price_percent: int | None = None

Calculates the bid price for AWS spot instances, as a percentage of the corresponding instance type’s on-demand price. For example, if this field is set to 50, and the cluster needs a new r3.xlarge spot instance, then the bid price is half of the price of on-demand r3.xlarge instances. Similarly, if this field is set to 200, the bid price is twice the price of on-demand r3.xlarge instances. If not specified, the default value is 100. When spot instances are requested for this cluster, only spot instances whose bid price percentage matches this field will be considered. Note that, for safety, we enforce this field to be no more than 10000.

The default value and documentation here should be kept consistent with CommonConf.defaultSpotBidPricePercent and CommonConf.maxSpotBidPricePercent.

zone_id: str | None = None

Identifier for the availability zone/datacenter in which the cluster resides. This string will be of a form like “us-west-2a”. The provided availability zone must be in the same region as the Databricks deployment. For example, “us-west-2a” is not a valid zone id if the Databricks deployment resides in the “us-east-1” region. This is an optional field at cluster creation, and if not specified, a default zone will be used. The list of available zones as well as the default value can be found by using the List Zones method.

as_dict() dict

Serializes the InstancePoolAwsAttributes into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) InstancePoolAwsAttributes

Deserializes the InstancePoolAwsAttributes from a dictionary.
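
As an illustration, the snippet below builds an InstancePoolAwsAttributes value that requests spot capacity capped at the on-demand price and pinned to a single zone, using the availability enum documented just below. The zone ID is a placeholder and must match your deployment's region.

    from databricks.sdk.service import compute

    # Spot capacity capped at the on-demand price, pinned to one zone.
    # The zone ID below is a placeholder.
    aws_attrs = compute.InstancePoolAwsAttributes(
        availability=compute.InstancePoolAwsAttributesAvailability.SPOT,
        spot_bid_price_percent=100,
        zone_id="us-west-2a",
    )

    print(aws_attrs.as_dict())
    # e.g. {'availability': 'SPOT', 'spot_bid_price_percent': 100, 'zone_id': 'us-west-2a'}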

class databricks.sdk.service.compute.InstancePoolAwsAttributesAvailability

Availability type used for the spot nodes. The default value is defined by InstancePoolConf.instancePoolDefaultAwsAvailability

ON_DEMAND = "ON_DEMAND"
SPOT = "SPOT"
class databricks.sdk.service.compute.InstancePoolAzureAttributes
availability: InstancePoolAzureAttributesAvailability | None = None

Shows the Availability type used for the spot nodes.

The default value is defined by InstancePoolConf.instancePoolDefaultAzureAvailability

spot_bid_max_price: float | None = None

The default value and documentation here should be kept consistent with CommonConf.defaultSpotBidMaxPrice.

as_dict() dict

Serializes the InstancePoolAzureAttributes into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) InstancePoolAzureAttributes

Deserializes the InstancePoolAzureAttributes from a dictionary.

class databricks.sdk.service.compute.InstancePoolAzureAttributesAvailability

Shows the Availability type used for the spot nodes. The default value is defined by InstancePoolConf.instancePoolDefaultAzureAvailability

ON_DEMAND_AZURE = "ON_DEMAND_AZURE"
SPOT_AZURE = "SPOT_AZURE"
class databricks.sdk.service.compute.InstancePoolGcpAttributes
gcp_availability: GcpAvailability | None = None

This field determines whether the instance pool will contain preemptible VMs, on-demand VMs, or preemptible VMs with a fallback to on-demand VMs if the former is unavailable.

local_ssd_count: int | None = None

If provided, each node in the instance pool will have this number of local SSDs attached. Each local SSD is 375GB in size. Refer to [GCP documentation] for the supported number of local SSDs for each instance type.

[GCP documentation]: https://cloud.google.com/compute/docs/disks/local-ssd#choose_number_local_ssds

zone_id: str | None = None

Identifier for the availability zone/datacenter in which the cluster resides. This string will be of a form like “us-west1-a”. The provided availability zone must be in the same region as the Databricks workspace. For example, “us-west1-a” is not a valid zone id if the Databricks workspace resides in the “us-east1” region. This is an optional field at instance pool creation, and if not specified, a default zone will be used.

This field can be one of the following:

  • “HA” => High availability; spread nodes across availability zones for a Databricks deployment region

  • A GCP availability zone => Pick one of the available zones for (machine type + region) from https://cloud.google.com/compute/docs/regions-zones (e.g. “us-west1-a”).

If empty, Databricks picks an availability zone to schedule the cluster on.

as_dict() dict

Serializes the InstancePoolGcpAttributes into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) InstancePoolGcpAttributes

Deserializes the InstancePoolGcpAttributes from a dictionary.
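
A minimal sketch of GCP pool attributes follows, assuming the GcpAvailability enum documented elsewhere in this module exposes an ON_DEMAND_GCP member; the zone ID and SSD count are placeholders.

    from databricks.sdk.service import compute

    # One local SSD per node in a fixed zone; ON_DEMAND_GCP is assumed to be a
    # member of the GcpAvailability enum documented elsewhere in this module.
    gcp_attrs = compute.InstancePoolGcpAttributes(
        gcp_availability=compute.GcpAvailability.ON_DEMAND_GCP,
        local_ssd_count=1,
        zone_id="us-west1-a",
    )
    print(gcp_attrs.as_dict())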

class databricks.sdk.service.compute.InstancePoolPermission
inherited: bool | None = None
inherited_from_object: List[str] | None = None
permission_level: InstancePoolPermissionLevel | None = None

Permission level

as_dict() dict

Serializes the InstancePoolPermission into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) InstancePoolPermission

Deserializes the InstancePoolPermission from a dictionary.

class databricks.sdk.service.compute.InstancePoolPermissionLevel

Permission level

CAN_ATTACH_TO = "CAN_ATTACH_TO"
CAN_MANAGE = "CAN_MANAGE"
class databricks.sdk.service.compute.InstancePoolPermissions
access_control_list: List[InstancePoolAccessControlResponse] | None = None
object_id: str | None = None
object_type: str | None = None
as_dict() dict

Serializes the InstancePoolPermissions into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) InstancePoolPermissions

Deserializes the InstancePoolPermissions from a dictionary.

class databricks.sdk.service.compute.InstancePoolPermissionsDescription
description: str | None = None
permission_level: InstancePoolPermissionLevel | None = None

Permission level

as_dict() dict

Serializes the InstancePoolPermissionsDescription into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) InstancePoolPermissionsDescription

Deserializes the InstancePoolPermissionsDescription from a dictionary.

class databricks.sdk.service.compute.InstancePoolPermissionsRequest
access_control_list: List[InstancePoolAccessControlRequest] | None = None
instance_pool_id: str | None = None

The instance pool for which to get or manage permissions.

as_dict() dict

Serializes the InstancePoolPermissionsRequest into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) InstancePoolPermissionsRequest

Deserializes the InstancePoolPermissionsRequest from a dictionary.
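
For example, a permissions request that grants a group attach rights on a pool could be built as sketched below. The pool ID and group name are placeholders, and the InstancePoolAccessControlRequest fields mirror the access-control types documented earlier in this module.

    from databricks.sdk.service import compute

    # Grant a group CAN_ATTACH_TO on an instance pool (IDs and names are placeholders).
    request = compute.InstancePoolPermissionsRequest(
        instance_pool_id="0123-456789-pool1",
        access_control_list=[
            compute.InstancePoolAccessControlRequest(
                group_name="data-engineers",
                permission_level=compute.InstancePoolPermissionLevel.CAN_ATTACH_TO,
            )
        ],
    )
    print(request.as_dict())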

class databricks.sdk.service.compute.InstancePoolState

Current state of the instance pool.

ACTIVE = "ACTIVE"
DELETED = "DELETED"
STOPPED = "STOPPED"
class databricks.sdk.service.compute.InstancePoolStats
idle_count: int | None = None

Number of active instances in the pool that are NOT part of a cluster.

pending_idle_count: int | None = None

Number of pending instances in the pool that are NOT part of a cluster.

pending_used_count: int | None = None

Number of pending instances in the pool that are part of a cluster.

used_count: int | None = None

Number of active instances in the pool that are part of a cluster.

as_dict() dict

Serializes the InstancePoolStats into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) InstancePoolStats

Deserializes the InstancePoolStats from a dictionary.
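
As a quick illustration of how these counters relate, the sketch below deserializes a stats payload (the numbers are made up) and reports how many active instances are attached to clusters.

    from databricks.sdk.service import compute

    # Illustrative stats payload as it might appear in an instance pool response.
    stats = compute.InstancePoolStats.from_dict(
        {"idle_count": 3, "used_count": 5, "pending_idle_count": 1, "pending_used_count": 2}
    )

    active = (stats.idle_count or 0) + (stats.used_count or 0)
    print(f"{stats.used_count} of {active} active instances are attached to clusters")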

class databricks.sdk.service.compute.InstancePoolStatus
pending_instance_errors: List[PendingInstanceError] | None = None

List of error messages for the failed pending instances. The pending_instance_errors list follows FIFO ordering, with a maximum length equal to the min_idle of the pool. The pending_instance_errors list is emptied once the number of existing available instances reaches the min_idle of the pool.

as_dict() dict

Serializes the InstancePoolStatus into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) InstancePoolStatus

Deserializes the InstancePoolStatus from a dictionary.

class databricks.sdk.service.compute.InstanceProfile
instance_profile_arn: str

The AWS ARN of the instance profile to register with Databricks. This field is required.

iam_role_arn: str | None = None

The AWS IAM role ARN of the role associated with the instance profile. This field is required if your role name and instance profile name do not match and you want to use the instance profile with [Databricks SQL Serverless].

Otherwise, this field is optional.

[Databricks SQL Serverless]: https://docs.databricks.com/sql/admin/serverless.html

is_meta_instance_profile: bool | None = None

Boolean flag indicating whether the instance profile should only be used in credential passthrough scenarios. If true, it means the instance profile contains a meta IAM role which could assume a wide range of roles. Therefore it should always be used with authorization. This field is optional; the default value is false.

as_dict() dict

Serializes the InstanceProfile into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) InstanceProfile

Deserializes the InstanceProfile from a dictionary.

class databricks.sdk.service.compute.Language
PYTHON = "PYTHON"
SCALA = "SCALA"
SQL = "SQL"
class databricks.sdk.service.compute.Library
cran: RCranLibrary | None = None

Specification of a CRAN library to be installed as part of the library

egg: str | None = None

URI of the egg library to install. Supported URIs include Workspace paths, Unity Catalog Volumes paths, and S3 URIs. For example: { “egg”: “/Workspace/path/to/library.egg” }, { “egg” : “/Volumes/path/to/library.egg” } or { “egg”: “s3://my-bucket/library.egg” }. If S3 is used, please make sure the cluster has read access on the library. You may need to launch the cluster with an IAM role to access the S3 URI.

jar: str | None = None

URI of the JAR library to install. Supported URIs include Workspace paths, Unity Catalog Volumes paths, and S3 URIs. For example: { “jar”: “/Workspace/path/to/library.jar” }, { “jar” : “/Volumes/path/to/library.jar” } or { “jar”: “s3://my-bucket/library.jar” }. If S3 is used, please make sure the cluster has read access on the library. You may need to launch the cluster with an IAM role to access the S3 URI.

maven: MavenLibrary | None = None

Specification of a maven library to be installed. For example: { “coordinates”: “org.jsoup:jsoup:1.7.2” }

pypi: PythonPyPiLibrary | None = None

Specification of a PyPi library to be installed. For example: { “package”: “simplejson” }

requirements: str | None = None

URI of the requirements.txt file to install. Only Workspace paths and Unity Catalog Volumes paths are supported. For example: { “requirements”: “/Workspace/path/to/requirements.txt” } or { “requirements” : “/Volumes/path/to/requirements.txt” }

whl: str | None = None

URI of the wheel library to install. Supported URIs include Workspace paths, Unity Catalog Volumes paths, and S3 URIs. For example: { “whl”: “/Workspace/path/to/library.whl” }, { “whl” : “/Volumes/path/to/library.whl” } or { “whl”: “s3://my-bucket/library.whl” }. If S3 is used, please make sure the cluster has read access on the library. You may need to launch the cluster with an IAM role to access the S3 URI.

as_dict() dict

Serializes the Library into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) Library

Deserializes the Library from a dictionary.
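
The sketch below builds a few common library specifications using the fields documented above; the package names and paths are placeholders.

    from databricks.sdk.service import compute

    # Illustrative library specifications (package names and paths are placeholders).
    libraries = [
        compute.Library(pypi=compute.PythonPyPiLibrary(package="simplejson==3.8.0")),
        compute.Library(maven=compute.MavenLibrary(coordinates="org.jsoup:jsoup:1.7.2")),
        compute.Library(whl="/Volumes/main/default/libs/my_lib-0.1-py3-none-any.whl"),
        compute.Library(requirements="/Workspace/Shared/requirements.txt"),
    ]

    for lib in libraries:
        print(lib.as_dict())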

class databricks.sdk.service.compute.LibraryFullStatus

The status of the library on a specific cluster.

is_library_for_all_clusters: bool | None = None

Whether the library was set to be installed on all clusters via the libraries UI.

library: Library | None = None

Unique identifier for the library.

messages: List[str] | None = None

All the info and warning messages that have occurred so far for this library.

status: LibraryInstallStatus | None = None

Status of installing the library on the cluster.

as_dict() dict

Serializes the LibraryFullStatus into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) LibraryFullStatus

Deserializes the LibraryFullStatus from a dictionary.

class databricks.sdk.service.compute.LibraryInstallStatus

The status of a library on a specific cluster.

FAILED = "FAILED"
INSTALLED = "INSTALLED"
INSTALLING = "INSTALLING"
PENDING = "PENDING"
RESOLVING = "RESOLVING"
RESTORED = "RESTORED"
SKIPPED = "SKIPPED"
UNINSTALL_ON_RESTART = "UNINSTALL_ON_RESTART"
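
As an illustration, the sketch below deserializes a library status payload (the payload is made up) and checks it against the enum above.

    from databricks.sdk.service import compute

    # Illustrative per-library status, as it might come back from the Libraries API.
    status = compute.LibraryFullStatus.from_dict({
        "library": {"pypi": {"package": "simplejson==3.8.0"}},
        "status": "FAILED",
        "messages": ["Could not resolve the package"],
    })

    if status.status == compute.LibraryInstallStatus.FAILED:
        print("install failed:", "; ".join(status.messages or []))
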
class databricks.sdk.service.compute.ListAllClusterLibraryStatusesResponse
statuses: List[ClusterLibraryStatuses] | None = None

A list of cluster statuses.

as_dict() dict

Serializes the ListAllClusterLibraryStatusesResponse into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) ListAllClusterLibraryStatusesResponse

Deserializes the ListAllClusterLibraryStatusesResponse from a dictionary.

class databricks.sdk.service.compute.ListAvailableZonesResponse
default_zone: str | None = None

The availability zone if no zone_id is provided in the cluster creation request.

zones: List[str] | None = None

The list of available zones (e.g., [‘us-west-2c’, ‘us-east-2’]).

as_dict() dict

Serializes the ListAvailableZonesResponse into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) ListAvailableZonesResponse

Deserializes the ListAvailableZonesResponse from a dictionary.

class databricks.sdk.service.compute.ListClustersResponse
clusters: List[ClusterDetails] | None = None

The list of clusters.

as_dict() dict

Serializes the ListClustersResponse into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) ListClustersResponse

Deserializes the ListClustersResponse from a dictionary.

class databricks.sdk.service.compute.ListGlobalInitScriptsResponse
scripts: List[GlobalInitScriptDetails] | None = None
as_dict() dict

Serializes the ListGlobalInitScriptsResponse into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) ListGlobalInitScriptsResponse

Deserializes the ListGlobalInitScriptsResponse from a dictionary.

class databricks.sdk.service.compute.ListInstancePools
instance_pools: List[InstancePoolAndStats] | None = None
as_dict() dict

Serializes the ListInstancePools into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) ListInstancePools

Deserializes the ListInstancePools from a dictionary.

class databricks.sdk.service.compute.ListInstanceProfilesResponse
instance_profiles: List[InstanceProfile] | None = None

A list of instance profiles that the user can access.

as_dict() dict

Serializes the ListInstanceProfilesResponse into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) ListInstanceProfilesResponse

Deserializes the ListInstanceProfilesResponse from a dictionary.

class databricks.sdk.service.compute.ListNodeTypesResponse
node_types: List[NodeType] | None = None

The list of available Spark node types.

as_dict() dict

Serializes the ListNodeTypesResponse into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) ListNodeTypesResponse

Deserializes the ListNodeTypesResponse from a dictionary.

class databricks.sdk.service.compute.ListPoliciesResponse
policies: List[Policy] | None = None

List of policies.

as_dict() dict

Serializes the ListPoliciesResponse into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) ListPoliciesResponse

Deserializes the ListPoliciesResponse from a dictionary.

class databricks.sdk.service.compute.ListPolicyFamiliesResponse
policy_families: List[PolicyFamily]

List of policy families.

next_page_token: str | None = None

A token that can be used to get the next page of results. If not present, there are no more results to show.

as_dict() dict

Serializes the ListPolicyFamiliesResponse into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) ListPolicyFamiliesResponse

Deserializes the ListPolicyFamiliesResponse from a dictionary.
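
A minimal sketch of consuming one page of policy families; the payload is illustrative, and real listings may span multiple pages signalled by next_page_token.

    from databricks.sdk.service import compute

    # One illustrative page of policy families.
    page = compute.ListPolicyFamiliesResponse.from_dict({
        "policy_families": [
            {"policy_family_id": "personal-vm", "name": "Personal Compute",
             "description": "Single-node personal clusters", "definition": "{}"}
        ],
        "next_page_token": None,
    })

    for family in page.policy_families:
        print(family.policy_family_id, family.name)
    if page.next_page_token:
        print("more pages are available")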

class databricks.sdk.service.compute.ListSortColumn
POLICY_CREATION_TIME = "POLICY_CREATION_TIME"
POLICY_NAME = "POLICY_NAME"
class databricks.sdk.service.compute.ListSortOrder
ASC = "ASC"
DESC = "DESC"
class databricks.sdk.service.compute.LocalFileInfo
destination: str

Local file destination, e.g. file:/my/local/file.sh

as_dict() dict

Serializes the LocalFileInfo into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) LocalFileInfo

Deserializes the LocalFileInfo from a dictionary.

class databricks.sdk.service.compute.LogAnalyticsInfo
log_analytics_primary_key: str | None = None

The primary key for the Log Analytics workspace.

log_analytics_workspace_id: str | None = None

The workspace ID of the Log Analytics workspace.

as_dict() dict

Serializes the LogAnalyticsInfo into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) LogAnalyticsInfo

Deserializes the LogAnalyticsInfo from a dictionary.

class databricks.sdk.service.compute.LogSyncStatus
last_attempted: int | None = None

The timestamp of the last attempt. If the last attempt fails, last_exception will contain the exception from that attempt.

last_exception: str | None = None

The exception thrown in the last attempt; it will be null (omitted in the response) if there was no exception in the last attempt.

as_dict() dict

Serializes the LogSyncStatus into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) LogSyncStatus

Deserializes the LogSyncStatus from a dictionary.

class databricks.sdk.service.compute.MavenLibrary
coordinates: str

Gradle-style maven coordinates. For example: “org.jsoup:jsoup:1.7.2”.

exclusions: List[str] | None = None

List of dependencies to exclude. For example: [“slf4j:slf4j”, “*:hadoop-client”].

Maven dependency exclusions: https://maven.apache.org/guides/introduction/introduction-to-optional-and-excludes-dependencies.html.

repo: str | None = None

Maven repo to install the Maven package from. If omitted, both Maven Central Repository and Spark Packages are searched.

as_dict() dict

Serializes the MavenLibrary into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) MavenLibrary

Deserializes the MavenLibrary from a dictionary.

class databricks.sdk.service.compute.NodeInstanceType
instance_type_id: str | None = None
local_disk_size_gb: int | None = None
local_disks: int | None = None
local_nvme_disk_size_gb: int | None = None
local_nvme_disks: int | None = None
as_dict() dict

Serializes the NodeInstanceType into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) NodeInstanceType

Deserializes the NodeInstanceType from a dictionary.

class databricks.sdk.service.compute.NodeType
node_type_id: str

Unique identifier for this node type.

memory_mb: int

Memory (in MB) available for this node type.

num_cores: float

Number of CPU cores available for this node type. Note that this can be fractional, e.g., 2.5 cores, if the number of cores on a machine instance is not divisible by the number of Spark nodes on that machine.

description: str

A string description associated with this node type, e.g., “r3.xlarge”.

instance_type_id: str

An identifier for the type of hardware that this node runs on, e.g., “r3.2xlarge” in AWS.

category: str | None = None
display_order: int | None = None
is_deprecated: bool | None = None

Whether the node type is deprecated. Non-deprecated node types offer greater performance.

is_encrypted_in_transit: bool | None = None

AWS specific, whether this instance supports encryption in transit, used for hipaa and pci workloads.

is_graviton: bool | None = None
is_hidden: bool | None = None
is_io_cache_enabled: bool | None = None
node_info: CloudProviderNodeInfo | None = None
node_instance_type: NodeInstanceType | None = None
num_gpus: int | None = None
photon_driver_capable: bool | None = None
photon_worker_capable: bool | None = None
support_cluster_tags: bool | None = None
support_ebs_volumes: bool | None = None
support_port_forwarding: bool | None = None
supports_elastic_disk: bool | None = None

Indicates if this node type can be used for an instance pool or cluster with elastic disk enabled. This is true for most node types.

as_dict() dict

Serializes the NodeType into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) NodeType

Deserializes the NodeType from a dictionary.
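
As an illustration, the sketch below filters a made-up node type listing down to non-deprecated types with at least 4 cores and 16 GB of memory, using only the fields documented above.

    from databricks.sdk.service import compute

    # Illustrative node type listing.
    listing = compute.ListNodeTypesResponse.from_dict({
        "node_types": [
            {"node_type_id": "m5.xlarge", "memory_mb": 16384, "num_cores": 4,
             "description": "m5.xlarge", "instance_type_id": "m5.xlarge"},
            {"node_type_id": "r3.xlarge", "memory_mb": 31232, "num_cores": 4,
             "description": "r3.xlarge", "instance_type_id": "r3.xlarge",
             "is_deprecated": True},
        ]
    })

    candidates = [
        nt for nt in (listing.node_types or [])
        if not nt.is_deprecated and nt.num_cores >= 4 and nt.memory_mb >= 16 * 1024
    ]
    print([nt.node_type_id for nt in candidates])  # ['m5.xlarge']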

class databricks.sdk.service.compute.PendingInstanceError
instance_id: str | None = None
message: str | None = None
as_dict() dict

Serializes the PendingInstanceError into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) PendingInstanceError

Deserializes the PendingInstanceError from a dictionary.

class databricks.sdk.service.compute.PermanentDeleteCluster
cluster_id: str

The cluster to be deleted.

as_dict() dict

Serializes the PermanentDeleteCluster into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) PermanentDeleteCluster

Deserializes the PermanentDeleteCluster from a dictionary.

class databricks.sdk.service.compute.PermanentDeleteClusterResponse
as_dict() dict

Serializes the PermanentDeleteClusterResponse into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) PermanentDeleteClusterResponse

Deserializes the PermanentDeleteClusterResponse from a dictionary.

class databricks.sdk.service.compute.PinCluster
cluster_id: str

The cluster to be pinned.

as_dict() dict

Serializes the PinCluster into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) PinCluster

Deserializes the PinCluster from a dictionary.

class databricks.sdk.service.compute.PinClusterResponse
as_dict() dict

Serializes the PinClusterResponse into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) PinClusterResponse

Deserializes the PinClusterResponse from a dictionary.

class databricks.sdk.service.compute.Policy
created_at_timestamp: int | None = None

Creation time. The timestamp (in millisecond) when this Cluster Policy was created.

creator_user_name: str | None = None

Creator user name. The field won’t be included in the response if the user has already been deleted.

definition: str | None = None

Policy definition document expressed in [Databricks Cluster Policy Definition Language].

[Databricks Cluster Policy Definition Language]: https://docs.databricks.com/administration-guide/clusters/policy-definition.html

description: str | None = None

Additional human-readable description of the cluster policy.

is_default: bool | None = None

If true, policy is a default policy created and managed by Databricks. Default policies cannot be deleted, and their policy families cannot be changed.

libraries: List[Library] | None = None

A list of libraries to be installed on the next cluster restart that uses this policy. The maximum number of libraries is 500.

max_clusters_per_user: int | None = None

Max number of clusters per user that can be active using this policy. If not present, there is no max limit.

name: str | None = None

Cluster Policy name requested by the user. This has to be unique. Length must be between 1 and 100 characters.

policy_family_definition_overrides: str | None = None

Policy definition JSON document expressed in [Databricks Policy Definition Language]. The JSON document must be passed as a string and cannot be embedded in the requests.

You can use this to customize the policy definition inherited from the policy family. Policy rules specified here are merged into the inherited policy definition.

[Databricks Policy Definition Language]: https://docs.databricks.com/administration-guide/clusters/policy-definition.html

policy_family_id: str | None = None

ID of the policy family.

policy_id: str | None = None

Canonical unique identifier for the Cluster Policy.

as_dict() dict

Serializes the Policy into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) Policy

Deserializes the Policy from a dictionary.
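
Because the policy definition is carried as a JSON string, it can be parsed with the standard json module; the sketch below uses a made-up policy.

    import json

    from databricks.sdk.service import compute

    # A made-up policy; the definition is a JSON string in the policy definition language.
    policy = compute.Policy.from_dict({
        "policy_id": "ABC123",
        "name": "small-clusters-only",
        "definition": json.dumps({
            "spark_version": {"type": "fixed", "value": "auto:latest-lts"},
            "node_type_id": {"type": "allowlist", "values": ["m5.large", "m5.xlarge"]},
        }),
    })

    rules = json.loads(policy.definition)
    print(rules["node_type_id"]["values"])  # ['m5.large', 'm5.xlarge']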

class databricks.sdk.service.compute.PolicyFamily
policy_family_id: str

ID of the policy family.

name: str

Name of the policy family.

description: str

Human-readable description of the purpose of the policy family.

definition: str

Policy definition document expressed in [Databricks Cluster Policy Definition Language].

[Databricks Cluster Policy Definition Language]: https://docs.databricks.com/administration-guide/clusters/policy-definition.html

as_dict() dict

Serializes the PolicyFamily into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) PolicyFamily

Deserializes the PolicyFamily from a dictionary.

class databricks.sdk.service.compute.PythonPyPiLibrary
package: str

The name of the pypi package to install. An optional exact version specification is also supported. Examples: “simplejson” and “simplejson==3.8.0”.

repo: str | None = None

The repository where the package can be found. If not specified, the default pip index is used.

as_dict() dict

Serializes the PythonPyPiLibrary into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) PythonPyPiLibrary

Deserializes the PythonPyPiLibrary from a dictionary.

class databricks.sdk.service.compute.RCranLibrary
package: str

The name of the CRAN package to install.

repo: str | None = None

The repository where the package can be found. If not specified, the default CRAN repo is used.

as_dict() dict

Serializes the RCranLibrary into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) RCranLibrary

Deserializes the RCranLibrary from a dictionary.

class databricks.sdk.service.compute.RemoveInstanceProfile
instance_profile_arn: str

The ARN of the instance profile to remove. This field is required.

as_dict() dict

Serializes the RemoveInstanceProfile into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) RemoveInstanceProfile

Deserializes the RemoveInstanceProfile from a dictionary.

class databricks.sdk.service.compute.RemoveResponse
as_dict() dict

Serializes the RemoveResponse into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) RemoveResponse

Deserializes the RemoveResponse from a dictionary.

class databricks.sdk.service.compute.ResizeCluster
cluster_id: str

The cluster to be resized.

autoscale: AutoScale | None = None

Parameters needed in order to automatically scale clusters up and down based on load. Note: autoscaling works best with DB runtime versions 3.0 or later.

num_workers: int | None = None

Number of worker nodes that this cluster should have. A cluster has one Spark Driver and num_workers Executors for a total of num_workers + 1 Spark nodes.

Note: When reading the properties of a cluster, this field reflects the desired number of workers rather than the actual current number of workers. For instance, if a cluster is resized from 5 to 10 workers, this field will immediately be updated to reflect the target size of 10 workers, whereas the workers listed in spark_info will gradually increase from 5 to 10 as the new nodes are provisioned.

as_dict() dict

Serializes the ResizeCluster into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) ResizeCluster

Deserializes the ResizeCluster from a dictionary.
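
autoscale and num_workers describe two alternative sizing modes; the sketch below builds one request body of each kind for a placeholder cluster ID.

    from databricks.sdk.service import compute

    # Fixed-size resize vs. autoscaling resize (the cluster ID is a placeholder).
    fixed = compute.ResizeCluster(cluster_id="0123-456789-abcde123", num_workers=10)
    autoscaling = compute.ResizeCluster(
        cluster_id="0123-456789-abcde123",
        autoscale=compute.AutoScale(min_workers=2, max_workers=8),
    )

    print(fixed.as_dict())
    print(autoscaling.as_dict())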

class databricks.sdk.service.compute.ResizeClusterResponse
as_dict() dict

Serializes the ResizeClusterResponse into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) ResizeClusterResponse

Deserializes the ResizeClusterResponse from a dictionary.

class databricks.sdk.service.compute.RestartCluster
cluster_id: str

The cluster to be restarted.

restart_user: str | None = None

<needs content added>

as_dict() dict

Serializes the RestartCluster into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) RestartCluster

Deserializes the RestartCluster from a dictionary.

class databricks.sdk.service.compute.RestartClusterResponse
as_dict() dict

Serializes the RestartClusterResponse into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) RestartClusterResponse

Deserializes the RestartClusterResponse from a dictionary.

class databricks.sdk.service.compute.ResultType
ERROR = "ERROR"
IMAGE = "IMAGE"
IMAGES = "IMAGES"
TABLE = "TABLE"
TEXT = "TEXT"
class databricks.sdk.service.compute.Results
cause: str | None = None

The cause of the error

data: Any | None = None
file_name: str | None = None

The image filename

file_names: List[str] | None = None
is_json_schema: bool | None = None

true if a JSON schema is returned instead of a string representation of the Hive type.

pos: int | None = None

Internal field used by the SDK.

result_type: ResultType | None = None
schema: List[Dict[str, Any]] | None = None

The table schema

summary: str | None = None

The summary of the error

truncated: bool | None = None

true if partial results are returned.

as_dict() dict

Serializes the Results into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) Results

Deserializes the Results from a dictionary.

class databricks.sdk.service.compute.RuntimeEngine

Decides which runtime engine to use, e.g. Standard vs. Photon. If unspecified, the runtime engine is inferred from spark_version.

NULL = "NULL"
PHOTON = "PHOTON"
STANDARD = "STANDARD"
class databricks.sdk.service.compute.S3StorageInfo
destination: str

S3 destination, e.g. s3://my-bucket/some-prefix. Note that logs will be delivered using the cluster IAM role, so please make sure you set a cluster IAM role and that the role has write access to the destination. Please also note that you cannot use AWS keys to deliver logs.

canned_acl: str | None = None

(Optional) Set canned access control list for the logs, e.g. bucket-owner-full-control. If canned_acl is set, please make sure the cluster IAM role has the s3:PutObjectAcl permission on the destination bucket and prefix. The full list of possible canned ACLs can be found at http://docs.aws.amazon.com/AmazonS3/latest/dev/acl-overview.html#canned-acl. Please also note that by default only the object owner gets full control. If you are using a cross-account role for writing data, you may want to set bucket-owner-full-control so that the bucket owner is able to read the logs.

enable_encryption: bool | None = None

(Optional) Flag to enable server side encryption, false by default.

encryption_type: str | None = None

(Optional) The encryption type; it can be sse-s3 or sse-kms. It is used only when encryption is enabled; the default type is sse-s3.

endpoint: str | None = None

S3 endpoint, e.g. https://s3-us-west-2.amazonaws.com. Either region or endpoint needs to be set. If both are set, endpoint will be used.

kms_key: str | None = None

(Optional) KMS key used if encryption is enabled and the encryption type is set to sse-kms.

region: str | None = None

S3 region, e.g. us-west-2. Either region or endpoint needs to be set. If both are set, endpoint will be used.

as_dict() dict

Serializes the S3StorageInfo into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) S3StorageInfo

Deserializes the S3StorageInfo from a dictionary.
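
A sketch of an S3 log destination with SSE-KMS encryption follows; the bucket, region, and key ARN are placeholders, and setting region (rather than endpoint) is sufficient here.

    from databricks.sdk.service import compute

    # S3 cluster-log destination with SSE-KMS (bucket, region, and key are placeholders).
    log_destination = compute.S3StorageInfo(
        destination="s3://my-bucket/cluster-logs",
        region="us-west-2",
        enable_encryption=True,
        encryption_type="sse-kms",
        kms_key="arn:aws:kms:us-west-2:111122223333:key/example-key-id",
        canned_acl="bucket-owner-full-control",
    )
    print(log_destination.as_dict())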

class databricks.sdk.service.compute.SparkNode
host_private_ip: str | None = None

The private IP address of the host instance.

instance_id: str | None = None

Globally unique identifier for the host instance from the cloud provider.

node_aws_attributes: SparkNodeAwsAttributes | None = None

Attributes specific to AWS for a Spark node.

node_id: str | None = None

Globally unique identifier for this node.

private_ip: str | None = None

Private IP address (typically a 10.x.x.x address) of the Spark node. Note that this is different from the private IP address of the host instance.

public_dns: str | None = None

Public DNS address of this node. This address can be used to access the Spark JDBC server on the driver node. To communicate with the JDBC server, traffic must be manually authorized by adding security group rules to the “worker-unmanaged” security group via the AWS console.

In practice, this is the public DNS address of the host instance.

start_timestamp: int | None = None

The timestamp (in millisecond) when the Spark node is launched.

The start_timestamp is set right before the container is launched, i.e. when the container is placed on the ResourceManager, before its launch and setup by the NodeDaemon. This timestamp is the same as the creation timestamp in the database.

as_dict() dict

Serializes the SparkNode into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) SparkNode

Deserializes the SparkNode from a dictionary.

class databricks.sdk.service.compute.SparkNodeAwsAttributes
is_spot: bool | None = None

Whether this node is on an Amazon spot instance.

as_dict() dict

Serializes the SparkNodeAwsAttributes into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) SparkNodeAwsAttributes

Deserializes the SparkNodeAwsAttributes from a dictionary.

class databricks.sdk.service.compute.SparkVersion
key: str | None = None

Spark version key, for example “2.1.x-scala2.11”. This is the value which should be provided as the “spark_version” when creating a new cluster. Note that the exact Spark version may change over time for a “wildcard” version (i.e., “2.1.x-scala2.11” is a “wildcard” version) with minor bug fixes.

name: str | None = None

A descriptive name for this Spark version, for example “Spark 2.1”.

as_dict() dict

Serializes the SparkVersion into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) SparkVersion

Deserializes the SparkVersion from a dictionary.

class databricks.sdk.service.compute.StartCluster
cluster_id: str

The cluster to be started.

as_dict() dict

Serializes the StartCluster into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) StartCluster

Deserializes the StartCluster from a dictionary.

class databricks.sdk.service.compute.StartClusterResponse
as_dict() dict

Serializes the StartClusterResponse into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) StartClusterResponse

Deserializes the StartClusterResponse from a dictionary.

class databricks.sdk.service.compute.State

Current state of the cluster.

ERROR = "ERROR"
PENDING = "PENDING"
RESIZING = "RESIZING"
RESTARTING = "RESTARTING"
RUNNING = "RUNNING"
TERMINATED = "TERMINATED"
TERMINATING = "TERMINATING"
UNKNOWN = "UNKNOWN"
class databricks.sdk.service.compute.TerminationReason
code: TerminationReasonCode | None = None

status code indicating why the cluster was terminated

parameters: Dict[str, str] | None = None

list of parameters that provide additional information about why the cluster was terminated

type: TerminationReasonType | None = None

type of the termination

as_dict() dict

Serializes the TerminationReason into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) TerminationReason

Deserializes the TerminationReason from a dictionary.
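
As an illustration, the sketch below deserializes a termination reason (the payload, including the parameter key, is made up) and branches on its type.

    from databricks.sdk.service import compute

    # Made-up termination payload; the parameter key is illustrative.
    reason = compute.TerminationReason.from_dict({
        "code": "INACTIVITY",
        "type": "SUCCESS",
        "parameters": {"inactivity_duration_min": "120"},
    })

    if reason.type == compute.TerminationReasonType.SUCCESS:
        print("terminated normally:", reason.code.value)
    else:
        print("terminated abnormally:", reason.code.value, reason.parameters)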

class databricks.sdk.service.compute.TerminationReasonCode

status code indicating why the cluster was terminated

ABUSE_DETECTED = "ABUSE_DETECTED"
ATTACH_PROJECT_FAILURE = "ATTACH_PROJECT_FAILURE"
AWS_AUTHORIZATION_FAILURE = "AWS_AUTHORIZATION_FAILURE"
AWS_INSUFFICIENT_FREE_ADDRESSES_IN_SUBNET_FAILURE = "AWS_INSUFFICIENT_FREE_ADDRESSES_IN_SUBNET_FAILURE"
AWS_INSUFFICIENT_INSTANCE_CAPACITY_FAILURE = "AWS_INSUFFICIENT_INSTANCE_CAPACITY_FAILURE"
AWS_MAX_SPOT_INSTANCE_COUNT_EXCEEDED_FAILURE = "AWS_MAX_SPOT_INSTANCE_COUNT_EXCEEDED_FAILURE"
AWS_REQUEST_LIMIT_EXCEEDED = "AWS_REQUEST_LIMIT_EXCEEDED"
AWS_UNSUPPORTED_FAILURE = "AWS_UNSUPPORTED_FAILURE"
AZURE_BYOK_KEY_PERMISSION_FAILURE = "AZURE_BYOK_KEY_PERMISSION_FAILURE"
AZURE_EPHEMERAL_DISK_FAILURE = "AZURE_EPHEMERAL_DISK_FAILURE"
AZURE_INVALID_DEPLOYMENT_TEMPLATE = "AZURE_INVALID_DEPLOYMENT_TEMPLATE"
AZURE_OPERATION_NOT_ALLOWED_EXCEPTION = "AZURE_OPERATION_NOT_ALLOWED_EXCEPTION"
AZURE_QUOTA_EXCEEDED_EXCEPTION = "AZURE_QUOTA_EXCEEDED_EXCEPTION"
AZURE_RESOURCE_MANAGER_THROTTLING = "AZURE_RESOURCE_MANAGER_THROTTLING"
AZURE_RESOURCE_PROVIDER_THROTTLING = "AZURE_RESOURCE_PROVIDER_THROTTLING"
AZURE_UNEXPECTED_DEPLOYMENT_TEMPLATE_FAILURE = "AZURE_UNEXPECTED_DEPLOYMENT_TEMPLATE_FAILURE"
AZURE_VM_EXTENSION_FAILURE = "AZURE_VM_EXTENSION_FAILURE"
AZURE_VNET_CONFIGURATION_FAILURE = "AZURE_VNET_CONFIGURATION_FAILURE"
BOOTSTRAP_TIMEOUT = "BOOTSTRAP_TIMEOUT"
BOOTSTRAP_TIMEOUT_CLOUD_PROVIDER_EXCEPTION = "BOOTSTRAP_TIMEOUT_CLOUD_PROVIDER_EXCEPTION"
CLOUD_PROVIDER_DISK_SETUP_FAILURE = "CLOUD_PROVIDER_DISK_SETUP_FAILURE"
CLOUD_PROVIDER_LAUNCH_FAILURE = "CLOUD_PROVIDER_LAUNCH_FAILURE"
CLOUD_PROVIDER_RESOURCE_STOCKOUT = "CLOUD_PROVIDER_RESOURCE_STOCKOUT"
CLOUD_PROVIDER_SHUTDOWN = "CLOUD_PROVIDER_SHUTDOWN"
COMMUNICATION_LOST = "COMMUNICATION_LOST"
CONTAINER_LAUNCH_FAILURE = "CONTAINER_LAUNCH_FAILURE"
CONTROL_PLANE_REQUEST_FAILURE = "CONTROL_PLANE_REQUEST_FAILURE"
DATABASE_CONNECTION_FAILURE = "DATABASE_CONNECTION_FAILURE"
DBFS_COMPONENT_UNHEALTHY = "DBFS_COMPONENT_UNHEALTHY"
DOCKER_IMAGE_PULL_FAILURE = "DOCKER_IMAGE_PULL_FAILURE"
DRIVER_UNREACHABLE = "DRIVER_UNREACHABLE"
DRIVER_UNRESPONSIVE = "DRIVER_UNRESPONSIVE"
EXECUTION_COMPONENT_UNHEALTHY = "EXECUTION_COMPONENT_UNHEALTHY"
GCP_QUOTA_EXCEEDED = "GCP_QUOTA_EXCEEDED"
GCP_SERVICE_ACCOUNT_DELETED = "GCP_SERVICE_ACCOUNT_DELETED"
GLOBAL_INIT_SCRIPT_FAILURE = "GLOBAL_INIT_SCRIPT_FAILURE"
HIVE_METASTORE_PROVISIONING_FAILURE = "HIVE_METASTORE_PROVISIONING_FAILURE"
IMAGE_PULL_PERMISSION_DENIED = "IMAGE_PULL_PERMISSION_DENIED"
INACTIVITY = "INACTIVITY"
INIT_SCRIPT_FAILURE = "INIT_SCRIPT_FAILURE"
INSTANCE_POOL_CLUSTER_FAILURE = "INSTANCE_POOL_CLUSTER_FAILURE"
INSTANCE_UNREACHABLE = "INSTANCE_UNREACHABLE"
INTERNAL_ERROR = "INTERNAL_ERROR"
INVALID_ARGUMENT = "INVALID_ARGUMENT"
INVALID_SPARK_IMAGE = "INVALID_SPARK_IMAGE"
IP_EXHAUSTION_FAILURE = "IP_EXHAUSTION_FAILURE"
JOB_FINISHED = "JOB_FINISHED"
K8S_AUTOSCALING_FAILURE = "K8S_AUTOSCALING_FAILURE"
K8S_DBR_CLUSTER_LAUNCH_TIMEOUT = "K8S_DBR_CLUSTER_LAUNCH_TIMEOUT"
METASTORE_COMPONENT_UNHEALTHY = "METASTORE_COMPONENT_UNHEALTHY"
NEPHOS_RESOURCE_MANAGEMENT = "NEPHOS_RESOURCE_MANAGEMENT"
NETWORK_CONFIGURATION_FAILURE = "NETWORK_CONFIGURATION_FAILURE"
NFS_MOUNT_FAILURE = "NFS_MOUNT_FAILURE"
NPIP_TUNNEL_SETUP_FAILURE = "NPIP_TUNNEL_SETUP_FAILURE"
NPIP_TUNNEL_TOKEN_FAILURE = "NPIP_TUNNEL_TOKEN_FAILURE"
REQUEST_REJECTED = "REQUEST_REJECTED"
REQUEST_THROTTLED = "REQUEST_THROTTLED"
SECRET_RESOLUTION_ERROR = "SECRET_RESOLUTION_ERROR"
SECURITY_DAEMON_REGISTRATION_EXCEPTION = "SECURITY_DAEMON_REGISTRATION_EXCEPTION"
SELF_BOOTSTRAP_FAILURE = "SELF_BOOTSTRAP_FAILURE"
SKIPPED_SLOW_NODES = "SKIPPED_SLOW_NODES"
SLOW_IMAGE_DOWNLOAD = "SLOW_IMAGE_DOWNLOAD"
SPARK_ERROR = "SPARK_ERROR"
SPARK_IMAGE_DOWNLOAD_FAILURE = "SPARK_IMAGE_DOWNLOAD_FAILURE"
SPARK_STARTUP_FAILURE = "SPARK_STARTUP_FAILURE"
SPOT_INSTANCE_TERMINATION = "SPOT_INSTANCE_TERMINATION"
STORAGE_DOWNLOAD_FAILURE = "STORAGE_DOWNLOAD_FAILURE"
STS_CLIENT_SETUP_FAILURE = "STS_CLIENT_SETUP_FAILURE"
SUBNET_EXHAUSTED_FAILURE = "SUBNET_EXHAUSTED_FAILURE"
TEMPORARILY_UNAVAILABLE = "TEMPORARILY_UNAVAILABLE"
TRIAL_EXPIRED = "TRIAL_EXPIRED"
UNEXPECTED_LAUNCH_FAILURE = "UNEXPECTED_LAUNCH_FAILURE"
UNKNOWN = "UNKNOWN"
UNSUPPORTED_INSTANCE_TYPE = "UNSUPPORTED_INSTANCE_TYPE"
UPDATE_INSTANCE_PROFILE_FAILURE = "UPDATE_INSTANCE_PROFILE_FAILURE"
USER_REQUEST = "USER_REQUEST"
WORKER_SETUP_FAILURE = "WORKER_SETUP_FAILURE"
WORKSPACE_CANCELLED_ERROR = "WORKSPACE_CANCELLED_ERROR"
WORKSPACE_CONFIGURATION_ERROR = "WORKSPACE_CONFIGURATION_ERROR"
class databricks.sdk.service.compute.TerminationReasonType

type of the termination

CLIENT_ERROR = "CLIENT_ERROR"
CLOUD_FAILURE = "CLOUD_FAILURE"
SERVICE_FAULT = "SERVICE_FAULT"
SUCCESS = "SUCCESS"
class databricks.sdk.service.compute.UninstallLibraries
cluster_id: str

Unique identifier for the cluster on which to uninstall these libraries.

libraries: List[Library]

The libraries to uninstall.

as_dict() dict

Serializes the UninstallLibraries into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) UninstallLibraries

Deserializes the UninstallLibraries from a dictionary.
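
A minimal sketch of an uninstall request for a placeholder cluster; uninstalled libraries are typically removed the next time the cluster restarts.

    from databricks.sdk.service import compute

    # Uninstall a PyPI library from a cluster (the cluster ID is a placeholder).
    request = compute.UninstallLibraries(
        cluster_id="0123-456789-abcde123",
        libraries=[compute.Library(pypi=compute.PythonPyPiLibrary(package="simplejson"))],
    )
    print(request.as_dict())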

class databricks.sdk.service.compute.UninstallLibrariesResponse
as_dict() dict

Serializes the UninstallLibrariesResponse into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) UninstallLibrariesResponse

Deserializes the UninstallLibrariesResponse from a dictionary.

class databricks.sdk.service.compute.UnpinCluster
cluster_id: str

The cluster to be unpinned.

as_dict() dict

Serializes the UnpinCluster into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) UnpinCluster

Deserializes the UnpinCluster from a dictionary.

class databricks.sdk.service.compute.UnpinClusterResponse
as_dict() dict

Serializes the UnpinClusterResponse into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) UnpinClusterResponse

Deserializes the UnpinClusterResponse from a dictionary.

class databricks.sdk.service.compute.UpdateResponse
as_dict() dict

Serializes the UpdateResponse into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) UpdateResponse

Deserializes the UpdateResponse from a dictionary.

class databricks.sdk.service.compute.VolumesStorageInfo
destination: str

Unity Catalog Volumes file destination, e.g. /Volumes/my-init.sh

as_dict() dict

Serializes the VolumesStorageInfo into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) VolumesStorageInfo

Deserializes the VolumesStorageInfo from a dictionary.

class databricks.sdk.service.compute.WorkloadType
clients: ClientsTypes

Defines what type of clients can use the cluster, e.g. Notebooks, Jobs.

as_dict() dict

Serializes the WorkloadType into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) WorkloadType

Deserializes the WorkloadType from a dictionary.
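
For example, a workload type that restricts a cluster to automated jobs could be built as below; ClientsTypes is documented elsewhere in this module, and its jobs/notebooks flags are assumed here.

    from databricks.sdk.service import compute

    # Allow only jobs workloads; the jobs/notebooks flags are assumed from the
    # ClientsTypes class documented elsewhere in this module.
    workload = compute.WorkloadType(
        clients=compute.ClientsTypes(jobs=True, notebooks=False)
    )
    print(workload.as_dict())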

class databricks.sdk.service.compute.WorkspaceStorageInfo
destination: str

Workspace files destination, e.g. /Users/user1@databricks.com/my-init.sh

as_dict() dict

Serializes the WorkspaceStorageInfo into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) WorkspaceStorageInfo

Deserializes the WorkspaceStorageInfo from a dictionary.