Spark Declarative Pipelines¶
These dataclasses are used in the SDK to represent API requests and responses for services in the databricks.sdk.service.pipelines module.
- class databricks.sdk.service.pipelines.ApplyEnvironmentRequestResponse¶
- as_dict() dict¶
Serializes the ApplyEnvironmentRequestResponse into a dictionary suitable for use as a JSON request body.
- as_shallow_dict() dict¶
Serializes the ApplyEnvironmentRequestResponse into a shallow dictionary of its immediate attributes.
- classmethod from_dict(d: Dict[str, Any]) ApplyEnvironmentRequestResponse¶
Deserializes the ApplyEnvironmentRequestResponse from a dictionary.
- class databricks.sdk.service.pipelines.AutoFullRefreshPolicy(enabled: bool, min_interval_hours: int | None = None)¶
Policy for auto full refresh.
- enabled: bool¶
(Required, Mutable) Whether to enable auto full refresh or not.
- min_interval_hours: int | None = None¶
(Optional, Mutable) Specify the minimum interval in hours between the timestamp at which a table was last full refreshed and the current timestamp for triggering auto full refresh. If unspecified and autoFullRefresh is enabled, min_interval_hours defaults to 24 hours.
- as_dict() dict¶
Serializes the AutoFullRefreshPolicy into a dictionary suitable for use as a JSON request body.
- as_shallow_dict() dict¶
Serializes the AutoFullRefreshPolicy into a shallow dictionary of its immediate attributes.
- classmethod from_dict(d: Dict[str, Any]) AutoFullRefreshPolicy¶
Deserializes the AutoFullRefreshPolicy from a dictionary.
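The interval rule described for min_interval_hours can be sketched in plain Python as follows. This is only an illustration of the documented semantics, not the service's implementation: the function name auto_full_refresh_due, its arguments, and the choice to treat the boundary as inclusive are assumptions.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Documented default when min_interval_hours is not specified.
DEFAULT_MIN_INTERVAL_HOURS = 24


def auto_full_refresh_due(
    enabled: bool,
    last_full_refresh: datetime,
    now: datetime,
    min_interval_hours: Optional[int] = None,
) -> bool:
    """Return True when the minimum interval since the last full refresh has elapsed.

    Hypothetical helper: the inclusive (>=) boundary is an assumption.
    """
    if not enabled:
        return False
    hours = DEFAULT_MIN_INTERVAL_HOURS if min_interval_hours is None else min_interval_hours
    return now - last_full_refresh >= timedelta(hours=hours)


last = datetime(2025, 1, 1, tzinfo=timezone.utc)
print(auto_full_refresh_due(True, last, last + timedelta(hours=25)))
print(auto_full_refresh_due(True, last, last + timedelta(hours=25), min_interval_hours=48))
```

With the default of 24 hours, the first call reports that a refresh is due after 25 hours; raising the minimum interval to 48 hours suppresses it.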
- class databricks.sdk.service.pipelines.CloneMode¶
Enum to specify which mode of clone to execute
- MIGRATE_TO_UC = "MIGRATE_TO_UC"¶
- class databricks.sdk.service.pipelines.ClonePipelineResponse(pipeline_id: 'Optional[str]' = None)¶
- pipeline_id: str | None = None¶
The pipeline id of the cloned pipeline
- as_dict() dict¶
Serializes the ClonePipelineResponse into a dictionary suitable for use as a JSON request body.
- as_shallow_dict() dict¶
Serializes the ClonePipelineResponse into a shallow dictionary of its immediate attributes.
- classmethod from_dict(d: Dict[str, Any]) ClonePipelineResponse¶
Deserializes the ClonePipelineResponse from a dictionary.
- class databricks.sdk.service.pipelines.ConfluenceConnectorOptions(include_confluence_spaces: List[str] | None = None)¶
Confluence specific options for ingestion
- include_confluence_spaces: List[str] | None = None¶
(Optional) Spaces to filter Confluence data on
- as_dict() dict¶
Serializes the ConfluenceConnectorOptions into a dictionary suitable for use as a JSON request body.
- as_shallow_dict() dict¶
Serializes the ConfluenceConnectorOptions into a shallow dictionary of its immediate attributes.
- classmethod from_dict(d: Dict[str, Any]) ConfluenceConnectorOptions¶
Deserializes the ConfluenceConnectorOptions from a dictionary.
- class databricks.sdk.service.pipelines.ConnectionParameters(source_catalog: 'Optional[str]' = None)¶
- source_catalog: str | None = None¶
Source catalog for initial connection. This is necessary for schema exploration in some database systems like Oracle, and optional but nice-to-have in some other database systems like Postgres. For Oracle databases, this maps to a service name.
- as_dict() dict¶
Serializes the ConnectionParameters into a dictionary suitable for use as a JSON request body.
- as_shallow_dict() dict¶
Serializes the ConnectionParameters into a shallow dictionary of its immediate attributes.
- classmethod from_dict(d: Dict[str, Any]) ConnectionParameters¶
Deserializes the ConnectionParameters from a dictionary.
- class databricks.sdk.service.pipelines.ConnectorOptions(confluence_options: ConfluenceConnectorOptions | None = None, gdrive_options: GoogleDriveOptions | None = None, google_ads_options: GoogleAdsOptions | None = None, jira_options: JiraConnectorOptions | None = None, kafka_options: KafkaOptions | None = None, meta_ads_options: MetaMarketingOptions | None = None, outlook_options: OutlookOptions | None = None, sharepoint_options: SharepointOptions | None = None, smartsheet_options: SmartsheetOptions | None = None, tiktok_ads_options: TikTokAdsOptions | None = None, zendesk_support_options: ZendeskSupportOptions | None = None)¶
Wrapper message for source-specific options to support multiple connector types
- confluence_options: ConfluenceConnectorOptions | None = None¶
- gdrive_options: GoogleDriveOptions | None = None¶
- google_ads_options: GoogleAdsOptions | None = None¶
- jira_options: JiraConnectorOptions | None = None¶
- kafka_options: KafkaOptions | None = None¶
- meta_ads_options: MetaMarketingOptions | None = None¶
- outlook_options: OutlookOptions | None = None¶
- sharepoint_options: SharepointOptions | None = None¶
- smartsheet_options: SmartsheetOptions | None = None¶
- tiktok_ads_options: TikTokAdsOptions | None = None¶
- zendesk_support_options: ZendeskSupportOptions | None = None¶
- as_dict() dict¶
Serializes the ConnectorOptions into a dictionary suitable for use as a JSON request body.
- as_shallow_dict() dict¶
Serializes the ConnectorOptions into a shallow dictionary of its immediate attributes.
- classmethod from_dict(d: Dict[str, Any]) ConnectorOptions¶
Deserializes the ConnectorOptions from a dictionary.
- class databricks.sdk.service.pipelines.ConnectorType¶
For certain database sources, LakeFlow Connect offers both query-based and CDC ingestion. ConnectorType can be used to convey the type of ingestion. If connection_name is provided for database sources, the default is Query Based ingestion.
- CDC = "CDC"¶
- QUERY_BASED = "QUERY_BASED"¶
- class databricks.sdk.service.pipelines.CreatePipelineResponse(effective_settings: 'Optional[PipelineSpec]' = None, pipeline_id: 'Optional[str]' = None)¶
- effective_settings: PipelineSpec | None = None¶
Only returned when dry_run is true.
- pipeline_id: str | None = None¶
The unique identifier for the newly created pipeline. Only returned when dry_run is false.
- as_dict() dict¶
Serializes the CreatePipelineResponse into a dictionary suitable for use as a JSON request body.
- as_shallow_dict() dict¶
Serializes the CreatePipelineResponse into a shallow dictionary of its immediate attributes.
- classmethod from_dict(d: Dict[str, Any]) CreatePipelineResponse¶
Deserializes the CreatePipelineResponse from a dictionary.
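Because pipeline_id and effective_settings are returned under opposite dry_run settings, code that consumes the response usually branches on which one is present. The sketch below works on a plain dictionary shaped like the response; the helper name summarize_create_response and the sample values are hypothetical, and the key names are assumed to match the attribute names above.

```python
from typing import Any, Dict


def summarize_create_response(response: Dict[str, Any]) -> str:
    """Describe a CreatePipelineResponse-shaped dictionary (hypothetical helper)."""
    # pipeline_id is only returned when dry_run is false.
    if response.get("pipeline_id") is not None:
        return f"created pipeline {response['pipeline_id']}"
    # effective_settings is only returned when dry_run is true.
    if response.get("effective_settings") is not None:
        name = response["effective_settings"].get("name", "<unnamed>")
        return f"dry run only, effective settings for {name}"
    return "empty response"


print(summarize_create_response({"pipeline_id": "abc-123"}))
print(summarize_create_response({"effective_settings": {"name": "demo"}}))
```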
- class databricks.sdk.service.pipelines.CronTrigger(quartz_cron_schedule: 'Optional[str]' = None, timezone_id: 'Optional[str]' = None)¶
- quartz_cron_schedule: str | None = None¶
- timezone_id: str | None = None¶
- as_dict() dict¶
Serializes the CronTrigger into a dictionary suitable for use as a JSON request body.
- as_shallow_dict() dict¶
Serializes the CronTrigger into a shallow dictionary of its immediate attributes.
- classmethod from_dict(d: Dict[str, Any]) CronTrigger¶
Deserializes the CronTrigger from a dictionary.
- class databricks.sdk.service.pipelines.DataPlaneId(instance: 'Optional[str]' = None, seq_no: 'Optional[int]' = None)¶
- instance: str | None = None¶
The instance name of the data plane emitting an event.
- seq_no: int | None = None¶
A sequence number, unique and increasing within the data plane instance.
- as_dict() dict¶
Serializes the DataPlaneId into a dictionary suitable for use as a JSON request body.
- as_shallow_dict() dict¶
Serializes the DataPlaneId into a shallow dictionary of its immediate attributes.
- classmethod from_dict(d: Dict[str, Any]) DataPlaneId¶
Deserializes the DataPlaneId from a dictionary.
- class databricks.sdk.service.pipelines.DataStagingOptions(catalog_name: str, schema_name: str, volume_name: str | None = None)¶
Location of staged data storage
- catalog_name: str¶
(Required, Immutable) The name of the catalog for the connector’s staging storage location.
- schema_name: str¶
(Required, Immutable) The name of the schema for the connector’s staging storage location.
- volume_name: str | None = None¶
(Optional) The Unity Catalog-compatible name for the storage location. This is the volume to use for the data that is extracted by the connector. The Spark Declarative Pipelines system will automatically create the volume under the catalog and schema. For Combined Cdc Managed Ingestion pipelines, the default name for the volume is __databricks_ingestion_gateway_staging_data-$pipelineId.
- as_dict() dict¶
Serializes the DataStagingOptions into a dictionary suitable for use as a JSON request body.
- as_shallow_dict() dict¶
Serializes the DataStagingOptions into a shallow dictionary of its immediate attributes.
- classmethod from_dict(d: Dict[str, Any]) DataStagingOptions¶
Deserializes the DataStagingOptions from a dictionary.
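The default volume name for Combined Cdc Managed Ingestion pipelines can be illustrated with a small helper. This is a sketch only: the function staging_volume_path is hypothetical, the pipeline ID "1234" is made up, and joining the three parts with dots follows the usual Unity Catalog three-level naming rather than anything stated above.

```python
from typing import Optional

# Prefix taken from the documented default volume name.
DEFAULT_VOLUME_PREFIX = "__databricks_ingestion_gateway_staging_data-"


def staging_volume_path(
    catalog_name: str,
    schema_name: str,
    pipeline_id: str,
    volume_name: Optional[str] = None,
) -> str:
    """Return catalog.schema.volume, falling back to the documented default volume name."""
    volume = volume_name or f"{DEFAULT_VOLUME_PREFIX}{pipeline_id}"
    return f"{catalog_name}.{schema_name}.{volume}"


print(staging_volume_path("main", "ingest", "1234"))
print(staging_volume_path("main", "ingest", "1234", volume_name="my_volume"))
```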
- class databricks.sdk.service.pipelines.DayOfWeek¶
Days of week in which the window is allowed to happen. If not specified all days of the week will be used.
- FRIDAY = "FRIDAY"¶
- MONDAY = "MONDAY"¶
- SATURDAY = "SATURDAY"¶
- SUNDAY = "SUNDAY"¶
- THURSDAY = "THURSDAY"¶
- TUESDAY = "TUESDAY"¶
- WEDNESDAY = "WEDNESDAY"¶
- class databricks.sdk.service.pipelines.DeletePipelineResponse¶
- as_dict() dict¶
Serializes the DeletePipelineResponse into a dictionary suitable for use as a JSON request body.
- as_shallow_dict() dict¶
Serializes the DeletePipelineResponse into a shallow dictionary of its immediate attributes.
- classmethod from_dict(d: Dict[str, Any]) DeletePipelineResponse¶
Deserializes the DeletePipelineResponse from a dictionary.
- class databricks.sdk.service.pipelines.DeploymentKind¶
The deployment method that manages the pipeline. BUNDLE: the pipeline is managed by a Databricks Asset Bundle.
- BUNDLE = "BUNDLE"¶
- class databricks.sdk.service.pipelines.EditPipelineResponse¶
- as_dict() dict¶
Serializes the EditPipelineResponse into a dictionary suitable for use as a JSON request body.
- as_shallow_dict() dict¶
Serializes the EditPipelineResponse into a shallow dictionary of its immediate attributes.
- classmethod from_dict(d: Dict[str, Any]) EditPipelineResponse¶
Deserializes the EditPipelineResponse from a dictionary.
- class databricks.sdk.service.pipelines.ErrorDetail(exceptions: 'Optional[List[SerializedException]]' = None, fatal: 'Optional[bool]' = None)¶
- exceptions: List[SerializedException] | None = None¶
The exception thrown for this error, with its chain of causes.
- fatal: bool | None = None¶
Whether this error is considered fatal, that is, unrecoverable.
- as_dict() dict¶
Serializes the ErrorDetail into a dictionary suitable for use as a JSON request body.
- as_shallow_dict() dict¶
Serializes the ErrorDetail into a shallow dictionary of its immediate attributes.
- classmethod from_dict(d: Dict[str, Any]) ErrorDetail¶
Deserializes the ErrorDetail from a dictionary.
- class databricks.sdk.service.pipelines.EventLevel¶
The severity level of the event.
- ERROR = "ERROR"¶
- INFO = "INFO"¶
- METRICS = "METRICS"¶
- WARN = "WARN"¶
- class databricks.sdk.service.pipelines.EventLogSpec(catalog: str | None = None, name: str | None = None, schema: str | None = None)¶
Configurable event log parameters.
- catalog: str | None = None¶
The UC catalog the event log is published under.
- name: str | None = None¶
The name the event log is published to in UC.
- schema: str | None = None¶
The UC schema the event log is published under.
- as_dict() dict¶
Serializes the EventLogSpec into a dictionary suitable for use as a JSON request body.
- as_shallow_dict() dict¶
Serializes the EventLogSpec into a shallow dictionary of its immediate attributes.
- classmethod from_dict(d: Dict[str, Any]) EventLogSpec¶
Deserializes the EventLogSpec from a dictionary.
- class databricks.sdk.service.pipelines.FileFilter(modified_after: 'Optional[str]' = None, modified_before: 'Optional[str]' = None, path_filter: 'Optional[str]' = None)¶
- modified_after: str | None = None¶
Include files with modification times occurring after the specified time. Timestamp format: YYYY-MM-DDTHH:mm:ss (e.g. 2020-06-01T13:00:00). Based on https://spark.apache.org/docs/latest/sql-data-sources-generic-options.html#modification-time-path-filters
- modified_before: str | None = None¶
Include files with modification times occurring before the specified time. Timestamp format: YYYY-MM-DDTHH:mm:ss (e.g. 2020-06-01T13:00:00). Based on https://spark.apache.org/docs/latest/sql-data-sources-generic-options.html#modification-time-path-filters
- path_filter: str | None = None¶
Include files with file names matching the pattern. Based on https://spark.apache.org/docs/latest/sql-data-sources-generic-options.html#path-glob-filter
- as_dict() dict¶
Serializes the FileFilter into a dictionary suitable for use as a JSON request body.
- as_shallow_dict() dict¶
Serializes the FileFilter into a shallow dictionary of its immediate attributes.
- classmethod from_dict(d: Dict[str, Any]) FileFilter¶
Deserializes the FileFilter from a dictionary.
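How the three FileFilter fields combine can be sketched locally with the standard library. This is an approximation for illustration: the function file_matches is hypothetical, fnmatchcase stands in for the glob matching used by Spark (whose pattern syntax is richer), and treating the time bounds as strict is an assumption based on the words "after" and "before".

```python
from datetime import datetime
from fnmatch import fnmatchcase
from posixpath import basename
from typing import Optional

# Python equivalent of the documented YYYY-MM-DDTHH:mm:ss format.
TIMESTAMP_FORMAT = "%Y-%m-%dT%H:%M:%S"


def file_matches(
    path: str,
    modified: datetime,
    modified_after: Optional[str] = None,
    modified_before: Optional[str] = None,
    path_filter: Optional[str] = None,
) -> bool:
    """Return True when a file passes every filter that is set."""
    if modified_after is not None:
        if not modified > datetime.strptime(modified_after, TIMESTAMP_FORMAT):
            return False
    if modified_before is not None:
        if not modified < datetime.strptime(modified_before, TIMESTAMP_FORMAT):
            return False
    # The glob is applied to the file name, not the full path.
    if path_filter is not None and not fnmatchcase(basename(path), path_filter):
        return False
    return True


mtime = datetime(2020, 6, 1, 14, 0, 0)
print(file_matches("/data/a.csv", mtime, modified_after="2020-06-01T13:00:00", path_filter="*.csv"))
print(file_matches("/data/a.csv", mtime, modified_before="2020-06-01T13:00:00"))
```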
- class databricks.sdk.service.pipelines.FileIngestionOptions(corrupt_record_column: 'Optional[str]' = None, file_filters: 'Optional[List[FileFilter]]' = None, format: 'Optional[FileIngestionOptionsFileFormat]' = None, format_options: 'Optional[Dict[str, str]]' = None, ignore_corrupt_files: 'Optional[bool]' = None, infer_column_types: 'Optional[bool]' = None, reader_case_sensitive: 'Optional[bool]' = None, rescued_data_column: 'Optional[str]' = None, schema_evolution_mode: 'Optional[FileIngestionOptionsSchemaEvolutionMode]' = None, schema_hints: 'Optional[str]' = None, single_variant_column: 'Optional[str]' = None)¶
- corrupt_record_column: str | None = None¶
- file_filters: List[FileFilter] | None = None¶
Generic options
- format: FileIngestionOptionsFileFormat | None = None¶
Required for TableSpec.
- format_options: Dict[str, str] | None = None¶
Format-specific options. Based on https://docs.databricks.com/aws/en/ingestion/cloud-object-storage/auto-loader/options#file-format-options
- ignore_corrupt_files: bool | None = None¶
- infer_column_types: bool | None = None¶
- reader_case_sensitive: bool | None = None¶
Column name case sensitivity. See https://docs.databricks.com/aws/en/ingestion/cloud-object-storage/auto-loader/schema#change-case-sensitive-behavior
- rescued_data_column: str | None = None¶
- schema_evolution_mode: FileIngestionOptionsSchemaEvolutionMode | None = None¶
- schema_hints: str | None = None¶
Override the inferred schema of specific columns. Based on https://docs.databricks.com/aws/en/ingestion/cloud-object-storage/auto-loader/schema#override-schema-inference-with-schema-hints
- single_variant_column: str | None = None¶
- as_dict() dict¶
Serializes the FileIngestionOptions into a dictionary suitable for use as a JSON request body.
- as_shallow_dict() dict¶
Serializes the FileIngestionOptions into a shallow dictionary of its immediate attributes.
- classmethod from_dict(d: Dict[str, Any]) FileIngestionOptions¶
Deserializes the FileIngestionOptions from a dictionary.
- class databricks.sdk.service.pipelines.FileIngestionOptionsFileFormat¶
- AVRO = "AVRO"¶
- BINARYFILE = "BINARYFILE"¶
- CSV = "CSV"¶
- EXCEL = "EXCEL"¶
- JSON = "JSON"¶
- ORC = "ORC"¶
- PARQUET = "PARQUET"¶
- XML = "XML"¶
- class databricks.sdk.service.pipelines.FileIngestionOptionsSchemaEvolutionMode¶
- ADD_NEW_COLUMNS = "ADD_NEW_COLUMNS"¶
- ADD_NEW_COLUMNS_WITH_TYPE_WIDENING = "ADD_NEW_COLUMNS_WITH_TYPE_WIDENING"¶
- FAIL_ON_NEW_COLUMNS = "FAIL_ON_NEW_COLUMNS"¶
- NONE = "NONE"¶
- RESCUE = "RESCUE"¶
- class databricks.sdk.service.pipelines.FileLibrary(path: 'Optional[str]' = None)¶
- path: str | None = None¶
The absolute path of the source code.
- as_dict() dict¶
Serializes the FileLibrary into a dictionary suitable for use as a JSON request body.
- as_shallow_dict() dict¶
Serializes the FileLibrary into a shallow dictionary of its immediate attributes.
- classmethod from_dict(d: Dict[str, Any]) FileLibrary¶
Deserializes the FileLibrary from a dictionary.
- class databricks.sdk.service.pipelines.Filters(exclude: 'Optional[List[str]]' = None, include: 'Optional[List[str]]' = None)¶
- exclude: List[str] | None = None¶
Paths to exclude.
- include: List[str] | None = None¶
Paths to include.
- as_dict() dict¶
Serializes the Filters into a dictionary suitable for use as a JSON request body.
- as_shallow_dict() dict¶
Serializes the Filters into a shallow dictionary of its immediate attributes.
- classmethod from_dict(d: Dict[str, Any]) Filters¶
Deserializes the Filters from a dictionary.
- class databricks.sdk.service.pipelines.GetPipelinePermissionLevelsResponse(permission_levels: 'Optional[List[PipelinePermissionsDescription]]' = None)¶
- permission_levels: List[PipelinePermissionsDescription] | None = None¶
Specific permission levels
- as_dict() dict¶
Serializes the GetPipelinePermissionLevelsResponse into a dictionary suitable for use as a JSON request body.
- as_shallow_dict() dict¶
Serializes the GetPipelinePermissionLevelsResponse into a shallow dictionary of its immediate attributes.
- classmethod from_dict(d: Dict[str, Any]) GetPipelinePermissionLevelsResponse¶
Deserializes the GetPipelinePermissionLevelsResponse from a dictionary.
- class databricks.sdk.service.pipelines.GetPipelineResponse(cause: 'Optional[str]' = None, cluster_id: 'Optional[str]' = None, creator_user_name: 'Optional[str]' = None, effective_budget_policy_id: 'Optional[str]' = None, effective_publishing_mode: 'Optional[PublishingMode]' = None, health: 'Optional[GetPipelineResponseHealth]' = None, last_modified: 'Optional[int]' = None, latest_updates: 'Optional[List[UpdateStateInfo]]' = None, name: 'Optional[str]' = None, parameters: 'Optional[Dict[str, str]]' = None, pipeline_id: 'Optional[str]' = None, run_as: 'Optional[RunAs]' = None, run_as_user_name: 'Optional[str]' = None, spec: 'Optional[PipelineSpec]' = None, state: 'Optional[PipelineState]' = None)¶
- cause: str | None = None¶
An optional message detailing the cause of the pipeline state.
- cluster_id: str | None = None¶
The ID of the cluster that the pipeline is running on.
- creator_user_name: str | None = None¶
The username of the pipeline creator.
- effective_budget_policy_id: str | None = None¶
Serverless budget policy ID of this pipeline.
- effective_publishing_mode: PublishingMode | None = None¶
Publishing mode of the pipeline
- health: GetPipelineResponseHealth | None = None¶
The health of a pipeline.
- last_modified: int | None = None¶
The last time the pipeline settings were modified or created.
- latest_updates: List[UpdateStateInfo] | None = None¶
Status of the latest updates for the pipeline. Ordered with the newest update first.
- name: str | None = None¶
A human friendly identifier for the pipeline, taken from the spec.
- parameters: Dict[str, str] | None = None¶
Key/value map of default parameters to use for pipeline execution. Maximum total size: 10k characters (JSON format)
- pipeline_id: str | None = None¶
The ID of the pipeline.
- run_as: RunAs | None = None¶
The user or service principal that the pipeline runs as, if specified in the request. This field indicates the explicit configuration of run_as for the pipeline. To find the value in all cases, explicit or implicit, use run_as_user_name.
- run_as_user_name: str | None = None¶
Username of the user that the pipeline will run on behalf of.
- spec: PipelineSpec | None = None¶
The pipeline specification. This field is not returned when called by ListPipelines.
- state: PipelineState | None = None¶
The pipeline state.
- as_dict() dict¶
Serializes the GetPipelineResponse into a dictionary suitable for use as a JSON request body.
- as_shallow_dict() dict¶
Serializes the GetPipelineResponse into a shallow dictionary of its immediate attributes.
- classmethod from_dict(d: Dict[str, Any]) GetPipelineResponse¶
Deserializes the GetPipelineResponse from a dictionary.
- class databricks.sdk.service.pipelines.GetPipelineResponseHealth¶
The health of a pipeline.
- HEALTHY = "HEALTHY"¶
- UNHEALTHY = "UNHEALTHY"¶
- class databricks.sdk.service.pipelines.GetUpdateResponse(update: 'Optional[UpdateInfo]' = None)¶
- update: UpdateInfo | None = None¶
The current update info.
- as_dict() dict¶
Serializes the GetUpdateResponse into a dictionary suitable for use as a JSON request body.
- as_shallow_dict() dict¶
Serializes the GetUpdateResponse into a shallow dictionary of its immediate attributes.
- classmethod from_dict(d: Dict[str, Any]) GetUpdateResponse¶
Deserializes the GetUpdateResponse from a dictionary.
- class databricks.sdk.service.pipelines.GoogleAdsConfig(manager_account_id: 'Optional[str]' = None)¶
- manager_account_id: str | None = None¶
(Required) Manager Account ID (also called MCC Account ID) used to list and access customer accounts under this manager account. This is required for fetching the list of customer accounts during source selection. If the same field is also set in the object-level GoogleAdsOptions (connector_options), the object-level value takes precedence over this top-level config.
- as_dict() dict¶
Serializes the GoogleAdsConfig into a dictionary suitable for use as a JSON request body.
- as_shallow_dict() dict¶
Serializes the GoogleAdsConfig into a shallow dictionary of its immediate attributes.
- classmethod from_dict(d: Dict[str, Any]) GoogleAdsConfig¶
Deserializes the GoogleAdsConfig from a dictionary.
- class databricks.sdk.service.pipelines.GoogleAdsOptions(manager_account_id: str, lookback_window_days: int | None = None, sync_start_date: str | None = None)¶
Google Ads specific options for ingestion (object-level). When set, these values override the corresponding fields in GoogleAdsConfig (source_configurations).
- manager_account_id: str¶
(Optional at this level) Manager Account ID (also called MCC Account ID) used to list and access customer accounts under this manager account. Overrides GoogleAdsConfig.manager_account_id from source_configurations when set.
- lookback_window_days: int | None = None¶
(Optional) Number of days to look back for report tables to capture late-arriving data. If not specified, defaults to 30 days.
- sync_start_date: str | None = None¶
(Optional) Start date for the initial sync of report tables in YYYY-MM-DD format. This determines the earliest date from which to sync historical data. If not specified, defaults to 2 years of historical data.
- as_dict() dict¶
Serializes the GoogleAdsOptions into a dictionary suitable for use as a JSON request body.
- as_shallow_dict() dict¶
Serializes the GoogleAdsOptions into a shallow dictionary of its immediate attributes.
- classmethod from_dict(d: Dict[str, Any]) GoogleAdsOptions¶
Deserializes the GoogleAdsOptions from a dictionary.
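The precedence between the top-level GoogleAdsConfig and the object-level GoogleAdsOptions, together with the documented 30-day lookback default, can be sketched on plain dictionaries. The helper effective_google_ads_settings and the account IDs are hypothetical; the real resolution happens on the service side.

```python
from typing import Any, Dict, Optional

# Documented default when lookback_window_days is not specified.
DEFAULT_LOOKBACK_WINDOW_DAYS = 30


def effective_google_ads_settings(
    source_config: Dict[str, Any],
    object_options: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Merge top-level config with object-level options; object-level values win."""
    object_options = object_options or {}
    manager_account_id = (
        object_options.get("manager_account_id")
        or source_config.get("manager_account_id")
    )
    lookback = object_options.get("lookback_window_days")
    return {
        "manager_account_id": manager_account_id,
        "lookback_window_days": DEFAULT_LOOKBACK_WINDOW_DAYS if lookback is None else lookback,
    }


top_level = {"manager_account_id": "111-111-1111"}
print(effective_google_ads_settings(top_level))
print(effective_google_ads_settings(top_level, {"manager_account_id": "222-222-2222", "lookback_window_days": 7}))
```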
- class databricks.sdk.service.pipelines.GoogleDriveOptions(entity_type: 'Optional[GoogleDriveOptionsGoogleDriveEntityType]' = None, file_ingestion_options: 'Optional[FileIngestionOptions]' = None, url: 'Optional[str]' = None)¶
- entity_type: GoogleDriveOptionsGoogleDriveEntityType | None = None¶
- file_ingestion_options: FileIngestionOptions | None = None¶
- url: str | None = None¶
Google Drive URL.
- as_dict() dict¶
Serializes the GoogleDriveOptions into a dictionary suitable for use as a JSON request body.
- as_shallow_dict() dict¶
Serializes the GoogleDriveOptions into a shallow dictionary of its immediate attributes.
- classmethod from_dict(d: Dict[str, Any]) GoogleDriveOptions¶
Deserializes the GoogleDriveOptions from a dictionary.
- class databricks.sdk.service.pipelines.GoogleDriveOptionsGoogleDriveEntityType¶
- FILE = "FILE"¶
- FILE_METADATA = "FILE_METADATA"¶
- PERMISSION = "PERMISSION"¶
- class databricks.sdk.service.pipelines.IngestionConfig(report: 'Optional[ReportSpec]' = None, schema: 'Optional[SchemaSpec]' = None, table: 'Optional[TableSpec]' = None)¶
- report: ReportSpec | None = None¶
Select a specific source report.
- schema: SchemaSpec | None = None¶
Select all tables from a specific source schema.
- table: TableSpec | None = None¶
Select a specific source table.
- as_dict() dict¶
Serializes the IngestionConfig into a dictionary suitable for use as a JSON request body.
- as_shallow_dict() dict¶
Serializes the IngestionConfig into a shallow dictionary of its immediate attributes.
- classmethod from_dict(d: Dict[str, Any]) IngestionConfig¶
Deserializes the IngestionConfig from a dictionary.
- class databricks.sdk.service.pipelines.IngestionGatewayPipelineDefinition(connection_name: 'str', gateway_storage_catalog: 'str', gateway_storage_schema: 'str', connection_id: 'Optional[str]' = None, connection_parameters: 'Optional[ConnectionParameters]' = None, gateway_storage_name: 'Optional[str]' = None)¶
- connection_name: str¶
Immutable. The Unity Catalog connection that this gateway pipeline uses to communicate with the source.
- gateway_storage_catalog: str¶
Required, Immutable. The name of the catalog for the gateway pipeline’s storage location.
- gateway_storage_schema: str¶
Required, Immutable. The name of the schema for the gateway pipeline’s storage location.
- connection_id: str | None = None¶
[Deprecated, use connection_name instead] Immutable. The Unity Catalog connection that this gateway pipeline uses to communicate with the source.
- connection_parameters: ConnectionParameters | None = None¶
Optional, Internal. Parameters required to establish an initial connection with the source.
- gateway_storage_name: str | None = None¶
Optional. The Unity Catalog-compatible name for the gateway storage location. This is the destination to use for the data that is extracted by the gateway. Spark Declarative Pipelines system will automatically create the storage location under the catalog and schema.
- as_dict() dict¶
Serializes the IngestionGatewayPipelineDefinition into a dictionary suitable for use as a JSON request body.
- as_shallow_dict() dict¶
Serializes the IngestionGatewayPipelineDefinition into a shallow dictionary of its immediate attributes.
- classmethod from_dict(d: Dict[str, Any]) IngestionGatewayPipelineDefinition¶
Deserializes the IngestionGatewayPipelineDefinition from a dictionary.
- class databricks.sdk.service.pipelines.IngestionPipelineDefinition(connection_name: 'Optional[str]' = None, connector_type: 'Optional[ConnectorType]' = None, data_staging_options: 'Optional[DataStagingOptions]' = None, full_refresh_window: 'Optional[OperationTimeWindow]' = None, ingest_from_uc_foreign_catalog: 'Optional[bool]' = None, ingestion_gateway_id: 'Optional[str]' = None, netsuite_jar_path: 'Optional[str]' = None, objects: 'Optional[List[IngestionConfig]]' = None, source_configurations: 'Optional[List[SourceConfig]]' = None, source_type: 'Optional[IngestionSourceType]' = None, table_configuration: 'Optional[TableSpecificConfig]' = None)¶
- connection_name: str | None = None¶
The Unity Catalog connection that this ingestion pipeline uses to communicate with the source. This is used both with connectors for applications such as Salesforce and Workday and with database connectors such as Oracle (connector_type = QUERY_BASED or connector_type = CDC). If the connection name corresponds to a database connector such as Oracle and connector_type is not provided, connector_type defaults to QUERY_BASED. If connector_type is passed as CDC, a Combined Cdc Managed Ingestion pipeline is used. Under certain conditions, this can be replaced with ingestion_gateway_id to change the connector to Cdc Managed Ingestion Pipeline with Gateway pipeline.
- connector_type: ConnectorType | None = None¶
(Optional) Connector Type for sources. Ex: CDC, Query Based.
- data_staging_options: DataStagingOptions | None = None¶
(Optional) Location of staged data storage. This is required for migration from Cdc Managed Ingestion Pipeline with Gateway pipeline to Combined Cdc Managed Ingestion Pipeline. If not specified, the volume for staged data will be created in catalog and schema/target specified in the top level pipeline definition.
- full_refresh_window: OperationTimeWindow | None = None¶
(Optional) A window that specifies a set of time ranges for snapshot queries in CDC.
- ingest_from_uc_foreign_catalog: bool | None = None¶
Immutable. If set to true, the pipeline will ingest tables from the UC foreign catalogs directly without the need to specify a UC connection or ingestion gateway. The source_catalog fields in objects of IngestionConfig are interpreted as the UC foreign catalogs to ingest from.
- ingestion_gateway_id: str | None = None¶
Identifier for the gateway that is used by this ingestion pipeline to communicate with the source database. This is used with CDC connectors to databases like SQL Server using a gateway pipeline (connector_type = CDC). Under certain conditions, this can be replaced with connection_name to change the connector to Combined Cdc Managed Ingestion Pipeline.
- netsuite_jar_path: str | None = None¶
Netsuite-only configuration. When the field is set for a Netsuite connector, the jar stored in the field will be validated and added to the classpath of the pipeline’s cluster.
- objects: List[IngestionConfig] | None = None¶
Required. Settings specifying tables to replicate and the destination for the replicated tables.
- source_configurations: List[SourceConfig] | None = None¶
Top-level source configurations
- source_type: IngestionSourceType | None = None¶
The type of the foreign source. The source type will be inferred from the source connection or ingestion gateway. This field is output only and will be ignored if provided.
- table_configuration: TableSpecificConfig | None = None¶
Configuration settings to control the ingestion of tables. These settings are applied to all tables in the pipeline.
- as_dict() dict¶
Serializes the IngestionPipelineDefinition into a dictionary suitable for use as a JSON request body.
- as_shallow_dict() dict¶
Serializes the IngestionPipelineDefinition into a shallow dictionary of its immediate attributes.
- classmethod from_dict(d: Dict[str, Any]) IngestionPipelineDefinition¶
Deserializes the IngestionPipelineDefinition from a dictionary.
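For database sources, the interaction between connection_name, ingestion_gateway_id and connector_type described above can be summarized as a small decision function. This is a reading of the field descriptions, not service code: describe_ingestion_mode and the sample identifiers are hypothetical, and application connectors such as Salesforce are outside its scope.

```python
from typing import Optional


def describe_ingestion_mode(
    connection_name: Optional[str] = None,
    ingestion_gateway_id: Optional[str] = None,
    connector_type: Optional[str] = None,
) -> str:
    """Name the ingestion mode implied by the fields, for database sources only."""
    # A gateway ID selects CDC ingestion through a separate gateway pipeline.
    if ingestion_gateway_id is not None:
        return "Cdc Managed Ingestion Pipeline with Gateway pipeline"
    if connection_name is None:
        return "no connection or gateway configured"
    # connector_type defaults to QUERY_BASED when only connection_name is given.
    effective_type = connector_type or "QUERY_BASED"
    if effective_type == "CDC":
        return "Combined Cdc Managed Ingestion pipeline"
    return "Query Based ingestion"


print(describe_ingestion_mode(connection_name="oracle_conn"))
print(describe_ingestion_mode(connection_name="oracle_conn", connector_type="CDC"))
print(describe_ingestion_mode(ingestion_gateway_id="gw-1"))
```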
- class databricks.sdk.service.pipelines.IngestionPipelineDefinitionTableSpecificConfigQueryBasedConnectorConfig(cursor_columns: List[str] | None = None, deletion_condition: str | None = None, hard_deletion_sync_min_interval_in_seconds: int | None = None)¶
Configurations that are only applicable for query-based ingestion connectors.
- cursor_columns: List[str] | None = None¶
The names of the monotonically increasing columns in the source table that are used to enable the table to be read and ingested incrementally through structured streaming. The columns are allowed to have repeated values but have to be non-decreasing. If the source data is merged into the destination (e.g., using SCD Type 1 or Type 2), these columns will implicitly define the sequence_by behavior. You can still explicitly set sequence_by to override this default.
- deletion_condition: str | None = None¶
A SQL WHERE condition that identifies source rows that have been deleted. This is sometimes referred to as “soft-deletes”. For example: “Operation = ‘DELETE’” or “is_deleted = true”. This field is orthogonal to hard_deletion_sync_min_interval_in_seconds: one handles soft-deletes and the other handles “hard deletes”, where the source rows are physically removed from the table.
- hard_deletion_sync_min_interval_in_seconds: int | None = None¶
Specifies the minimum interval (in seconds) between snapshots on primary keys for detecting and synchronizing hard deletions—i.e., rows that have been physically removed from the source table. This interval acts as a lower bound. If ingestion runs less frequently than this value, hard deletion synchronization will align with the actual ingestion frequency instead of happening more often. If not set, hard deletion synchronization via snapshots is disabled. This field is mutable and can be updated without triggering a full snapshot.
- as_dict() dict¶
Serializes the IngestionPipelineDefinitionTableSpecificConfigQueryBasedConnectorConfig into a dictionary suitable for use as a JSON request body.
- as_shallow_dict() dict¶
Serializes the IngestionPipelineDefinitionTableSpecificConfigQueryBasedConnectorConfig into a shallow dictionary of its immediate attributes.
- classmethod from_dict(d: Dict[str, Any]) IngestionPipelineDefinitionTableSpecificConfigQueryBasedConnectorConfig¶
Deserializes the IngestionPipelineDefinitionTableSpecificConfigQueryBasedConnectorConfig from a dictionary.
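The lower-bound behavior of hard_deletion_sync_min_interval_in_seconds can be expressed in one line of arithmetic. The sketch below is an interpretation of the description above; the function name effective_hard_delete_sync_interval and the idea of passing the ingestion interval as a number of seconds are assumptions.

```python
from typing import Optional


def effective_hard_delete_sync_interval(
    min_interval_seconds: Optional[int],
    ingestion_interval_seconds: int,
) -> Optional[int]:
    """Return the interval at which hard deletions would be synchronized.

    None means hard deletion synchronization via snapshots is disabled.
    """
    if min_interval_seconds is None:
        return None
    # The configured value is a lower bound: syncing never happens more often
    # than ingestion itself runs.
    return max(min_interval_seconds, ingestion_interval_seconds)


print(effective_hard_delete_sync_interval(3600, 600))
print(effective_hard_delete_sync_interval(3600, 7200))
print(effective_hard_delete_sync_interval(None, 600))
```

With ingestion every 10 minutes and a one-hour minimum, snapshots are taken hourly; with ingestion every two hours, the snapshot cadence follows the ingestion cadence instead.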
- class databricks.sdk.service.pipelines.IngestionPipelineDefinitionWorkdayReportParameters(incremental: 'Optional[bool]' = None, parameters: 'Optional[Dict[str, str]]' = None, report_parameters: 'Optional[List[IngestionPipelineDefinitionWorkdayReportParametersQueryKeyValue]]' = None)¶
- incremental: bool | None = None¶
(Optional) Marks the report as incremental. This field is deprecated and should not be used. Use parameters instead. The incremental behavior is now controlled by the parameters field.
- parameters: Dict[str, str] | None = None¶
Parameters for the Workday report. Each key represents the parameter name (e.g., “start_date”, “end_date”), and the corresponding value is a SQL-like expression used to compute the parameter value at runtime. Example: { “start_date”: “{ coalesce(current_offset(), date(“2025-02-01”)) }”, “end_date”: “{ current_date() - INTERVAL 1 DAY }” }
- report_parameters: List[IngestionPipelineDefinitionWorkdayReportParametersQueryKeyValue] | None = None¶
(Optional) Additional custom parameters for the Workday report. This field is deprecated and should not be used. Use parameters instead.
- as_dict() dict¶
Serializes the IngestionPipelineDefinitionWorkdayReportParameters into a dictionary suitable for use as a JSON request body.
- as_shallow_dict() dict¶
Serializes the IngestionPipelineDefinitionWorkdayReportParameters into a shallow dictionary of its immediate attributes.
- classmethod from_dict(d: Dict[str, Any]) IngestionPipelineDefinitionWorkdayReportParameters¶
Deserializes the IngestionPipelineDefinitionWorkdayReportParameters from a dictionary.
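As a rough sketch of what a parameters payload might look like once serialized, the snippet below builds a plain dictionary and dumps it to JSON. The plain dict is a stand-in for the dataclass (an assumption made so the example runs without the SDK installed); the keys and expressions are the ones given in the parameters description above.

```python
import json

# Plain dict standing in for IngestionPipelineDefinitionWorkdayReportParameters.
# The expressions are copied from the example in the field description.
workday_report_parameters = {
    "parameters": {
        "start_date": '{ coalesce(current_offset(), date("2025-02-01")) }',
        "end_date": "{ current_date() - INTERVAL 1 DAY }",
    }
}

# Serialize to a JSON string, as a request body would be.
request_body = json.dumps(workday_report_parameters, indent=2)
print(request_body)
```

Note that the deprecated incremental and report_parameters fields are left out, in line with the guidance to use parameters instead.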
- class databricks.sdk.service.pipelines.IngestionPipelineDefinitionWorkdayReportParametersQueryKeyValue(key: 'Optional[str]' = None, value: 'Optional[str]' = None)¶
- key: str | None = None¶
Key for the report parameter, can be a column name or other metadata
- value: str | None = None¶
Value for the report parameter. Possible values are the following SQL functions: (1) coalesce(current_offset(), date(“YYYY-MM-DD”)): returns the passed date if current_offset() is null, otherwise current_offset(); (2) current_date(); (3) date_sub(current_date(), x): subtracts x (a non-negative integer) days from the current date.
- as_dict() dict¶
Serializes the IngestionPipelineDefinitionWorkdayReportParametersQueryKeyValue into a dictionary suitable for use as a JSON request body.
- as_shallow_dict() dict¶
Serializes the IngestionPipelineDefinitionWorkdayReportParametersQueryKeyValue into a shallow dictionary of its immediate attributes.
- classmethod from_dict(d: Dict[str, Any]) IngestionPipelineDefinitionWorkdayReportParametersQueryKeyValue¶
Deserializes the IngestionPipelineDefinitionWorkdayReportParametersQueryKeyValue from a dictionary.
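To make the semantics of the value expressions concrete, here is a small Python emulation. It is only an illustration of the described behavior, not how the service evaluates them; modeling current_offset() as an optional date is an assumption, and the helper names are hypothetical.

```python
from datetime import date, timedelta
from typing import Optional


def coalesce_offset(current_offset: Optional[date], fallback: date) -> date:
    """Mirror coalesce(current_offset(), date("YYYY-MM-DD"))."""
    return fallback if current_offset is None else current_offset


def date_sub(today: date, days: int) -> date:
    """Mirror date_sub(current_date(), x) for a non-negative integer x."""
    if days < 0:
        raise ValueError("days must be non-negative")
    return today - timedelta(days=days)
```

On a first run, where no offset exists yet, coalesce_offset falls back to the configured date; on later runs it returns the stored offset.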
- class databricks.sdk.service.pipelines.IngestionSourceType¶
- BIGQUERY = "BIGQUERY"¶
- CONFLUENCE = "CONFLUENCE"¶
- DYNAMICS365 = "DYNAMICS365"¶
- FOREIGN_CATALOG = "FOREIGN_CATALOG"¶
- GA4_RAW_DATA = "GA4_RAW_DATA"¶
- GOOGLE_DRIVE = "GOOGLE_DRIVE"¶
- JIRA = "JIRA"¶
- MANAGED_POSTGRESQL = "MANAGED_POSTGRESQL"¶
- META_MARKETING = "META_MARKETING"¶
- MYSQL = "MYSQL"¶
- NETSUITE = "NETSUITE"¶
- ORACLE = "ORACLE"¶
- POSTGRESQL = "POSTGRESQL"¶
- SALESFORCE = "SALESFORCE"¶
- SERVICENOW = "SERVICENOW"¶
- SHAREPOINT = "SHAREPOINT"¶
- SQLSERVER = "SQLSERVER"¶
- TERADATA = "TERADATA"¶
- WORKDAY_RAAS = "WORKDAY_RAAS"¶
- ZENDESK = "ZENDESK"¶
- class databricks.sdk.service.pipelines.JiraConnectorOptions(include_jira_spaces: List[str] | None = None)¶
Jira specific options for ingestion
- include_jira_spaces: List[str] | None = None¶
(Optional) Projects to filter Jira data on
- as_dict() dict¶
Serializes the JiraConnectorOptions into a dictionary suitable for use as a JSON request body.
- as_shallow_dict() dict¶
Serializes the JiraConnectorOptions into a shallow dictionary of its immediate attributes.
- classmethod from_dict(d: Dict[str, Any]) JiraConnectorOptions¶
Deserializes the JiraConnectorOptions from a dictionary.
- class databricks.sdk.service.pipelines.JsonTransformerOptions(as_variant: 'Optional[bool]' = None, schema: 'Optional[str]' = None, schema_evolution_mode: 'Optional[FileIngestionOptionsSchemaEvolutionMode]' = None, schema_file_path: 'Optional[str]' = None, schema_hints: 'Optional[str]' = None)¶
- as_variant: bool | None = None¶
Parse the entire value as a single Variant column.
- schema: str | None = None¶
Inline schema string for JSON parsing (Spark DDL format).
- schema_evolution_mode: FileIngestionOptionsSchemaEvolutionMode | None = None¶
(Optional) Schema evolution mode for schema inference.
- schema_file_path: str | None = None¶
Path to a schema file (.ddl).
- schema_hints: str | None = None¶
(Optional) Schema hints as a comma-separated string of “column_name type” pairs.
- as_dict() dict¶
Serializes the JsonTransformerOptions into a dictionary suitable for use as a JSON request body.
- as_shallow_dict() dict¶
Serializes the JsonTransformerOptions into a shallow dictionary of its immediate attributes.
- classmethod from_dict(d: Dict[str, Any]) JsonTransformerOptions¶
Deserializes the JsonTransformerOptions from a dictionary.
- class databricks.sdk.service.pipelines.KafkaOptions(client_config: 'Optional[Dict[str, str]]' = None, key_transformer: 'Optional[Transformer]' = None, max_offsets_per_trigger: 'Optional[int]' = None, starting_offset: 'Optional[str]' = None, topic_pattern: 'Optional[str]' = None, topics: 'Optional[List[str]]' = None, value_transformer: 'Optional[Transformer]' = None)¶
- client_config: Dict[str, str] | None = None¶
Undocumented backdoor mechanism for overriding parameters to pass to the Kafka client. This is not supported and may break at any time.
- key_transformer: Transformer | None = None¶
(Optional) Transformer for the message key. If not specified, the key is left as raw bytes.
- max_offsets_per_trigger: int | None = None¶
Internal option to control the maximum number of offsets to process per trigger.
- starting_offset: str | None = None¶
(Optional) Where to begin reading when no checkpoint exists. Valid values: “latest” and “earliest”. Defaults to “latest”.
- topic_pattern: str | None = None¶
Java regex pattern to subscribe to matching topics. Exactly one of topics or topic_pattern must be specified.
- topics: List[str] | None = None¶
Topics to subscribe to. Exactly one of topics or topic_pattern must be specified.
- value_transformer: Transformer | None = None¶
(Optional) Transformer for the message value. If not specified, the value is left as raw bytes.
- as_dict() dict¶
Serializes the KafkaOptions into a dictionary suitable for use as a JSON request body.
- as_shallow_dict() dict¶
Serializes the KafkaOptions into a shallow dictionary of its immediate attributes.
- classmethod from_dict(d: Dict[str, Any]) KafkaOptions¶
Deserializes the KafkaOptions from a dictionary.
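The constraints stated for KafkaOptions (one of topics or topic_pattern, and starting_offset limited to “latest” or “earliest”) can be checked client-side before sending a request. The sketch below operates on a plain dict shaped like the dataclass; validate_kafka_options is a hypothetical helper, and the service may apply additional checks not shown here.

```python
from typing import Any, Dict, List

VALID_STARTING_OFFSETS = {"latest", "earliest"}


def validate_kafka_options(options: Dict[str, Any]) -> List[str]:
    """Return a list of problems found in a KafkaOptions-shaped dict."""
    problems: List[str] = []
    has_topics = bool(options.get("topics"))
    has_pattern = bool(options.get("topic_pattern"))
    # Both set or both missing violates the topics/topic_pattern rule.
    if has_topics == has_pattern:
        problems.append("specify exactly one of topics or topic_pattern")
    # The documented default is "latest" when starting_offset is not given.
    starting_offset = options.get("starting_offset") or "latest"
    if starting_offset not in VALID_STARTING_OFFSETS:
        problems.append(f"invalid starting_offset: {starting_offset!r}")
    return problems
```

Returning a list of problems rather than raising on the first one lets a caller report every issue in a configuration at once.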
- class databricks.sdk.service.pipelines.ListPipelineEventsResponse(events: 'Optional[List[PipelineEvent]]' = None, next_page_token: 'Optional[str]' = None, prev_page_token: 'Optional[str]' = None)¶
- events: List[PipelineEvent] | None = None¶
The list of events matching the request criteria.
- next_page_token: str | None = None¶
If present, a token to fetch the next page of events.
- prev_page_token: str | None = None¶
If present, a token to fetch the previous page of events.
- as_dict() dict¶
Serializes the ListPipelineEventsResponse into a dictionary suitable for use as a JSON request body.
- as_shallow_dict() dict¶
Serializes the ListPipelineEventsResponse into a shallow dictionary of its immediate attributes.
- classmethod from_dict(d: Dict[str, Any]) ListPipelineEventsResponse¶
Deserializes the ListPipelineEventsResponse from a dictionary.
- class databricks.sdk.service.pipelines.ListPipelinesResponse(next_page_token: 'Optional[str]' = None, statuses: 'Optional[List[PipelineStateInfo]]' = None)¶
- next_page_token: str | None = None¶
If present, a token to fetch the next page of pipelines.
- statuses: List[PipelineStateInfo] | None = None¶
The list of pipeline states matching the request criteria.
- as_dict() dict¶
Serializes the ListPipelinesResponse into a dictionary suitable for use as a JSON request body.
- as_shallow_dict() dict¶
Serializes the ListPipelinesResponse into a shallow dictionary of its immediate attributes.
- classmethod from_dict(d: Dict[str, Any]) ListPipelinesResponse¶
Deserializes the ListPipelinesResponse from a dictionary.
- class databricks.sdk.service.pipelines.ListUpdatesResponse(next_page_token: 'Optional[str]' = None, prev_page_token: 'Optional[str]' = None, updates: 'Optional[List[UpdateInfo]]' = None)¶
- next_page_token: str | None = None¶
If present, then there are more results, and this is a token to be used in a subsequent request to fetch the next page.
- prev_page_token: str | None = None¶
If present, then this token can be used in a subsequent request to fetch the previous page.
- updates: List[UpdateInfo] | None = None¶
- as_dict() dict¶
Serializes the ListUpdatesResponse into a dictionary suitable for use as a JSON request body.
- as_shallow_dict() dict¶
Serializes the ListUpdatesResponse into a shallow dictionary of its immediate attributes.
- classmethod from_dict(d: Dict[str, Any]) ListUpdatesResponse¶
Deserializes the ListUpdatesResponse from a dictionary.
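The next_page_token fields on the list responses above imply a simple loop: request a page, consume its items, and repeat while a token is returned. The following is a minimal sketch of that loop, assuming responses shaped like ListUpdatesResponse.as_dict(); fetch_page and the in-memory pages are hypothetical stand-ins for a real API call.

```python
from typing import Any, Callable, Dict, Iterator, Optional

Page = Dict[str, Any]


def iter_updates(fetch_page: Callable[[Optional[str]], Page]) -> Iterator[Any]:
    """Yield every item in `updates`, following next_page_token until absent."""
    token: Optional[str] = None
    while True:
        page = fetch_page(token)
        yield from page.get("updates") or []
        token = page.get("next_page_token")
        if not token:
            break


# Hypothetical in-memory stand-in for the list-updates endpoint.
_PAGES: Dict[Optional[str], Page] = {
    None: {"updates": ["u1", "u2"], "next_page_token": "t1"},
    "t1": {"updates": ["u3"]},
}


def fake_fetch(token: Optional[str]) -> Page:
    return _PAGES[token]


all_updates = list(iter_updates(fake_fetch))
```

The same pattern applies to ListPipelineEventsResponse and ListPipelinesResponse, with events or statuses in place of updates.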
- class databricks.sdk.service.pipelines.ManualTrigger¶
- as_dict() dict¶
Serializes the ManualTrigger into a dictionary suitable for use as a JSON request body.
- as_shallow_dict() dict¶
Serializes the ManualTrigger into a shallow dictionary of its immediate attributes.
- classmethod from_dict(d: Dict[str, Any]) ManualTrigger¶
Deserializes the ManualTrigger from a dictionary.
- class databricks.sdk.service.pipelines.MaturityLevel¶
Maturity level for EventDetails.
- DEPRECATED = "DEPRECATED"¶
- EVOLVING = "EVOLVING"¶
- STABLE = "STABLE"¶
- class databricks.sdk.service.pipelines.MetaMarketingOptions(action_attribution_windows: List[str] | None = None, action_breakdowns: List[str] | None = None, action_report_time: str | None = None, breakdowns: List[str] | None = None, custom_insights_lookback_window: int | None = None, level: str | None = None, start_date: str | None = None, time_increment: str | None = None)¶
Meta Marketing (Meta Ads) specific options for ingestion
- action_attribution_windows: List[str] | None = None¶
(Optional, DEPRECATED — use custom_report_options.action_attribution_windows) Action attribution windows for insights reporting (e.g. “28d_click”, “1d_view”)
- action_breakdowns: List[str] | None = None¶
(Optional, DEPRECATED — use custom_report_options.action_breakdowns) Action breakdowns
- action_report_time: str | None = None¶
(Optional, DEPRECATED — use custom_report_options.action_report_time) Timing used to report action statistics (impression, conversion, mixed, or lifetime)
- breakdowns: List[str] | None = None¶
(Optional, DEPRECATED — use custom_report_options.breakdowns) Breakdowns to configure
- custom_insights_lookback_window: int | None = None¶
(Optional) Window in days to revisit data during sync to capture updated conversion data from the API, shared by prebuilt and custom reports.
- level: str | None = None¶
(Optional, DEPRECATED — use custom_report_options.level) Granularity of data to pull (account, ad, adset, campaign)
- start_date: str | None = None¶
(Optional) Start date in yyyy-MM-dd format (e.g. 2025-01-15). Data added after this date will be ingested. Shared by prebuilt and custom reports.
- time_increment: str | None = None¶
(Optional, DEPRECATED — use custom_report_options.time_increment) Value in string by which to aggregate statistics (can take all_days, monthly or number of days)
- as_dict() dict¶
Serializes the MetaMarketingOptions into a dictionary suitable for use as a JSON request body.
- as_shallow_dict() dict¶
Serializes the MetaMarketingOptions into a shallow dictionary of its immediate attributes.
- classmethod from_dict(d: Dict[str, Any]) MetaMarketingOptions¶
Deserializes the MetaMarketingOptions from a dictionary.
- class databricks.sdk.service.pipelines.NotebookLibrary(path: 'Optional[str]' = None)¶
- path: str | None = None¶
The absolute path of the source code.
- as_dict() dict¶
Serializes the NotebookLibrary into a dictionary suitable for use as a JSON request body.
- as_shallow_dict() dict¶
Serializes the NotebookLibrary into a shallow dictionary of its immediate attributes.
- classmethod from_dict(d: Dict[str, Any]) NotebookLibrary¶
Deserializes the NotebookLibrary from a dictionary.
- class databricks.sdk.service.pipelines.Notifications(alerts: 'Optional[List[str]]' = None, email_recipients: 'Optional[List[str]]' = None)¶
- alerts: List[str] | None = None¶
A list of alerts that trigger the sending of notifications to the configured destinations. The supported alerts are:
* on-update-success: A pipeline update completes successfully.
* on-update-failure: Each time a pipeline update fails.
* on-update-fatal-failure: A pipeline update fails with a non-retryable (fatal) error.
* on-flow-failure: A single data flow fails.
- email_recipients: List[str] | None = None¶
A list of email addresses notified when a configured alert is triggered.
- as_dict() dict¶
Serializes the Notifications into a dictionary suitable for use as a JSON request body.
- as_shallow_dict() dict¶
Serializes the Notifications into a shallow dictionary of its immediate attributes.
- classmethod from_dict(d: Dict[str, Any]) Notifications¶
Deserializes the Notifications from a dictionary.
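Because alerts is a free-form list of strings, a typo in an alert name is easy to miss. A small guard like the one below can catch that before a request is built. It is a sketch only: build_notifications is a hypothetical helper, the returned dict mirrors the field names of Notifications, and the email address is a placeholder.

```python
from typing import Dict, List

# The four alert names listed in the alerts field description.
SUPPORTED_ALERTS = {
    "on-update-success",
    "on-update-failure",
    "on-update-fatal-failure",
    "on-flow-failure",
}


def build_notifications(alerts: List[str], email_recipients: List[str]) -> Dict[str, List[str]]:
    """Build a Notifications-shaped dict, rejecting unknown alert names."""
    unknown = sorted(set(alerts) - SUPPORTED_ALERTS)
    if unknown:
        raise ValueError(f"unsupported alerts: {unknown}")
    return {"alerts": list(alerts), "email_recipients": list(email_recipients)}


notifications = build_notifications(
    ["on-update-failure", "on-flow-failure"],
    ["data-team@example.com"],
)
```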
- class databricks.sdk.service.pipelines.OperationTimeWindow(start_hour: int, days_of_week: List[DayOfWeek] | None = None, time_zone_id: str | None = None)¶
Proto representing a window
- start_hour: int¶
An integer between 0 and 23 denoting the start hour for the window in the 24-hour day.
- days_of_week: List[DayOfWeek] | None = None¶
Days of week in which the window is allowed to happen. If not specified, all days of the week will be used.
- time_zone_id: str | None = None¶
Time zone id of window. See https://docs.databricks.com/sql/language-manual/sql-ref-syntax-aux-conf-mgmt-set-timezone.html for details. If not specified, UTC will be used.
- as_dict() dict¶
Serializes the OperationTimeWindow into a dictionary suitable for use as a JSON request body.
- as_shallow_dict() dict¶
Serializes the OperationTimeWindow into a shallow dictionary of its immediate attributes.
- classmethod from_dict(d: Dict[str, Any]) OperationTimeWindow¶
Deserializes the OperationTimeWindow from a dictionary.
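The defaults described for OperationTimeWindow (all days when days_of_week is omitted, UTC when time_zone_id is omitted) and the 0 to 23 range for start_hour can be made explicit with a small normalization step. This is a sketch under assumptions: normalize_window is a hypothetical helper, and the day names are assumed spellings, since the DayOfWeek enum is not shown in this section.

```python
from typing import Any, Dict, List, Optional

# Assumed day names; check the DayOfWeek enum for the actual members.
ALL_DAYS = ["MONDAY", "TUESDAY", "WEDNESDAY", "THURSDAY", "FRIDAY", "SATURDAY", "SUNDAY"]


def normalize_window(
    start_hour: int,
    days_of_week: Optional[List[str]] = None,
    time_zone_id: Optional[str] = None,
) -> Dict[str, Any]:
    """Return an OperationTimeWindow-shaped dict with documented defaults applied."""
    if not 0 <= start_hour <= 23:
        raise ValueError("start_hour must be between 0 and 23")
    return {
        "start_hour": start_hour,
        "days_of_week": list(days_of_week) if days_of_week else list(ALL_DAYS),
        "time_zone_id": time_zone_id or "UTC",
    }
```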
- class databricks.sdk.service.pipelines.Origin(batch_id: 'Optional[int]' = None, cloud: 'Optional[str]' = None, cluster_id: 'Optional[str]' = None, dataset_name: 'Optional[str]' = None, flow_id: 'Optional[str]' = None, flow_name: 'Optional[str]' = None, host: 'Optional[str]' = None, ingestion_source_catalog_name: 'Optional[str]' = None, ingestion_source_connection_name: 'Optional[str]' = None, ingestion_source_schema_name: 'Optional[str]' = None, ingestion_source_table_name: 'Optional[str]' = None, ingestion_source_table_version: 'Optional[str]' = None, maintenance_id: 'Optional[str]' = None, materialization_name: 'Optional[str]' = None, org_id: 'Optional[int]' = None, pipeline_id: 'Optional[str]' = None, pipeline_name: 'Optional[str]' = None, region: 'Optional[str]' = None, request_id: 'Optional[str]' = None, table_id: 'Optional[str]' = None, uc_resource_id: 'Optional[str]' = None, update_id: 'Optional[str]' = None)¶
- batch_id: int | None = None¶
The id of a batch. Unique within a flow.
- cloud: str | None = None¶
The cloud provider, e.g., AWS or Azure.
- cluster_id: str | None = None¶
The id of the cluster where an execution happens. Unique within a region.
- dataset_name: str | None = None¶
The name of a dataset. Unique within a pipeline.
- flow_id: str | None = None¶
The id of the flow. Globally unique. Incremental queries will generally reuse the same id while complete queries will have a new id per update.
- flow_name: str | None = None¶
The name of the flow. Not unique.
- host: str | None = None¶
The optional host name where the event was triggered.
- ingestion_source_catalog_name: str | None = None¶
The name of the source catalog (if known) whose data ingestion is described by this event.
- ingestion_source_connection_name: str | None = None¶
The name of the source UC connection (if known) whose data ingestion is described by this event.
- ingestion_source_schema_name: str | None = None¶
The name of the source schema (if known) whose data ingestion is described by this event.
- ingestion_source_table_name: str | None = None¶
The name of the source table (if known) whose data ingestion is described by this event.
- ingestion_source_table_version: str | None = None¶
An optional implementation-defined source table version of a dataset being (re)ingested.
- maintenance_id: str | None = None¶
The id of a maintenance run. Globally unique.
- materialization_name: str | None = None¶
Materialization name.
- org_id: int | None = None¶
The org id of the user. Unique within a cloud.
- pipeline_id: str | None = None¶
The id of the pipeline. Globally unique.
- pipeline_name: str | None = None¶
The name of the pipeline. Not unique.
- region: str | None = None¶
The cloud region.
- request_id: str | None = None¶
The id of the request that caused an update.
- table_id: str | None = None¶
The id of a (delta) table. Globally unique.
- uc_resource_id: str | None = None¶
The Unity Catalog id of the MV or ST being updated.
- update_id: str | None = None¶
The id of an execution. Globally unique.
- as_dict() dict¶
Serializes the Origin into a dictionary suitable for use as a JSON request body.
- as_shallow_dict() dict¶
Serializes the Origin into a shallow dictionary of its immediate attributes.
- class databricks.sdk.service.pipelines.OutlookAttachmentMode¶
Attachment behavior mode for Outlook ingestion
- ALL = "ALL"¶
- INLINE_ONLY = "INLINE_ONLY"¶
- NONE = "NONE"¶
- NON_INLINE_ONLY = "NON_INLINE_ONLY"¶
- class databricks.sdk.service.pipelines.OutlookBodyFormat¶
Body format for Outlook email content
- TEXT_HTML = "TEXT_HTML"¶
- TEXT_PLAIN = "TEXT_PLAIN"¶
- class databricks.sdk.service.pipelines.OutlookOptions(attachment_mode: OutlookAttachmentMode | None = None, body_format: OutlookBodyFormat | None = None, folder_filter: List[str] | None = None, include_folders: List[str] | None = None, include_mailboxes: List[str] | None = None, include_senders: List[str] | None = None, include_subjects: List[str] | None = None, sender_filter: List[str] | None = None, start_date: str | None = None, subject_filter: List[str] | None = None)¶
Outlook specific options for ingestion
- attachment_mode: OutlookAttachmentMode | None = None¶
(Optional) Controls which attachments to ingest. If not specified, defaults to ALL.
- body_format: OutlookBodyFormat | None = None¶
(Optional) Defines how the body_content column is populated. TEXT_HTML: Preserves full formatting, links, and styling. TEXT_PLAIN: Converts body to plain text. Recommended for AI/RAG pipelines to reduce token usage and noise.
- folder_filter: List[str] | None = None¶
Deprecated. Use include_folders instead.
- include_folders: List[str] | None = None¶
(Optional) Filter mail folders to include in the sync. If not specified, all folders will be synced. Examples: Inbox, Sent Items, Custom_Folder. Filter semantics: OR between different folders.
- include_mailboxes: List[str] | None = None¶
(Optional) List of mailboxes to sync (e.g. mailbox email addresses or identifiers). If not specified, all accessible mailboxes are ingested. Filter semantics: OR between different mailboxes.
- include_senders: List[str] | None = None¶
(Optional) Filter emails by sender address. Uses exact email match. Examples: user@vendor.com, alerts@system.io, noreply@company.com. If not specified, emails from all senders will be synced. Filter semantics: OR between different senders.
- include_subjects: List[str] | None = None¶
(Optional) Filter emails by subject line. Values ending with “*” use prefix match (subject starts with the part before “*”); otherwise substring match (subject contains the value). Examples: “Invoice” (substring), “Re:*” (prefix), “Support Ticket”, “URGENT”. If not specified, emails with all subjects will be synced. Filter semantics: OR between different subjects.
- sender_filter: List[str] | None = None¶
Deprecated. Use include_senders instead.
- start_date: str | None = None¶
(Optional) Start date for the initial sync in YYYY-MM-DD format (e.g., 2024-01-01). This determines the earliest date from which to sync historical data. If not specified, complete history is ingested.
- subject_filter: List[str] | None = None¶
Deprecated. Use include_subjects instead.
- as_dict() dict¶
Serializes the OutlookOptions into a dictionary suitable for use as a JSON request body.
- as_shallow_dict() dict¶
Serializes the OutlookOptions into a shallow dictionary of its immediate attributes.
- classmethod from_dict(d: Dict[str, Any]) OutlookOptions¶
Deserializes the OutlookOptions from a dictionary.
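The matching rules described for include_subjects (prefix match for values ending in “*”, substring match otherwise, OR across values) can be expressed in a few lines. This is an illustrative sketch, not the connector's implementation; subject_matches is a hypothetical helper, and case-sensitive comparison is an assumption because the description does not say how case is handled.

```python
from typing import Iterable


def subject_matches(subject: str, include_subjects: Iterable[str]) -> bool:
    """Return True if the subject passes an include_subjects-style filter."""
    filters = list(include_subjects)
    if not filters:
        # No filter configured: emails with all subjects are synced.
        return True
    for value in filters:
        if value.endswith("*"):
            # Prefix match on the part before the trailing "*".
            if subject.startswith(value[:-1]):
                return True
        elif value in subject:
            # Substring match.
            return True
    return False
```

For example, the filter “Re:*” accepts “Re: Invoice 42” but rejects “Fwd: Re: hello”, whereas the filter “Re:” without the asterisk accepts both.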
- class databricks.sdk.service.pipelines.PathPattern(include: 'Optional[str]' = None)¶
- include: str | None = None¶
The source code to include for pipelines
- as_dict() dict¶
Serializes the PathPattern into a dictionary suitable for use as a JSON request body.
- as_shallow_dict() dict¶
Serializes the PathPattern into a shallow dictionary of its immediate attributes.
- classmethod from_dict(d: Dict[str, Any]) PathPattern¶
Deserializes the PathPattern from a dictionary.
- class databricks.sdk.service.pipelines.PipelineAccessControlRequest(group_name: 'Optional[str]' = None, permission_level: 'Optional[PipelinePermissionLevel]' = None, service_principal_name: 'Optional[str]' = None, user_name: 'Optional[str]' = None)¶
- group_name: str | None = None¶
name of the group
- permission_level: PipelinePermissionLevel | None = None¶
- service_principal_name: str | None = None¶
application ID of a service principal
- user_name: str | None = None¶
name of the user
- as_dict() dict¶
Serializes the PipelineAccessControlRequest into a dictionary suitable for use as a JSON request body.
- as_shallow_dict() dict¶
Serializes the PipelineAccessControlRequest into a shallow dictionary of its immediate attributes.
- classmethod from_dict(d: Dict[str, Any]) PipelineAccessControlRequest¶
Deserializes the PipelineAccessControlRequest from a dictionary.
- class databricks.sdk.service.pipelines.PipelineAccessControlResponse(all_permissions: 'Optional[List[PipelinePermission]]' = None, display_name: 'Optional[str]' = None, group_name: 'Optional[str]' = None, service_principal_name: 'Optional[str]' = None, user_name: 'Optional[str]' = None)¶
- all_permissions: List[PipelinePermission] | None = None¶
All permissions.
- display_name: str | None = None¶
Display name of the user or service principal.
- group_name: str | None = None¶
name of the group
- service_principal_name: str | None = None¶
Name of the service principal.
- user_name: str | None = None¶
name of the user
- as_dict() dict¶
Serializes the PipelineAccessControlResponse into a dictionary suitable for use as a JSON request body.
- as_shallow_dict() dict¶
Serializes the PipelineAccessControlResponse into a shallow dictionary of its immediate attributes.
- classmethod from_dict(d: Dict[str, Any]) PipelineAccessControlResponse¶
Deserializes the PipelineAccessControlResponse from a dictionary.
- class databricks.sdk.service.pipelines.PipelineCluster(apply_policy_default_values: 'Optional[bool]' = None, autoscale: 'Optional[PipelineClusterAutoscale]' = None, aws_attributes: 'Optional[compute.AwsAttributes]' = None, azure_attributes: 'Optional[compute.AzureAttributes]' = None, cluster_log_conf: 'Optional[compute.ClusterLogConf]' = None, custom_tags: 'Optional[Dict[str, str]]' = None, driver_instance_pool_id: 'Optional[str]' = None, driver_node_type_id: 'Optional[str]' = None, enable_local_disk_encryption: 'Optional[bool]' = None, gcp_attributes: 'Optional[compute.GcpAttributes]' = None, init_scripts: 'Optional[List[compute.InitScriptInfo]]' = None, instance_pool_id: 'Optional[str]' = None, label: 'Optional[str]' = None, node_type_id: 'Optional[str]' = None, num_workers: 'Optional[int]' = None, policy_id: 'Optional[str]' = None, spark_conf: 'Optional[Dict[str, str]]' = None, spark_env_vars: 'Optional[Dict[str, str]]' = None, ssh_public_keys: 'Optional[List[str]]' = None)¶
- apply_policy_default_values: bool | None = None¶
Note: This field won’t be persisted. Only API users will check this field.
- autoscale: PipelineClusterAutoscale | None = None¶
Parameters needed in order to automatically scale clusters up and down based on load. Note: autoscaling works best with DB runtime versions 3.0 or later.
- aws_attributes: AwsAttributes | None = None¶
Attributes related to clusters running on Amazon Web Services. If not specified at cluster creation, a set of default values will be used.
- azure_attributes: AzureAttributes | None = None¶
Attributes related to clusters running on Microsoft Azure. If not specified at cluster creation, a set of default values will be used.
- cluster_log_conf: ClusterLogConf | None = None¶
The configuration for delivering spark logs to a long-term storage destination. Only dbfs destinations are supported. Only one destination can be specified for one cluster. If the conf is given, the logs will be delivered to the destination every 5 mins. The destination of driver logs is $destination/$clusterId/driver, while the destination of executor logs is $destination/$clusterId/executor.
- custom_tags: Dict[str, str] | None = None¶
Additional tags for cluster resources. Databricks will tag all cluster resources (e.g., AWS instances and EBS volumes) with these tags in addition to default_tags. Notes:
* Currently, Databricks allows at most 45 custom tags.
* Clusters can only reuse cloud resources if the resources’ tags are a subset of the cluster tags.
- driver_instance_pool_id: str | None = None¶
The optional ID of the instance pool to which the driver of the cluster belongs. If the driver pool is not assigned, the cluster uses the instance pool with ID instance_pool_id for the driver.
- driver_node_type_id: str | None = None¶
The node type of the Spark driver. This field is optional; if unset, the driver node type will be set to the same value as node_type_id.
- enable_local_disk_encryption: bool | None = None¶
Whether to enable local disk encryption for the cluster.
- gcp_attributes: GcpAttributes | None = None¶
Attributes related to clusters running on Google Cloud Platform. If not specified at cluster creation, a set of default values will be used.
- init_scripts: List[InitScriptInfo] | None = None¶
The configuration for storing init scripts. Any number of destinations can be specified. The scripts are executed sequentially in the order provided. If cluster_log_conf is specified, init script logs are sent to <destination>/<cluster-ID>/init_scripts.
- instance_pool_id: str | None = None¶
The optional ID of the instance pool to which the cluster belongs.
- label: str | None = None¶
A label for the cluster specification, either default to configure the default cluster, or maintenance to configure the maintenance cluster. This field is optional. The default value is default.
- node_type_id: str | None = None¶
This field encodes, through a single value, the resources available to each of the Spark nodes in this cluster. For example, the Spark nodes can be provisioned and optimized for memory or compute intensive workloads. A list of available node types can be retrieved by using the clusters/listNodeTypes API call.
- num_workers: int | None = None¶
Number of worker nodes that this cluster should have. A cluster has one Spark Driver and num_workers Executors for a total of num_workers + 1 Spark nodes.
Note: When reading the properties of a cluster, this field reflects the desired number of workers rather than the actual current number of workers. For instance, if a cluster is resized from 5 to 10 workers, this field will immediately be updated to reflect the target size of 10 workers, whereas the workers listed in spark_info will gradually increase from 5 to 10 as the new nodes are provisioned.
- policy_id: str | None = None¶
The ID of the cluster policy used to create the cluster if applicable.
- spark_conf: Dict[str, str] | None = None¶
An object containing a set of optional, user-specified Spark configuration key-value pairs. See the clusters/create API call for more details.
- spark_env_vars: Dict[str, str] | None = None¶
An object containing a set of optional, user-specified environment variable key-value pairs. Note that a key-value pair of the form (X,Y) will be exported as is (i.e., export X=’Y’) while launching the driver and workers.
In order to specify an additional set of SPARK_DAEMON_JAVA_OPTS, we recommend appending them to $SPARK_DAEMON_JAVA_OPTS as shown in the example below. This ensures that all default Databricks-managed environment variables are included as well.
Example Spark environment variables: {“SPARK_WORKER_MEMORY”: “28000m”, “SPARK_LOCAL_DIRS”: “/local_disk0”} or {“SPARK_DAEMON_JAVA_OPTS”: “$SPARK_DAEMON_JAVA_OPTS -Dspark.shuffle.service.enabled=true”}
- ssh_public_keys: List[str] | None = None¶
SSH public key contents that will be added to each Spark node in this cluster. The corresponding private keys can be used to log in with the user name ubuntu on port 2200. Up to 10 keys can be specified.
- as_dict() dict¶
Serializes the PipelineCluster into a dictionary suitable for use as a JSON request body.
- as_shallow_dict() dict¶
Serializes the PipelineCluster into a shallow dictionary of its immediate attributes.
- classmethod from_dict(d: Dict[str, Any]) PipelineCluster¶
Deserializes the PipelineCluster from a dictionary.
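Two of the rules stated above for PipelineCluster are simple enough to write down directly: the node count implied by num_workers, and the log locations derived from cluster_log_conf. The helpers below are a sketch with hypothetical names; the destination and cluster ID used in any call are placeholders supplied by the caller.

```python
from typing import Dict


def total_spark_nodes(num_workers: int) -> int:
    """One Spark driver plus num_workers executors."""
    return num_workers + 1


def log_destinations(destination: str, cluster_id: str) -> Dict[str, str]:
    """Derive the documented log paths under <destination>/<cluster-ID>."""
    base = f"{destination.rstrip('/')}/{cluster_id}"
    return {
        "driver": f"{base}/driver",
        "executor": f"{base}/executor",
        "init_scripts": f"{base}/init_scripts",
    }
```

Keep in mind that num_workers reflects the desired size, so during a resize the actual number of running nodes may differ from what total_spark_nodes returns.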
- class databricks.sdk.service.pipelines.PipelineClusterAutoscale(min_workers: 'int', max_workers: 'int', mode: 'Optional[PipelineClusterAutoscaleMode]' = None)¶
- min_workers: int¶
The minimum number of workers the cluster can scale down to when underutilized. It is also the initial number of workers the cluster will have after creation.
- max_workers: int¶
The maximum number of workers to which the cluster can scale up when overloaded. max_workers must be strictly greater than min_workers.
- mode: PipelineClusterAutoscaleMode | None = None¶
Databricks Enhanced Autoscaling optimizes cluster utilization by automatically allocating cluster resources based on workload volume, with minimal impact to the data processing latency of your pipelines. Enhanced Autoscaling is available for updates clusters only. The legacy autoscaling feature is used for maintenance clusters.
- as_dict() dict¶
Serializes the PipelineClusterAutoscale into a dictionary suitable for use as a JSON request body.
- as_shallow_dict() dict¶
Serializes the PipelineClusterAutoscale into a shallow dictionary of its immediate attributes.
- classmethod from_dict(d: Dict[str, Any]) PipelineClusterAutoscale¶
Deserializes the PipelineClusterAutoscale from a dictionary.
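The requirement that max_workers be strictly greater than min_workers is easy to check before submitting a cluster specification. The function below is a minimal sketch; validate_autoscale is a hypothetical helper, and it checks only the one constraint stated in the field description.

```python
def validate_autoscale(min_workers: int, max_workers: int) -> None:
    """Raise ValueError unless max_workers is strictly greater than min_workers."""
    if max_workers <= min_workers:
        raise ValueError(
            f"max_workers ({max_workers}) must be strictly greater than "
            f"min_workers ({min_workers})"
        )
```

Equal values are rejected as well, since the description asks for a strict inequality.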
- class databricks.sdk.service.pipelines.PipelineClusterAutoscaleMode¶
Databricks Enhanced Autoscaling optimizes cluster utilization by automatically allocating cluster resources based on workload volume, with minimal impact to the data processing latency of your pipelines. Enhanced Autoscaling is available for updates clusters only. The legacy autoscaling feature is used for maintenance clusters.
- ENHANCED = "ENHANCED"¶
- LEGACY = "LEGACY"¶
- class databricks.sdk.service.pipelines.PipelineDeployment(kind: 'DeploymentKind', deployment_id: 'Optional[str]' = None, metadata_file_path: 'Optional[str]' = None, version_id: 'Optional[str]' = None)¶
- kind: DeploymentKind¶
The deployment method that manages the pipeline.
- deployment_id: str | None = None¶
ID of the deployment that manages this pipeline. Only set when kind is BUNDLE. Used to look up deployment metadata from the Deployment Metadata service.
- metadata_file_path: str | None = None¶
The path to the file containing metadata about the deployment.
- version_id: str | None = None¶
ID of the version of the deployment that produced this pipeline. Only set when kind is BUNDLE. Identifies a specific snapshot of the deployment in the Deployment Metadata service.
- as_dict() dict¶
Serializes the PipelineDeployment into a dictionary suitable for use as a JSON request body.
- as_shallow_dict() dict¶
Serializes the PipelineDeployment into a shallow dictionary of its immediate attributes.
- classmethod from_dict(d: Dict[str, Any]) PipelineDeployment¶
Deserializes the PipelineDeployment from a dictionary.
- class databricks.sdk.service.pipelines.PipelineEvent(error: 'Optional[ErrorDetail]' = None, event_type: 'Optional[str]' = None, id: 'Optional[str]' = None, level: 'Optional[EventLevel]' = None, maturity_level: 'Optional[MaturityLevel]' = None, message: 'Optional[str]' = None, origin: 'Optional[Origin]' = None, sequence: 'Optional[Sequencing]' = None, timestamp: 'Optional[str]' = None, truncation: 'Optional[Truncation]' = None)¶
- error: ErrorDetail | None = None¶
Information about an error captured by the event.
- event_type: str | None = None¶
The event type. Should always correspond to the details.
- id: str | None = None¶
A time-based, globally unique id.
- level: EventLevel | None = None¶
The severity level of the event.
- maturity_level: MaturityLevel | None = None¶
Maturity level for event_type.
- message: str | None = None¶
The display message associated with the event.
- sequence: Sequencing | None = None¶
A sequencing object to identify and order events.
- timestamp: str | None = None¶
The time of the event.
- truncation: Truncation | None = None¶
Information about which fields were truncated from this event due to size constraints. If empty or absent, no truncation occurred. See https://docs.databricks.com/en/ldp/monitor-event-logs for information on retrieving complete event data.
- as_dict() dict¶
Serializes the PipelineEvent into a dictionary suitable for use as a JSON request body.
- as_shallow_dict() dict¶
Serializes the PipelineEvent into a shallow dictionary of its immediate attributes.
- classmethod from_dict(d: Dict[str, Any]) PipelineEvent¶
Deserializes the PipelineEvent from a dictionary.
- class databricks.sdk.service.pipelines.PipelineLibrary(file: 'Optional[FileLibrary]' = None, glob: 'Optional[PathPattern]' = None, jar: 'Optional[str]' = None, maven: 'Optional[compute.MavenLibrary]' = None, notebook: 'Optional[NotebookLibrary]' = None, whl: 'Optional[str]' = None)¶
- file: FileLibrary | None = None¶
The path to a file that defines a pipeline and is stored in the Databricks Repos.
- glob: PathPattern | None = None¶
The unified field to include source code. Each entry can be a notebook path, a file path, or a folder path that ends with /**. This field cannot be used together with notebook or file.
- jar: str | None = None¶
URI of the jar to be installed. Currently only DBFS is supported.
- maven: MavenLibrary | None = None¶
Specification of a maven library to be installed.
- notebook: NotebookLibrary | None = None¶
The path to a notebook that defines a pipeline and is stored in the Databricks workspace.
- whl: str | None = None¶
URI of the whl to be installed.
- as_dict() dict¶
Serializes the PipelineLibrary into a dictionary suitable for use as a JSON request body.
- as_shallow_dict() dict¶
Serializes the PipelineLibrary into a shallow dictionary of its immediate attributes.
- classmethod from_dict(d: Dict[str, Any]) PipelineLibrary¶
Deserializes the PipelineLibrary from a dictionary.
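The restriction on glob can be illustrated with a small check. This is a sketch under stated assumptions, not SDK behavior: check_library is a hypothetical helper, the library is a plain dictionary keyed by the attribute names above, and the restriction is assumed to apply within a single library entry.

```python
from typing import Any, Dict


def check_library(library: Dict[str, Any]) -> None:
    """Raise ValueError if glob is combined with notebook or file (hypothetical helper)."""
    if library.get("glob") is not None:
        # Collect the fields that the description says cannot accompany glob.
        conflicts = [key for key in ("notebook", "file") if library.get(key) is not None]
        if conflicts:
            raise ValueError("glob cannot be used together with: " + ", ".join(conflicts))


# The nested values are placeholders; only the presence of the keys is inspected.
check_library({"glob": {"placeholder": True}})
try:
    check_library({"glob": {"placeholder": True}, "notebook": {"placeholder": True}})
except ValueError as err:
    print(err)
```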
- class databricks.sdk.service.pipelines.PipelinePermission(inherited: 'Optional[bool]' = None, inherited_from_object: 'Optional[List[str]]' = None, permission_level: 'Optional[PipelinePermissionLevel]' = None)¶
- inherited: bool | None = None¶
- inherited_from_object: List[str] | None = None¶
- permission_level: PipelinePermissionLevel | None = None¶
- as_dict() dict¶
Serializes the PipelinePermission into a dictionary suitable for use as a JSON request body.
- as_shallow_dict() dict¶
Serializes the PipelinePermission into a shallow dictionary of its immediate attributes.
- classmethod from_dict(d: Dict[str, Any]) PipelinePermission¶
Deserializes the PipelinePermission from a dictionary.
- class databricks.sdk.service.pipelines.PipelinePermissionLevel¶
Permission level
- CAN_MANAGE = "CAN_MANAGE"¶
- CAN_RUN = "CAN_RUN"¶
- CAN_VIEW = "CAN_VIEW"¶
- IS_OWNER = "IS_OWNER"¶
- class databricks.sdk.service.pipelines.PipelinePermissions(access_control_list: 'Optional[List[PipelineAccessControlResponse]]' = None, object_id: 'Optional[str]' = None, object_type: 'Optional[str]' = None)¶
- access_control_list: List[PipelineAccessControlResponse] | None = None¶
- object_id: str | None = None¶
- object_type: str | None = None¶
- as_dict() dict¶
Serializes the PipelinePermissions into a dictionary suitable for use as a JSON request body.
- as_shallow_dict() dict¶
Serializes the PipelinePermissions into a shallow dictionary of its immediate attributes.
- classmethod from_dict(d: Dict[str, Any]) PipelinePermissions¶
Deserializes the PipelinePermissions from a dictionary.
- class databricks.sdk.service.pipelines.PipelinePermissionsDescription(description: 'Optional[str]' = None, permission_level: 'Optional[PipelinePermissionLevel]' = None)¶
- description: str | None = None¶
- permission_level: PipelinePermissionLevel | None = None¶
- as_dict() dict¶
Serializes the PipelinePermissionsDescription into a dictionary suitable for use as a JSON request body.
- as_shallow_dict() dict¶
Serializes the PipelinePermissionsDescription into a shallow dictionary of its immediate attributes.
- classmethod from_dict(d: Dict[str, Any]) PipelinePermissionsDescription¶
Deserializes the PipelinePermissionsDescription from a dictionary.
- class databricks.sdk.service.pipelines.PipelineSpec(budget_policy_id: 'Optional[str]' = None, catalog: 'Optional[str]' = None, channel: 'Optional[str]' = None, clusters: 'Optional[List[PipelineCluster]]' = None, configuration: 'Optional[Dict[str, str]]' = None, continuous: 'Optional[bool]' = None, deployment: 'Optional[PipelineDeployment]' = None, development: 'Optional[bool]' = None, edition: 'Optional[str]' = None, environment: 'Optional[PipelinesEnvironment]' = None, event_log: 'Optional[EventLogSpec]' = None, filters: 'Optional[Filters]' = None, gateway_definition: 'Optional[IngestionGatewayPipelineDefinition]' = None, id: 'Optional[str]' = None, ingestion_definition: 'Optional[IngestionPipelineDefinition]' = None, libraries: 'Optional[List[PipelineLibrary]]' = None, name: 'Optional[str]' = None, notifications: 'Optional[List[Notifications]]' = None, photon: 'Optional[bool]' = None, restart_window: 'Optional[RestartWindow]' = None, root_path: 'Optional[str]' = None, schema: 'Optional[str]' = None, serverless: 'Optional[bool]' = None, storage: 'Optional[str]' = None, tags: 'Optional[Dict[str, str]]' = None, target: 'Optional[str]' = None, trigger: 'Optional[PipelineTrigger]' = None, usage_policy_id: 'Optional[str]' = None)¶
- budget_policy_id: str | None = None¶
Budget policy of this pipeline.
- catalog: str | None = None¶
A catalog in Unity Catalog to publish data from this pipeline to. If target is specified, tables in this pipeline are published to a target schema inside catalog (for example, catalog.`target`.`table`). If target is not specified, no data is published to Unity Catalog.
- channel: str | None = None¶
SDP Release Channel that specifies which version to use.
- clusters: List[PipelineCluster] | None = None¶
Cluster settings for this pipeline deployment.
- configuration: Dict[str, str] | None = None¶
String-String configuration for this pipeline execution.
- continuous: bool | None = None¶
Whether the pipeline is continuous or triggered. This replaces trigger.
- deployment: PipelineDeployment | None = None¶
Deployment type of this pipeline.
- development: bool | None = None¶
Whether the pipeline is in Development mode. Defaults to false.
- edition: str | None = None¶
Pipeline product edition.
- environment: PipelinesEnvironment | None = None¶
Environment specification for this pipeline used to install dependencies.
- event_log: EventLogSpec | None = None¶
Event log configuration for this pipeline.
- filters: Filters | None = None¶
Filters on which Pipeline packages to include in the deployed graph.
- gateway_definition: IngestionGatewayPipelineDefinition | None = None¶
The definition of a gateway pipeline to support change data capture.
- id: str | None = None¶
Unique identifier for this pipeline.
- ingestion_definition: IngestionPipelineDefinition | None = None¶
The configuration for a managed ingestion pipeline. These settings cannot be used with the ‘libraries’, ‘schema’, ‘target’, or ‘catalog’ settings.
- libraries: List[PipelineLibrary] | None = None¶
Libraries or code needed by this deployment.
- name: str | None = None¶
Friendly identifier for this pipeline.
- notifications: List[Notifications] | None = None¶
List of notification settings for this pipeline.
- photon: bool | None = None¶
Whether Photon is enabled for this pipeline.
- restart_window: RestartWindow | None = None¶
Restart window of this pipeline.
- root_path: str | None = None¶
Root path for this pipeline. This is used as the root directory when editing the pipeline in the Databricks user interface and it is added to sys.path when executing Python sources during pipeline execution.
- schema: str | None = None¶
The default schema (database) where tables are read from or published to.
- serverless: bool | None = None¶
Whether serverless compute is enabled for this pipeline.
- storage: str | None = None¶
DBFS root directory for storing checkpoints and tables.
- tags: Dict[str, str] | None = None¶
A map of tags associated with the pipeline. These are forwarded to the cluster as cluster tags, and are therefore subject to the same limitations. A maximum of 25 tags can be added to the pipeline.
- target: str | None = None¶
Target schema (database) to add tables in this pipeline to. Exactly one of schema or target must be specified. To publish to Unity Catalog, also specify catalog. This legacy field is deprecated for pipeline creation in favor of the schema field.
- trigger: PipelineTrigger | None = None¶
Which pipeline trigger to use. Deprecated: Use continuous instead.
- usage_policy_id: str | None = None¶
Usage policy of this pipeline.
- as_dict() dict¶
Serializes the PipelineSpec into a dictionary suitable for use as a JSON request body.
- as_shallow_dict() dict¶
Serializes the PipelineSpec into a shallow dictionary of its immediate attributes.
- classmethod from_dict(d: Dict[str, Any]) PipelineSpec¶
Deserializes the PipelineSpec from a dictionary.
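Several PipelineSpec fields carry constraints that are stated only in their descriptions. The sketch below collects three of them in one place. It is illustrative only: check_pipeline_spec is a hypothetical helper working on a plain dictionary keyed by attribute name, and the service remains the authority on what is accepted.

```python
from typing import Any, Dict, List

MAX_TAGS = 25  # "A maximum of 25 tags can be added to the pipeline."
# Settings that cannot be combined with ingestion_definition.
INGESTION_EXCLUSIVE = ("libraries", "schema", "target", "catalog")


def check_pipeline_spec(spec: Dict[str, Any]) -> List[str]:
    """Return the documented constraints that a spec dictionary violates (hypothetical helper)."""
    problems = []
    if len(spec.get("tags") or {}) > MAX_TAGS:
        problems.append(f"at most {MAX_TAGS} tags are allowed")
    if spec.get("ingestion_definition") is not None:
        used = [key for key in INGESTION_EXCLUSIVE if spec.get(key) is not None]
        if used:
            problems.append("ingestion_definition cannot be combined with: " + ", ".join(used))
    # schema and target are alternatives; setting both is treated as an error here.
    if spec.get("schema") is not None and spec.get("target") is not None:
        problems.append("specify schema or target, not both")
    return problems


print(check_pipeline_spec({"name": "example", "catalog": "main", "schema": "sales"}))
print(check_pipeline_spec({"schema": "sales", "target": "sales_legacy"}))
```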
- class databricks.sdk.service.pipelines.PipelineState¶
The pipeline state.
- DELETED = "DELETED"¶
- DEPLOYING = "DEPLOYING"¶
- FAILED = "FAILED"¶
- IDLE = "IDLE"¶
- RECOVERING = "RECOVERING"¶
- RESETTING = "RESETTING"¶
- RUNNING = "RUNNING"¶
- STARTING = "STARTING"¶
- STOPPING = "STOPPING"¶
- class databricks.sdk.service.pipelines.PipelineStateInfo(cluster_id: 'Optional[str]' = None, creator_user_name: 'Optional[str]' = None, health: 'Optional[PipelineStateInfoHealth]' = None, latest_updates: 'Optional[List[UpdateStateInfo]]' = None, name: 'Optional[str]' = None, pipeline_id: 'Optional[str]' = None, run_as_user_name: 'Optional[str]' = None, state: 'Optional[PipelineState]' = None)¶
- cluster_id: str | None = None¶
The unique identifier of the cluster running the pipeline.
- creator_user_name: str | None = None¶
The username of the pipeline creator.
- health: PipelineStateInfoHealth | None = None¶
The health of a pipeline.
- latest_updates: List[UpdateStateInfo] | None = None¶
Status of the latest updates for the pipeline. Ordered with the newest update first.
- name: str | None = None¶
The user-friendly name of the pipeline.
- pipeline_id: str | None = None¶
The unique identifier of the pipeline.
- run_as_user_name: str | None = None¶
The username that the pipeline runs as. This is a read only value derived from the pipeline owner.
- state: PipelineState | None = None¶
- as_dict() dict¶
Serializes the PipelineStateInfo into a dictionary suitable for use as a JSON request body.
- as_shallow_dict() dict¶
Serializes the PipelineStateInfo into a shallow dictionary of its immediate attributes.
- classmethod from_dict(d: Dict[str, Any]) PipelineStateInfo¶
Deserializes the PipelineStateInfo from a dictionary.
- class databricks.sdk.service.pipelines.PipelineStateInfoHealth¶
The health of a pipeline.
- HEALTHY = "HEALTHY"¶
- UNHEALTHY = "UNHEALTHY"¶
- class databricks.sdk.service.pipelines.PipelineTrigger(cron: 'Optional[CronTrigger]' = None, manual: 'Optional[ManualTrigger]' = None)¶
- cron: CronTrigger | None = None¶
- manual: ManualTrigger | None = None¶
- as_dict() dict¶
Serializes the PipelineTrigger into a dictionary suitable for use as a JSON request body.
- as_shallow_dict() dict¶
Serializes the PipelineTrigger into a shallow dictionary of its immediate attributes.
- classmethod from_dict(d: Dict[str, Any]) PipelineTrigger¶
Deserializes the PipelineTrigger from a dictionary.
- class databricks.sdk.service.pipelines.PipelinesEnvironment(dependencies: List[str] | None = None, environment_version: str | None = None)¶
The environment entity used to preserve the serverless environment side panel, the jobs environment for non-notebook tasks, and the SDP environment for classic and serverless pipelines. In this minimal environment spec, only pip dependencies are supported.
- dependencies: List[str] | None = None¶
List of pip dependencies, as supported by the version of pip in this environment. Each dependency is a pip requirements file line (see https://pip.pypa.io/en/stable/reference/requirements-file-format/). Allowed dependencies are <requirement specifier>, <archive url/path>, <local project path> (WSFS or Volumes in Databricks), and <vcs project url>.
- environment_version: str | None = None¶
The environment version of the serverless Python environment used to execute customer Python code. Each environment version includes a specific Python version and a curated set of pre-installed libraries with defined versions, providing a stable and reproducible execution environment.
Databricks supports a three-year lifecycle for each environment version. For available versions and their included packages, see https://docs.databricks.com/aws/en/release-notes/serverless/environment-version/
The value should be a string representing the environment version number, for example: “4”.
- as_dict() dict¶
Serializes the PipelinesEnvironment into a dictionary suitable for use as a JSON request body.
- as_shallow_dict() dict¶
Serializes the PipelinesEnvironment into a shallow dictionary of its immediate attributes.
- classmethod from_dict(d: Dict[str, Any]) PipelinesEnvironment¶
Deserializes the PipelinesEnvironment from a dictionary.
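As an illustration of the two fields, the following sketch builds an environment as a plain dictionary. It assumes the JSON keys match the attribute names above; the package names, path, and repository URL are hypothetical placeholders.

```python
import json

environment = {
    # The version is a string, not an integer.
    "environment_version": "4",
    "dependencies": [
        "requests>=2.31",  # requirement specifier
        "/Volumes/main/default/libs/my_pkg-0.1-py3-none-any.whl",  # archive path (hypothetical)
        "git+https://github.com/example/project.git",  # VCS project URL (hypothetical)
    ],
}

print(json.dumps(environment, indent=2))
```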
- class databricks.sdk.service.pipelines.PostgresCatalogConfig(slot_config: PostgresSlotConfig | None = None)¶
PG-specific catalog-level configuration parameters
- slot_config: PostgresSlotConfig | None = None¶
Optional. The Postgres slot configuration to use for logical replication
- as_dict() dict¶
Serializes the PostgresCatalogConfig into a dictionary suitable for use as a JSON request body.
- as_shallow_dict() dict¶
Serializes the PostgresCatalogConfig into a shallow dictionary of its immediate attributes.
- classmethod from_dict(d: Dict[str, Any]) PostgresCatalogConfig¶
Deserializes the PostgresCatalogConfig from a dictionary.
- class databricks.sdk.service.pipelines.PostgresSlotConfig(publication_name: str | None = None, slot_name: str | None = None)¶
PostgresSlotConfig contains the configuration for a Postgres logical replication slot
- publication_name: str | None = None¶
The name of the publication to use for the Postgres source
- slot_name: str | None = None¶
The name of the logical replication slot to use for the Postgres source
- as_dict() dict¶
Serializes the PostgresSlotConfig into a dictionary suitable for use as a JSON request body.
- as_shallow_dict() dict¶
Serializes the PostgresSlotConfig into a shallow dictionary of its immediate attributes.
- classmethod from_dict(d: Dict[str, Any]) PostgresSlotConfig¶
Deserializes the PostgresSlotConfig from a dictionary.
- class databricks.sdk.service.pipelines.PublishingMode¶
Enum representing the publishing mode of a pipeline.
- DEFAULT_PUBLISHING_MODE = "DEFAULT_PUBLISHING_MODE"¶
- LEGACY_PUBLISHING_MODE = "LEGACY_PUBLISHING_MODE"¶
- class databricks.sdk.service.pipelines.ReplaceWhereOverride(flow_name: str | None = None, predicate_override: str | None = None)¶
Specifies a replace_where predicate override for a replace where flow.
- flow_name: str | None = None¶
Name of the flow to apply this override to.
- predicate_override: str | None = None¶
SQL predicate string to use as replace_where condition. Example: date = '2024-10-10' AND city = 'xyz'
- as_dict() dict¶
Serializes the ReplaceWhereOverride into a dictionary suitable for use as a JSON request body.
- as_shallow_dict() dict¶
Serializes the ReplaceWhereOverride into a shallow dictionary of its immediate attributes.
- classmethod from_dict(d: Dict[str, Any]) ReplaceWhereOverride¶
Deserializes the ReplaceWhereOverride from a dictionary.
- class databricks.sdk.service.pipelines.ReportSpec(source_url: 'str', destination_catalog: 'str', destination_schema: 'str', destination_table: 'Optional[str]' = None, table_configuration: 'Optional[TableSpecificConfig]' = None)¶
- source_url: str¶
Required. Report URL in the source system.
- destination_catalog: str¶
Required. Destination catalog to store table.
- destination_schema: str¶
Required. Destination schema to store table.
- destination_table: str | None = None¶
Required. Destination table name. The pipeline fails if a table with that name already exists.
- table_configuration: TableSpecificConfig | None = None¶
Configuration settings to control the ingestion of tables. These settings override the table_configuration defined in the IngestionPipelineDefinition object.
- as_dict() dict¶
Serializes the ReportSpec into a dictionary suitable for use as a JSON request body.
- as_shallow_dict() dict¶
Serializes the ReportSpec into a shallow dictionary of its immediate attributes.
- classmethod from_dict(d: Dict[str, Any]) ReportSpec¶
Deserializes the ReportSpec from a dictionary.
- class databricks.sdk.service.pipelines.RestartWindow(start_hour: 'int', days_of_week: 'Optional[List[DayOfWeek]]' = None, time_zone_id: 'Optional[str]' = None)¶
- start_hour: int¶
An integer between 0 and 23 denoting the start hour for the restart window in the 24-hour day. Continuous pipeline restart is triggered only within a five-hour window starting at this hour.
- days_of_week: List[DayOfWeek] | None = None¶
Days of the week on which the restart is allowed to happen (within a five-hour window starting at start_hour). If not specified, all days of the week are used.
- time_zone_id: str | None = None¶
Time zone id of restart window. See https://docs.databricks.com/sql/language-manual/sql-ref-syntax-aux-conf-mgmt-set-timezone.html for details. If not specified, UTC will be used.
- as_dict() dict¶
Serializes the RestartWindow into a dictionary suitable for use as a JSON request body.
- as_shallow_dict() dict¶
Serializes the RestartWindow into a shallow dictionary of its immediate attributes.
- classmethod from_dict(d: Dict[str, Any]) RestartWindow¶
Deserializes the RestartWindow from a dictionary.
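The five-hour window can be reasoned about with a short calculation. The sketch below is an interpretation, not the service's implementation: in_restart_window is a hypothetical helper, the day names are assumed to be the DayOfWeek values, and a window that runs past midnight is assumed to belong to the day on which it starts.

```python
from datetime import datetime, timedelta, timezone
from typing import List, Optional
from zoneinfo import ZoneInfo

WINDOW_HOURS = 5  # restarts are triggered only within a five-hour window
# Assumed DayOfWeek names, indexed by datetime.weekday() (Monday is 0).
DAY_NAMES = ["MONDAY", "TUESDAY", "WEDNESDAY", "THURSDAY", "FRIDAY", "SATURDAY", "SUNDAY"]


def in_restart_window(now: datetime, start_hour: int,
                      days_of_week: Optional[List[str]] = None,
                      time_zone_id: Optional[str] = None) -> bool:
    """Return True if the timezone-aware datetime `now` falls inside the window."""
    if not 0 <= start_hour <= 23:
        raise ValueError("start_hour must be between 0 and 23")
    # UTC is used when no time zone is given.
    tz = ZoneInfo(time_zone_id) if time_zone_id else timezone.utc
    local = now.astimezone(tz)
    # The window may have started today, or yesterday and run past midnight.
    for days_back in (0, 1):
        day = (local - timedelta(days=days_back)).date()
        start = datetime(day.year, day.month, day.day, start_hour, tzinfo=tz)
        if start <= local < start + timedelta(hours=WINDOW_HOURS):
            return days_of_week is None or DAY_NAMES[day.weekday()] in days_of_week
    return False


# 2024-01-02 02:00 UTC lies in the window that started Monday 2024-01-01 at 22:00.
moment = datetime(2024, 1, 2, 2, 0, tzinfo=timezone.utc)
print(in_restart_window(moment, start_hour=22, days_of_week=["MONDAY"]))
```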
- class databricks.sdk.service.pipelines.RewindDatasetSpec(cascade: bool | None = None, identifier: str | None = None, reset_checkpoints: bool | None = None)¶
Configuration for rewinding a specific dataset.
- cascade: bool | None = None¶
Whether to cascade the rewind to dependent datasets. Must be specified.
- identifier: str | None = None¶
The identifier of the dataset (e.g., “main.foo.tbl1”).
- reset_checkpoints: bool | None = None¶
Whether to reset checkpoints for this dataset.
- as_dict() dict¶
Serializes the RewindDatasetSpec into a dictionary suitable for use as a JSON request body.
- as_shallow_dict() dict¶
Serializes the RewindDatasetSpec into a shallow dictionary of its immediate attributes.
- classmethod from_dict(d: Dict[str, Any]) RewindDatasetSpec¶
Deserializes the RewindDatasetSpec from a dictionary.
- class databricks.sdk.service.pipelines.RewindSpec(datasets: List[RewindDatasetSpec] | None = None, dry_run: bool | None = None, rewind_timestamp: str | None = None)¶
Information about a rewind being requested for this pipeline or some of the datasets in it.
- datasets: List[RewindDatasetSpec] | None = None¶
List of datasets to rewind with specific configuration for each. When not specified, all datasets will be rewound with cascade = true and reset_checkpoints = true.
- dry_run: bool | None = None¶
If true, this is a dry run and we should emit the RewindSummary but not perform the rewind.
- rewind_timestamp: str | None = None¶
The base timestamp to rewind to. Exactly one of rewind_timestamp or rewind_point_id must be specified.
- as_dict() dict¶
Serializes the RewindSpec into a dictionary suitable for use as a JSON request body.
- as_shallow_dict() dict¶
Serializes the RewindSpec into a shallow dictionary of its immediate attributes.
- classmethod from_dict(d: Dict[str, Any]) RewindSpec¶
Deserializes the RewindSpec from a dictionary.
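The default described for datasets can be made explicit with a small expansion step. This is only a sketch of the documented default: effective_rewind_datasets is a hypothetical helper, the rewind spec is a plain dictionary keyed by attribute name, and the list of all dataset identifiers is assumed to be known to the caller.

```python
from typing import Any, Dict, List


def effective_rewind_datasets(rewind_spec: Dict[str, Any],
                              all_identifiers: List[str]) -> List[Dict[str, Any]]:
    """Return the per-dataset settings a rewind spec implies (hypothetical helper)."""
    datasets = rewind_spec.get("datasets")
    if datasets is not None:
        # Explicit per-dataset configuration is used as given.
        return datasets
    # When datasets is not specified, every dataset is rewound with
    # cascade = true and reset_checkpoints = true.
    return [
        {"identifier": identifier, "cascade": True, "reset_checkpoints": True}
        for identifier in all_identifiers
    ]


spec = {"dry_run": True, "rewind_timestamp": "2024-10-10T00:00:00Z"}
print(effective_rewind_datasets(spec, ["main.foo.tbl1", "main.foo.tbl2"]))
```

The timestamp format shown is a placeholder; the source does not state which format rewind_timestamp expects.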
- class databricks.sdk.service.pipelines.RunAs(service_principal_name: str | None = None, user_name: str | None = None)¶
Write-only setting, available only in Create/Update calls. Specifies the user or service principal that the pipeline runs as. If not specified, the pipeline runs as the user who created the pipeline.
Only user_name or service_principal_name can be specified. If both are specified, an error is thrown.
- service_principal_name: str | None = None¶
Application ID of an active service principal. Setting this field requires the servicePrincipal/user role.
- user_name: str | None = None¶
The email of an active workspace user. Users can only set this field to their own email.
- as_dict() dict¶
Serializes the RunAs into a dictionary suitable for use as a JSON request body.
- as_shallow_dict() dict¶
Serializes the RunAs into a shallow dictionary of its immediate attributes.
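Because only one of the two fields may be set, a client can reject an invalid combination before calling the API. A minimal sketch, assuming a plain dictionary keyed by attribute name; check_run_as is a hypothetical helper and the identities shown are placeholders.

```python
from typing import Any, Dict


def check_run_as(run_as: Dict[str, Any]) -> None:
    """Raise ValueError if both identities are set (hypothetical helper)."""
    if run_as.get("user_name") is not None and run_as.get("service_principal_name") is not None:
        raise ValueError("specify only one of user_name or service_principal_name")


check_run_as({"user_name": "someone@example.com"})
try:
    check_run_as({"user_name": "someone@example.com",
                  "service_principal_name": "00000000-0000-0000-0000-000000000000"})
except ValueError as err:
    print(err)
```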
- class databricks.sdk.service.pipelines.SchemaSpec(source_schema: 'str', destination_catalog: 'str', destination_schema: 'str', connector_options: 'Optional[ConnectorOptions]' = None, source_catalog: 'Optional[str]' = None, table_configuration: 'Optional[TableSpecificConfig]' = None)¶
- source_schema: str¶
Required. Schema name in the source database.
- destination_catalog: str¶
Required. Destination catalog to store tables.
- destination_schema: str¶
Required. Destination schema to store tables in. Tables with the same name as the source tables are created in this destination schema. The pipeline fails if a table with the same name already exists.
- connector_options: ConnectorOptions | None = None¶
(Optional) Source Specific Connector Options
- source_catalog: str | None = None¶
The source catalog name. Might be optional depending on the type of source.
- table_configuration: TableSpecificConfig | None = None¶
Configuration settings to control the ingestion of tables. These settings are applied to all tables in this schema and override the table_configuration defined in the IngestionPipelineDefinition object.
- as_dict() dict¶
Serializes the SchemaSpec into a dictionary suitable for use as a JSON request body.
- as_shallow_dict() dict¶
Serializes the SchemaSpec into a shallow dictionary of its immediate attributes.
- classmethod from_dict(d: Dict[str, Any]) SchemaSpec¶
Deserializes the SchemaSpec from a dictionary.
- class databricks.sdk.service.pipelines.Sequencing(control_plane_seq_no: 'Optional[int]' = None, data_plane_id: 'Optional[DataPlaneId]' = None)¶
- control_plane_seq_no: int | None = None¶
A sequence number, unique and increasing per pipeline.
- data_plane_id: DataPlaneId | None = None¶
The ID assigned by the data plane.
- as_dict() dict¶
Serializes the Sequencing into a dictionary suitable for use as a JSON request body.
- as_shallow_dict() dict¶
Serializes the Sequencing into a shallow dictionary of its immediate attributes.
- classmethod from_dict(d: Dict[str, Any]) Sequencing¶
Deserializes the Sequencing from a dictionary.
- class databricks.sdk.service.pipelines.SerializedException(class_name: 'Optional[str]' = None, message: 'Optional[str]' = None, stack: 'Optional[List[StackFrame]]' = None)¶
- class_name: str | None = None¶
Runtime class of the exception
- message: str | None = None¶
Exception message
- stack: List[StackFrame] | None = None¶
Stack trace consisting of a list of stack frames
- as_dict() dict¶
Serializes the SerializedException into a dictionary suitable for use as a JSON request body.
- as_shallow_dict() dict¶
Serializes the SerializedException into a shallow dictionary of its immediate attributes.
- classmethod from_dict(d: Dict[str, Any]) SerializedException¶
Deserializes the SerializedException from a dictionary.
- class databricks.sdk.service.pipelines.SharepointOptions¶
(Optional) The type of SharePoint entity to ingest. If not specified, defaults to FILE.
(Optional) File ingestion options for processing files.
Required. The SharePoint URL.
- as_dict() dict¶
Serializes the SharepointOptions into a dictionary suitable for use as a JSON request body.
- as_shallow_dict() dict¶
Serializes the SharepointOptions into a shallow dictionary of its immediate attributes.
- classmethod from_dict(d: Dict[str, Any]) SharepointOptions¶
Deserializes the SharepointOptions from a dictionary.
- class databricks.sdk.service.pipelines.SmartsheetOptions(enforce_schema: bool | None = None)¶
Smartsheet specific options for ingestion
- enforce_schema: bool | None = None¶
(Optional) When true, maps each column to its Smartsheet-declared type (Text/Number/Date/ Checkbox/etc.). Cells that do not conform to the declared type are set to NULL. When false, all columns land as STRING. Use false for sheets with irregular data or columns that frequently violate their own declared type. If not specified, defaults to true.
- as_dict() dict¶
Serializes the SmartsheetOptions into a dictionary suitable for use as a JSON request body.
- as_shallow_dict() dict¶
Serializes the SmartsheetOptions into a shallow dictionary of its immediate attributes.
- classmethod from_dict(d: Dict[str, Any]) SmartsheetOptions¶
Deserializes the SmartsheetOptions from a dictionary.
- class databricks.sdk.service.pipelines.SourceCatalogConfig(postgres: PostgresCatalogConfig | None = None, source_catalog: str | None = None)¶
SourceCatalogConfig contains catalog-level custom configuration parameters for each source
- postgres: PostgresCatalogConfig | None = None¶
Postgres-specific catalog-level configuration parameters
- source_catalog: str | None = None¶
Source catalog name
- as_dict() dict¶
Serializes the SourceCatalogConfig into a dictionary suitable for use as a JSON request body.
- as_shallow_dict() dict¶
Serializes the SourceCatalogConfig into a shallow dictionary of its immediate attributes.
- classmethod from_dict(d: Dict[str, Any]) SourceCatalogConfig¶
Deserializes the SourceCatalogConfig from a dictionary.
- class databricks.sdk.service.pipelines.SourceConfig(catalog: 'Optional[SourceCatalogConfig]' = None, google_ads_config: 'Optional[GoogleAdsConfig]' = None)¶
- catalog: SourceCatalogConfig | None = None¶
Catalog-level source configuration parameters
- google_ads_config: GoogleAdsConfig | None = None¶
- as_dict() dict¶
Serializes the SourceConfig into a dictionary suitable for use as a JSON request body.
- as_shallow_dict() dict¶
Serializes the SourceConfig into a shallow dictionary of its immediate attributes.
- classmethod from_dict(d: Dict[str, Any]) SourceConfig¶
Deserializes the SourceConfig from a dictionary.
- class databricks.sdk.service.pipelines.StackFrame(declaring_class: 'Optional[str]' = None, file_name: 'Optional[str]' = None, line_number: 'Optional[int]' = None, method_name: 'Optional[str]' = None)¶
- declaring_class: str | None = None¶
Class from which the method call originated
- file_name: str | None = None¶
File where the method is defined
- line_number: int | None = None¶
Line from which the method was called
- method_name: str | None = None¶
Name of the method which was called
- as_dict() dict¶
Serializes the StackFrame into a dictionary suitable for use as a JSON request body.
- as_shallow_dict() dict¶
Serializes the StackFrame into a shallow dictionary of its immediate attributes.
- classmethod from_dict(d: Dict[str, Any]) StackFrame¶
Deserializes the StackFrame from a dictionary.
- class databricks.sdk.service.pipelines.StartUpdateCause¶
What triggered this update.
- API_CALL = "API_CALL"¶
- INFRASTRUCTURE_MAINTENANCE = "INFRASTRUCTURE_MAINTENANCE"¶
- JOB_TASK = "JOB_TASK"¶
- RETRY_ON_FAILURE = "RETRY_ON_FAILURE"¶
- SCHEMA_CHANGE = "SCHEMA_CHANGE"¶
- SERVICE_UPGRADE = "SERVICE_UPGRADE"¶
- USER_ACTION = "USER_ACTION"¶
- class databricks.sdk.service.pipelines.StartUpdateResponse(update_id: 'Optional[str]' = None)¶
- update_id: str | None = None¶
- as_dict() dict¶
Serializes the StartUpdateResponse into a dictionary suitable for use as a JSON request body.
- as_shallow_dict() dict¶
Serializes the StartUpdateResponse into a shallow dictionary of its immediate attributes.
- classmethod from_dict(d: Dict[str, Any]) StartUpdateResponse¶
Deserializes the StartUpdateResponse from a dictionary.
- class databricks.sdk.service.pipelines.StopPipelineResponse¶
- as_dict() dict¶
Serializes the StopPipelineResponse into a dictionary suitable for use as a JSON request body.
- as_shallow_dict() dict¶
Serializes the StopPipelineResponse into a shallow dictionary of its immediate attributes.
- classmethod from_dict(d: Dict[str, Any]) StopPipelineResponse¶
Deserializes the StopPipelineResponse from a dictionary.
- class databricks.sdk.service.pipelines.TableSpec(source_table: 'str', destination_catalog: 'str', destination_schema: 'str', connector_options: 'Optional[ConnectorOptions]' = None, destination_table: 'Optional[str]' = None, source_catalog: 'Optional[str]' = None, source_schema: 'Optional[str]' = None, table_configuration: 'Optional[TableSpecificConfig]' = None)¶
- source_table: str¶
Required. Table name in the source database.
- destination_catalog: str¶
Required. Destination catalog to store table.
- destination_schema: str¶
Required. Destination schema to store table.
- connector_options: ConnectorOptions | None = None¶
(Optional) Source Specific Connector Options
- destination_table: str | None = None¶
Optional. Destination table name. The pipeline fails if a table with that name already exists. If not set, the source table name is used.
- source_catalog: str | None = None¶
Source catalog name. Might be optional depending on the type of source.
- source_schema: str | None = None¶
Schema name in the source database. Might be optional depending on the type of source.
- table_configuration: TableSpecificConfig | None = None¶
Configuration settings to control the ingestion of tables. These settings override the table_configuration defined in the IngestionPipelineDefinition object and the SchemaSpec.
- as_dict() dict¶
Serializes the TableSpec into a dictionary suitable for use as a JSON request body.
- as_shallow_dict() dict¶
Serializes the TableSpec into a shallow dictionary of its immediate attributes.
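The table_configuration descriptions on SchemaSpec and TableSpec imply a precedence order: table level over schema level over the IngestionPipelineDefinition level. The sketch below shows one way to model that order. It assumes overrides apply key by key, which the source does not state; effective_table_config is a hypothetical helper and the values are placeholders.

```python
from typing import Any, Dict, Optional


def effective_table_config(pipeline_level: Optional[Dict[str, Any]],
                           schema_level: Optional[Dict[str, Any]] = None,
                           table_level: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
    """Merge table_configuration layers, later layers winning (assumed per-key override)."""
    merged: Dict[str, Any] = {}
    # Apply the least specific layer first so more specific layers overwrite it.
    for layer in (pipeline_level, schema_level, table_level):
        if layer:
            merged.update({key: value for key, value in layer.items() if value is not None})
    return merged


print(effective_table_config(
    {"primary_keys": ["id"], "row_filter": "region = 'EU'"},
    schema_level={"row_filter": "region = 'US'"},
    table_level={"primary_keys": ["order_id"]},
))
```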
- class databricks.sdk.service.pipelines.TableSpecificConfig(auto_full_refresh_policy: 'Optional[AutoFullRefreshPolicy]' = None, clustering_columns: 'Optional[List[str]]' = None, enable_auto_clustering: 'Optional[bool]' = None, exclude_columns: 'Optional[List[str]]' = None, include_columns: 'Optional[List[str]]' = None, primary_keys: 'Optional[List[str]]' = None, query_based_connector_config: 'Optional[IngestionPipelineDefinitionTableSpecificConfigQueryBasedConnectorConfig]' = None, row_filter: 'Optional[str]' = None, salesforce_include_formula_fields: 'Optional[bool]' = None, scd_type: 'Optional[TableSpecificConfigScdType]' = None, sequence_by: 'Optional[List[str]]' = None, table_properties: 'Optional[Dict[str, str]]' = None, workday_report_parameters: 'Optional[IngestionPipelineDefinitionWorkdayReportParameters]' = None)¶
- auto_full_refresh_policy: AutoFullRefreshPolicy | None = None¶
(Optional, Mutable) Policy for auto full refresh. If enabled, the pipeline automatically tries to fix issues by doing a full refresh on the table in the retry run. An auto_full_refresh_policy set in the table configuration overrides any auto_full_refresh_policy set at a higher level. For example: { "auto_full_refresh_policy": { "enabled": true, "min_interval_hours": 23 } }. If unspecified, auto full refresh is disabled.
- clustering_columns: List[str] | None = None¶
List of column names to use for clustering the destination table. When specified, the destination Delta table is clustered by these columns, which can improve query performance when filtering on them. Note: clustering_columns in the table-specific configuration overrides the pipeline definition. Note: only one of enable_auto_clustering or clustering_columns can be provided; they are separate fields because a oneof cannot contain a repeated field.
- enable_auto_clustering: bool | None = None¶
Whether to enable auto clustering on the destination table. When enabled, Delta automatically optimizes the data layout based on the clustering columns for improved query performance. Note: enable_auto_clustering in the table-specific configuration overrides the pipeline definition. Note: only one of enable_auto_clustering or clustering_columns can be provided; they are separate fields because a oneof cannot contain a repeated field.
- exclude_columns: List[str] | None = None¶
A list of column names to exclude from ingestion. When not specified, include_columns fully controls which columns are ingested. When specified, all other columns, including future ones, are automatically included for ingestion. This field is mutually exclusive with include_columns.
- include_columns: List[str] | None = None¶
A list of column names to include in ingestion. When not specified, all columns except those in exclude_columns are included, and future columns are automatically included. When specified, all other columns, including future ones, are automatically excluded from ingestion. This field is mutually exclusive with exclude_columns.
- primary_keys: List[str] | None = None¶
The primary key of the table used to apply changes.
- query_based_connector_config: IngestionPipelineDefinitionTableSpecificConfigQueryBasedConnectorConfig | None = None¶
- row_filter: str | None = None¶
(Optional, Immutable) The row filter condition to be applied to the table. It must not contain the WHERE keyword, only the actual filter condition. It must be in DBSQL format.
- salesforce_include_formula_fields: bool | None = None¶
If true, formula fields defined in the table are included in the ingestion. This setting is only valid for the Salesforce connector.
- scd_type: TableSpecificConfigScdType | None = None¶
- sequence_by: List[str] | None = None¶
The column names specifying the logical order of events in the source data. Spark Declarative Pipelines uses this sequencing to handle change events that arrive out of order.
- table_properties: Dict[str, str] | None = None¶
Table properties to set on the destination table. These are key-value pairs that configure various Delta table behaviors or any user-defined properties. Example: {"delta.feature.variantType": "supported", "delta.enableTypeWidening": "true"}. Note: table_properties in the table-specific configuration overrides the table_properties of the pipeline definition.
- workday_report_parameters: IngestionPipelineDefinitionWorkdayReportParameters | None = None¶
(Optional) Additional custom parameters for the Workday report.
- as_dict() dict¶
Serializes the TableSpecificConfig into a dictionary suitable for use as a JSON request body.
- as_shallow_dict() dict¶
Serializes the TableSpecificConfig into a shallow dictionary of its immediate attributes.
- classmethod from_dict(d: Dict[str, Any]) TableSpecificConfig¶
Deserializes the TableSpecificConfig from a dictionary.
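Several of the TableSpecificConfig constraints described above can be checked before a request is sent. The following is a minimal sketch under stated assumptions: check_table_config is a hypothetical helper that works on a plain dictionary using the same field names, it is not part of the SDK, and the service may enforce these rules differently. It reads "only one of enable_auto_clustering or clustering_columns" conservatively (both keys set is reported), and it only looks for a leading WHERE in row_filter to avoid false positives on string literals.

```python
import re
from typing import Any, Dict, List


def check_table_config(cfg: Dict[str, Any]) -> List[str]:
    """Return a list of human-readable problems found in a config-shaped dict."""
    problems: List[str] = []

    # include_columns and exclude_columns are documented as mutually exclusive.
    if cfg.get("include_columns") and cfg.get("exclude_columns"):
        problems.append("include_columns and exclude_columns are mutually exclusive")

    # Conservative reading: flag any config that sets both clustering fields.
    if cfg.get("enable_auto_clustering") is not None and cfg.get("clustering_columns"):
        problems.append("set only one of enable_auto_clustering or clustering_columns")

    # row_filter must be the bare condition, without the WHERE keyword.
    row_filter = cfg.get("row_filter")
    if row_filter and re.match(r"\s*where\b", row_filter, flags=re.IGNORECASE):
        problems.append("row_filter must not start with WHERE")

    return problems


# Illustrative configs; column names are made up.
bad = {
    "include_columns": ["id", "amount"],
    "exclude_columns": ["debug_blob"],
    "row_filter": "WHERE amount > 0",
}
good = {
    "primary_keys": ["id"],
    "include_columns": ["id", "amount"],
    "row_filter": "amount > 0",
}

bad_problems = check_table_config(bad)
good_problems = check_table_config(good)
```

Returning a list of problems rather than raising on the first one lets a caller report every issue in a configuration at once.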
- class databricks.sdk.service.pipelines.TableSpecificConfigScdType¶
The SCD type to use to ingest the table.
- APPEND_ONLY = "APPEND_ONLY"¶
- SCD_TYPE_1 = "SCD_TYPE_1"¶
- SCD_TYPE_2 = "SCD_TYPE_2"¶
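As a conceptual illustration of how primary_keys and sequence_by interact under SCD type 1 (latest value wins), the sketch below keeps, for each key, the event with the highest sequence value, so an event that arrives late does not overwrite newer data. This is an assumption-laden toy model in plain Python, not how the pipeline engine is implemented; apply_latest_wins and the sample rows are hypothetical.

```python
from typing import Any, Dict, Iterable, List, Tuple

Row = Dict[str, Any]


def apply_latest_wins(
    events: Iterable[Row], primary_keys: List[str], sequence_by: List[str]
) -> Dict[Tuple[Any, ...], Row]:
    """Keep one row per primary key: the one with the greatest sequence value."""
    latest: Dict[Tuple[Any, ...], Row] = {}
    for event in events:
        key = tuple(event[c] for c in primary_keys)
        order = tuple(event[c] for c in sequence_by)
        current = latest.get(key)
        # Replace the stored row only if this event is logically newer.
        if current is None or order > tuple(current[c] for c in sequence_by):
            latest[key] = event
    return latest


# Made-up change events; the second one arrives out of order.
events = [
    {"id": 1, "status": "shipped", "updated_at": 3},
    {"id": 1, "status": "created", "updated_at": 1},
    {"id": 2, "status": "created", "updated_at": 2},
]

table = apply_latest_wins(events, primary_keys=["id"], sequence_by=["updated_at"])
```

Under SCD type 2, by contrast, earlier versions of a row would be retained as history rather than overwritten; that bookkeeping is omitted here.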
- class databricks.sdk.service.pipelines.TikTokAdsOptions(data_level: TikTokAdsOptionsTikTokDataLevel | None = None, dimensions: List[str] | None = None, lookback_window_days: int | None = None, metrics: List[str] | None = None, query_lifetime: bool | None = None, report_type: TikTokAdsOptionsTikTokReportType | None = None, sync_start_date: str | None = None)¶
TikTok Ads specific options for ingestion
- data_level: TikTokAdsOptionsTikTokDataLevel | None = None¶
Deprecated. Use custom_report_options.data_level instead.
- dimensions: List[str] | None = None¶
Deprecated. Use custom_report_options.dimensions instead.
- lookback_window_days: int | None = None¶
(Optional) Number of days to look back for report tables during incremental sync to capture late-arriving conversions and attribution data.
- metrics: List[str] | None = None¶
Deprecated. Use custom_report_options.metrics instead.
- query_lifetime: bool | None = None¶
Deprecated. Use custom_report_options.query_lifetime instead.
- report_type: TikTokAdsOptionsTikTokReportType | None = None¶
Deprecated. Use custom_report_options.report_type instead.
- sync_start_date: str | None = None¶
(Optional) Start date for the initial sync of report tables in YYYY-MM-DD format. This determines the earliest date from which to sync historical data.
- as_dict() dict¶
Serializes the TikTokAdsOptions into a dictionary suitable for use as a JSON request body.
- as_shallow_dict() dict¶
Serializes the TikTokAdsOptions into a shallow dictionary of its immediate attributes.
- classmethod from_dict(d: Dict[str, Any]) TikTokAdsOptions¶
Deserializes the TikTokAdsOptions from a dictionary.
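One way to picture how lookback_window_days and sync_start_date relate is to compute the first date an incremental sync would re-read. The sketch below is an assumption, not documented connector behavior: it supposes that the window starts lookback_window_days before the last synced date and never reaches earlier than sync_start_date. incremental_window_start is a hypothetical helper.

```python
from datetime import date, timedelta


def incremental_window_start(
    last_synced: date, lookback_window_days: int, sync_start_date: str
) -> date:
    """Return the first date to re-read, assuming a clamped lookback window."""
    earliest = date.fromisoformat(sync_start_date)  # expects YYYY-MM-DD
    candidate = last_synced - timedelta(days=lookback_window_days)
    # Assumption: never look back past the configured initial sync date.
    return max(candidate, earliest)


# Illustrative dates only.
normal = incremental_window_start(date(2024, 3, 10), 7, "2024-01-01")
clamped = incremental_window_start(date(2024, 1, 3), 7, "2024-01-01")
```

In the first call the window starts seven days before the last sync; in the second it is clamped to the configured start date.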
- class databricks.sdk.service.pipelines.TikTokAdsOptionsTikTokDataLevel¶
Data level for TikTok Ads report aggregation.
- AUCTION_AD = "AUCTION_AD"¶
- AUCTION_ADGROUP = "AUCTION_ADGROUP"¶
- AUCTION_ADVERTISER = "AUCTION_ADVERTISER"¶
- AUCTION_CAMPAIGN = "AUCTION_CAMPAIGN"¶
- class databricks.sdk.service.pipelines.TikTokAdsOptionsTikTokReportType¶
Report type for TikTok Ads API.
- AUDIENCE = "AUDIENCE"¶
- BASIC = "BASIC"¶
- BUSINESS_CENTER = "BUSINESS_CENTER"¶
- DSA = "DSA"¶
- GMV_MAX = "GMV_MAX"¶
- PLAYABLE_AD = "PLAYABLE_AD"¶
- class databricks.sdk.service.pipelines.Transformer(format: TransformerFormat | None = None, json_options: JsonTransformerOptions | None = None)¶
Specifies how to transform binary data into structured data.
- format: TransformerFormat | None = None¶
Required: the wire format of the data.
- json_options: JsonTransformerOptions | None = None¶
- as_dict() dict¶
Serializes the Transformer into a dictionary suitable for use as a JSON request body.
- as_shallow_dict() dict¶
Serializes the Transformer into a shallow dictionary of its immediate attributes.
- classmethod from_dict(d: Dict[str, Any]) Transformer¶
Deserializes the Transformer from a dictionary.
- class databricks.sdk.service.pipelines.Truncation(truncated_fields: List[TruncationTruncationDetail] | None = None)¶
Information about truncations applied to this event.
- truncated_fields: List[TruncationTruncationDetail] | None = None¶
List of fields that were truncated from this event. If empty or absent, no truncation occurred.
- as_dict() dict¶
Serializes the Truncation into a dictionary suitable for use as a JSON request body.
- as_shallow_dict() dict¶
Serializes the Truncation into a shallow dictionary of its immediate attributes.
- classmethod from_dict(d: Dict[str, Any]) Truncation¶
Deserializes the Truncation from a dictionary.
- class databricks.sdk.service.pipelines.TruncationTruncationDetail(field_name: str | None = None)¶
Details about a specific field that was truncated.
- field_name: str | None = None¶
The name of the truncated field (e.g., “error”). Corresponds to field names in PipelineEvent.
- as_dict() dict¶
Serializes the TruncationTruncationDetail into a dictionary suitable for use as a JSON request body.
- as_shallow_dict() dict¶
Serializes the TruncationTruncationDetail into a shallow dictionary of its immediate attributes.
- classmethod from_dict(d: Dict[str, Any]) TruncationTruncationDetail¶
Deserializes the TruncationTruncationDetail from a dictionary.
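Because truncated_fields may be empty or absent, code that inspects truncation information should treat both cases as "no truncation". A minimal sketch follows, assuming the truncation data is available as a dictionary shaped like the output of Truncation.as_dict(); truncated_field_names is a hypothetical helper.

```python
from typing import Any, Dict, List, Optional


def truncated_field_names(truncation: Optional[Dict[str, Any]]) -> List[str]:
    """Return the names of truncated fields, or an empty list if none."""
    if not truncation:
        return []
    details = truncation.get("truncated_fields") or []
    # Skip entries without a field_name, since the attribute is optional.
    return [d["field_name"] for d in details if d.get("field_name")]


with_truncation = truncated_field_names({"truncated_fields": [{"field_name": "error"}]})
without_truncation = truncated_field_names({})
missing = truncated_field_names(None)
```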
- class databricks.sdk.service.pipelines.UpdateInfo(cause: 'Optional[UpdateInfoCause]' = None, cluster_id: 'Optional[str]' = None, config: 'Optional[PipelineSpec]' = None, creation_time: 'Optional[int]' = None, full_refresh: 'Optional[bool]' = None, full_refresh_selection: 'Optional[List[str]]' = None, parameters: 'Optional[Dict[str, str]]' = None, pipeline_id: 'Optional[str]' = None, refresh_selection: 'Optional[List[str]]' = None, state: 'Optional[UpdateInfoState]' = None, update_id: 'Optional[str]' = None, validate_only: 'Optional[bool]' = None)¶
- cause: UpdateInfoCause | None = None¶
What triggered this update.
- cluster_id: str | None = None¶
The ID of the cluster that the update is running on.
- config: PipelineSpec | None = None¶
The pipeline configuration with system defaults applied where unspecified by the user. Not returned by ListUpdates.
- creation_time: int | None = None¶
The time when this update was created.
- full_refresh: bool | None = None¶
If true, this update will reset all tables before running.
- full_refresh_selection: List[str] | None = None¶
A list of tables to update with a full refresh. If both refresh_selection and full_refresh_selection are empty, this is a full graph update. A full refresh on a table means that the table's state is reset before the refresh.
- parameters: Dict[str, str] | None = None¶
Key/value map of parameters used to initiate the update.
- pipeline_id: str | None = None¶
The ID of the pipeline.
- refresh_selection: List[str] | None = None¶
A list of tables to update without a full refresh. If both refresh_selection and full_refresh_selection are empty, this is a full graph update. A full refresh on a table means that the table's state is reset before the refresh.
- state: UpdateInfoState | None = None¶
The update state.
- update_id: str | None = None¶
The ID of this update.
- validate_only: bool | None = None¶
If true, this update only validates the correctness of pipeline source code but does not materialize or publish any datasets.
- as_dict() dict¶
Serializes the UpdateInfo into a dictionary suitable for use as a JSON request body.
- as_shallow_dict() dict¶
Serializes the UpdateInfo into a shallow dictionary of its immediate attributes.
- classmethod from_dict(d: Dict[str, Any]) UpdateInfo¶
Deserializes the UpdateInfo from a dictionary.
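The rule that an update with no refresh_selection and no full_refresh_selection is a full graph update can be expressed directly. The sketch below works on a plain dictionary shaped like UpdateInfo.as_dict() output; is_full_graph_update is a hypothetical helper and the table names are made up.

```python
from typing import Any, Dict


def is_full_graph_update(update: Dict[str, Any]) -> bool:
    """True when neither selection list names any table."""
    refresh = update.get("refresh_selection") or []
    full = update.get("full_refresh_selection") or []
    return not refresh and not full


whole_graph = is_full_graph_update({"update_id": "u-1"})
selective = is_full_graph_update(
    {"update_id": "u-2", "refresh_selection": ["orders"], "full_refresh_selection": []}
)
```

Treating a missing key and an empty list the same way matches the "empty" wording in the field descriptions.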
- class databricks.sdk.service.pipelines.UpdateInfoCause¶
What triggered this update.
- API_CALL = "API_CALL"¶
- INFRASTRUCTURE_MAINTENANCE = "INFRASTRUCTURE_MAINTENANCE"¶
- JOB_TASK = "JOB_TASK"¶
- RETRY_ON_FAILURE = "RETRY_ON_FAILURE"¶
- SCHEMA_CHANGE = "SCHEMA_CHANGE"¶
- SERVICE_UPGRADE = "SERVICE_UPGRADE"¶
- USER_ACTION = "USER_ACTION"¶
- class databricks.sdk.service.pipelines.UpdateInfoState¶
The update state.
- CANCELED = "CANCELED"¶
- COMPLETED = "COMPLETED"¶
- CREATED = "CREATED"¶
- FAILED = "FAILED"¶
- INITIALIZING = "INITIALIZING"¶
- QUEUED = "QUEUED"¶
- RESETTING = "RESETTING"¶
- RUNNING = "RUNNING"¶
- SETTING_UP_TABLES = "SETTING_UP_TABLES"¶
- STOPPING = "STOPPING"¶
- WAITING_FOR_RESOURCES = "WAITING_FOR_RESOURCES"¶
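When polling an update, it is often useful to know whether a state is final. The sketch below assumes that CANCELED, COMPLETED, and FAILED are the terminal states; this reference does not say so explicitly, so treat the set as an assumption. The helpers and the sequence of observed states are hypothetical, and plain strings stand in for the enum values.

```python
from typing import Iterable, Optional

# Assumption: these three states are final; all others are still in progress.
TERMINAL_STATES = frozenset({"CANCELED", "COMPLETED", "FAILED"})


def is_terminal(state: str) -> bool:
    """True if the state is in the assumed terminal set."""
    return state in TERMINAL_STATES


def first_terminal(states: Iterable[str]) -> Optional[str]:
    """Return the first terminal state seen, or None if none was reached."""
    for state in states:
        if is_terminal(state):
            return state
    return None


# An illustrative sequence of polled states, not a documented lifecycle.
observed = ["QUEUED", "INITIALIZING", "SETTING_UP_TABLES", "RUNNING", "COMPLETED"]
final_state = first_terminal(observed)
still_running = first_terminal(["CREATED", "WAITING_FOR_RESOURCES"])
```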
- class databricks.sdk.service.pipelines.UpdateStateInfo(creation_time: 'Optional[str]' = None, state: 'Optional[UpdateStateInfoState]' = None, update_id: 'Optional[str]' = None)¶
- creation_time: str | None = None¶
- state: UpdateStateInfoState | None = None¶
- update_id: str | None = None¶
- as_dict() dict¶
Serializes the UpdateStateInfo into a dictionary suitable for use as a JSON request body.
- as_shallow_dict() dict¶
Serializes the UpdateStateInfo into a shallow dictionary of its immediate attributes.
- classmethod from_dict(d: Dict[str, Any]) UpdateStateInfo¶
Deserializes the UpdateStateInfo from a dictionary.
- class databricks.sdk.service.pipelines.UpdateStateInfoState¶
The update state.
- CANCELED = "CANCELED"¶
- COMPLETED = "COMPLETED"¶
- CREATED = "CREATED"¶
- FAILED = "FAILED"¶
- INITIALIZING = "INITIALIZING"¶
- QUEUED = "QUEUED"¶
- RESETTING = "RESETTING"¶
- RUNNING = "RUNNING"¶
- SETTING_UP_TABLES = "SETTING_UP_TABLES"¶
- STOPPING = "STOPPING"¶
- WAITING_FOR_RESOURCES = "WAITING_FOR_RESOURCES"¶
- class databricks.sdk.service.pipelines.ZendeskSupportOptions(start_date: str | None = None)¶
Zendesk Support specific options for ingestion
- start_date: str | None = None¶
(Optional) Start date in YYYY-MM-DD format for the initial sync. This determines the earliest date from which to sync historical data.
- as_dict() dict¶
Serializes the ZendeskSupportOptions into a dictionary suitable for use as a JSON request body.
- as_shallow_dict() dict¶
Serializes the ZendeskSupportOptions into a shallow dictionary of its immediate attributes.
- classmethod from_dict(d: Dict[str, Any]) ZendeskSupportOptions¶
Deserializes the ZendeskSupportOptions from a dictionary.
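Since start_date is passed as a string, validating the YYYY-MM-DD format on the client side can catch mistakes early. A minimal sketch using only the standard library follows; parse_start_date is a hypothetical helper, and the service may apply its own validation.

```python
import re
from datetime import date

# Exactly four digits, two digits, two digits, separated by hyphens.
_DATE_RE = re.compile(r"[0-9]{4}-[0-9]{2}-[0-9]{2}")


def parse_start_date(value: str) -> date:
    """Parse a strict YYYY-MM-DD string, raising ValueError otherwise."""
    if not _DATE_RE.fullmatch(value):
        raise ValueError(f"start_date must be in YYYY-MM-DD format: {value!r}")
    # fromisoformat also rejects impossible dates such as 2024-02-30.
    return date.fromisoformat(value)


parsed = parse_start_date("2024-01-15")
```

The regular expression is checked first because date.fromisoformat accepts additional ISO 8601 forms on newer Python versions.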