Vector Search¶

These dataclasses are used in the SDK to represent API requests and responses for services in the databricks.sdk.service.vectorsearch module.

class databricks.sdk.service.vectorsearch.ColumnInfo¶

name: str | None = None¶: Name of the column.

as_dict() → dict¶: Serializes the ColumnInfo into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) → ColumnInfo¶: Deserializes the ColumnInfo from a dictionary.
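Every dataclass on this page exposes the same as_dict() / from_dict() pair. The sketch below mirrors that pattern with a stand-in class so that it runs without the SDK installed. ColumnInfoSketch is a hypothetical name, and omitting unset fields from the dictionary is an assumption rather than something stated above.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class ColumnInfoSketch:
    """Hypothetical stand-in that mirrors the documented ColumnInfo field."""

    name: Optional[str] = None

    def as_dict(self) -> Dict[str, Any]:
        # Assumption: fields left as None are omitted from the JSON body.
        body: Dict[str, Any] = {}
        if self.name is not None:
            body["name"] = self.name
        return body

    @classmethod
    def from_dict(cls, d: Dict[str, Any]) -> "ColumnInfoSketch":
        # Missing keys fall back to None, matching the field default.
        return cls(name=d.get("name"))


column = ColumnInfoSketch(name="id")
round_tripped = ColumnInfoSketch.from_dict(column.as_dict())
```

With the real classes, the same round trip would presumably be written as ColumnInfo.from_dict(column.as_dict()).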

class databricks.sdk.service.vectorsearch.CreateEndpoint¶

name: str¶: Name of endpoint

endpoint_type: EndpointType¶: Type of endpoint.

as_dict() → dict¶: Serializes the CreateEndpoint into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) → CreateEndpoint¶: Deserializes the CreateEndpoint from a dictionary.

class databricks.sdk.service.vectorsearch.CreateVectorIndexRequest¶

name: str¶: Name of the index

endpoint_name: str¶: Name of the endpoint to be used for serving the index

primary_key: str¶: Primary key of the index

index_type: VectorIndexType¶

There are 2 types of Vector Search indexes:

- DELTA_SYNC: An index that automatically syncs with a source Delta Table, automatically and incrementally updating the index as the underlying data in the Delta Table changes.
- DIRECT_ACCESS: An index that supports direct read and write of vectors and metadata through our REST and SDK APIs. With this model, the user manages index updates.

delta_sync_index_spec: DeltaSyncVectorIndexSpecRequest | None = None¶: Specification for Delta Sync Index. Required if index_type is DELTA_SYNC.

direct_access_index_spec: DirectAccessVectorIndexSpec | None = None¶: Specification for Direct Vector Access Index. Required if index_type is DIRECT_ACCESS.

as_dict() → dict¶: Serializes the CreateVectorIndexRequest into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) → CreateVectorIndexRequest¶: Deserializes the CreateVectorIndexRequest from a dictionary.
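The conditional requirements on the two spec fields can be checked before a request is sent. The following is a minimal sketch that works on plain dictionaries shaped like the documented fields. build_create_index_body is a hypothetical helper, and every index, endpoint, and table name in it is a made-up placeholder.

```python
from typing import Any, Dict, Optional


def build_create_index_body(
    name: str,
    endpoint_name: str,
    primary_key: str,
    index_type: str,
    delta_sync_index_spec: Optional[Dict[str, Any]] = None,
    direct_access_index_spec: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Hypothetical helper that assembles a body from the documented fields."""
    # Enforce the "required if index_type is ..." rules stated above.
    if index_type == "DELTA_SYNC" and delta_sync_index_spec is None:
        raise ValueError("delta_sync_index_spec is required for DELTA_SYNC")
    if index_type == "DIRECT_ACCESS" and direct_access_index_spec is None:
        raise ValueError("direct_access_index_spec is required for DIRECT_ACCESS")

    body: Dict[str, Any] = {
        "name": name,
        "endpoint_name": endpoint_name,
        "primary_key": primary_key,
        "index_type": index_type,
    }
    # Only include the spec that was actually supplied.
    if delta_sync_index_spec is not None:
        body["delta_sync_index_spec"] = delta_sync_index_spec
    if direct_access_index_spec is not None:
        body["direct_access_index_spec"] = direct_access_index_spec
    return body


# All names below are made-up placeholders.
create_body = build_create_index_body(
    name="main.default.docs_index",
    endpoint_name="demo-endpoint",
    primary_key="id",
    index_type="DELTA_SYNC",
    delta_sync_index_spec={
        "source_table": "main.default.docs",
        "pipeline_type": "TRIGGERED",
        "embedding_source_columns": [
            {"name": "text", "embedding_model_endpoint_name": "demo-embedding-endpoint"}
        ],
    },
)
```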

class databricks.sdk.service.vectorsearch.CreateVectorIndexResponse¶

vector_index: VectorIndex | None = None¶

as_dict() → dict¶: Serializes the CreateVectorIndexResponse into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) → CreateVectorIndexResponse¶: Deserializes the CreateVectorIndexResponse from a dictionary.

class databricks.sdk.service.vectorsearch.DeleteDataResult¶

Result of the upsert or delete operation.

failed_primary_keys: List[str] | None = None¶: List of primary keys for rows that failed to process.

success_row_count: int | None = None¶: Count of successfully processed rows.

as_dict() → dict¶: Serializes the DeleteDataResult into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) → DeleteDataResult¶: Deserializes the DeleteDataResult from a dictionary.

class databricks.sdk.service.vectorsearch.DeleteDataStatus¶

Status of the delete operation.

FAILURE = "FAILURE"¶

PARTIAL_SUCCESS = "PARTIAL_SUCCESS"¶

SUCCESS = "SUCCESS"¶

class databricks.sdk.service.vectorsearch.DeleteDataVectorIndexRequest¶

Request payload for deleting data from a vector index.

primary_keys: List[str]¶: List of primary keys for the data to be deleted.

index_name: str | None = None¶: Name of the vector index where data is to be deleted. Must be a Direct Vector Access Index.

as_dict() → dict¶: Serializes the DeleteDataVectorIndexRequest into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) → DeleteDataVectorIndexRequest¶: Deserializes the DeleteDataVectorIndexRequest from a dictionary.

class databricks.sdk.service.vectorsearch.DeleteDataVectorIndexResponse¶

Response to a delete data vector index request.

result: DeleteDataResult | None = None¶: Result of the upsert or delete operation.

status: DeleteDataStatus | None = None¶: Status of the delete operation.

as_dict() → dict¶: Serializes the DeleteDataVectorIndexResponse into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) → DeleteDataVectorIndexResponse¶: Deserializes the DeleteDataVectorIndexResponse from a dictionary.

class databricks.sdk.service.vectorsearch.DeleteEndpointResponse¶

as_dict() → dict¶: Serializes the DeleteEndpointResponse into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) → DeleteEndpointResponse¶: Deserializes the DeleteEndpointResponse from a dictionary.

class databricks.sdk.service.vectorsearch.DeleteIndexResponse¶

as_dict() → dict¶: Serializes the DeleteIndexResponse into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) → DeleteIndexResponse¶: Deserializes the DeleteIndexResponse from a dictionary.

class databricks.sdk.service.vectorsearch.DeltaSyncVectorIndexSpecRequest¶

embedding_source_columns: List[EmbeddingSourceColumn] | None = None¶: The columns that contain the embedding source.

embedding_vector_columns: List[EmbeddingVectorColumn] | None = None¶: The columns that contain the embedding vectors.

pipeline_type: PipelineType | None = None¶

Pipeline execution mode.

- TRIGGERED: If the pipeline uses the triggered execution mode, the system stops processing after successfully refreshing the source table in the pipeline once, ensuring the table is updated based on the data available when the update started.
- CONTINUOUS: If the pipeline uses continuous execution, the pipeline processes new data as it arrives in the source table to keep the vector index fresh.

source_table: str | None = None¶: The name of the source table.

as_dict() → dict¶: Serializes the DeltaSyncVectorIndexSpecRequest into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) → DeltaSyncVectorIndexSpecRequest¶: Deserializes the DeltaSyncVectorIndexSpecRequest from a dictionary.

class databricks.sdk.service.vectorsearch.DeltaSyncVectorIndexSpecResponse¶

embedding_source_columns: List[EmbeddingSourceColumn] | None = None¶: The columns that contain the embedding source.

embedding_vector_columns: List[EmbeddingVectorColumn] | None = None¶: The columns that contain the embedding vectors.

pipeline_id: str | None = None¶: The ID of the pipeline that is used to sync the index.

pipeline_type: PipelineType | None = None¶

Pipeline execution mode.

- TRIGGERED: If the pipeline uses the triggered execution mode, the system stops processing after successfully refreshing the source table in the pipeline once, ensuring the table is updated based on the data available when the update started.
- CONTINUOUS: If the pipeline uses continuous execution, the pipeline processes new data as it arrives in the source table to keep the vector index fresh.

source_table: str | None = None¶: The name of the source table.

as_dict() → dict¶: Serializes the DeltaSyncVectorIndexSpecResponse into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) → DeltaSyncVectorIndexSpecResponse¶: Deserializes the DeltaSyncVectorIndexSpecResponse from a dictionary.

class databricks.sdk.service.vectorsearch.DirectAccessVectorIndexSpec¶

embedding_source_columns: List[EmbeddingSourceColumn] | None = None¶: Contains the optional model endpoint to use during query time.

embedding_vector_columns: List[EmbeddingVectorColumn] | None = None¶

schema_json: str | None = None¶

The schema of the index in JSON format.

Supported types are integer, long, float, double, boolean, string, date, timestamp.

Supported types for vector column: array<float>, array<double>.

as_dict() → dict¶: Serializes the DirectAccessVectorIndexSpec into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) → DirectAccessVectorIndexSpec¶: Deserializes the DirectAccessVectorIndexSpec from a dictionary.
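Because schema_json is passed as a string, it is usually easier to build it from a Python mapping and validate the type names first. This sketch assumes the schema is a flat JSON object mapping each column name to a type name, which the text above does not spell out. build_schema_json and the column names are hypothetical.

```python
import json
from typing import Dict

# Type names taken from the supported-type lists documented above.
SCALAR_TYPES = {"integer", "long", "float", "double", "boolean", "string", "date", "timestamp"}
VECTOR_TYPES = {"array<float>", "array<double>"}


def build_schema_json(columns: Dict[str, str]) -> str:
    """Hypothetical helper: validate column types, then serialize the schema."""
    allowed = SCALAR_TYPES | VECTOR_TYPES
    for column, type_name in columns.items():
        if type_name not in allowed:
            raise ValueError(f"unsupported type for {column!r}: {type_name}")
    # Assumption: the schema is a flat object of column name -> type name.
    return json.dumps(columns)


schema_json = build_schema_json(
    {"id": "integer", "text": "string", "text_vector": "array<float>"}
)
```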

class databricks.sdk.service.vectorsearch.EmbeddingSourceColumn¶

embedding_model_endpoint_name: str | None = None¶: Name of the embedding model endpoint

name: str | None = None¶: Name of the column

as_dict() → dict¶: Serializes the EmbeddingSourceColumn into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) → EmbeddingSourceColumn¶: Deserializes the EmbeddingSourceColumn from a dictionary.

class databricks.sdk.service.vectorsearch.EmbeddingVectorColumn¶

embedding_dimension: int | None = None¶: Dimension of the embedding vector

name: str | None = None¶: Name of the column

as_dict() → dict¶: Serializes the EmbeddingVectorColumn into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) → EmbeddingVectorColumn¶: Deserializes the EmbeddingVectorColumn from a dictionary.

class databricks.sdk.service.vectorsearch.EndpointInfo¶

creation_timestamp: int | None = None¶: Timestamp of endpoint creation

creator: str | None = None¶: Creator of the endpoint

endpoint_status: EndpointStatus | None = None¶: Current status of the endpoint

endpoint_type: EndpointType | None = None¶: Type of endpoint.

id: str | None = None¶: Unique identifier of the endpoint

last_updated_timestamp: int | None = None¶: Timestamp of last update to the endpoint

last_updated_user: str | None = None¶: User who last updated the endpoint

name: str | None = None¶: Name of endpoint

num_indexes: int | None = None¶: Number of indexes on the endpoint

as_dict() → dict¶: Serializes the EndpointInfo into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) → EndpointInfo¶: Deserializes the EndpointInfo from a dictionary.

class databricks.sdk.service.vectorsearch.EndpointStatus¶

Status information of an endpoint

message: str | None = None¶: Additional status message

state: EndpointStatusState | None = None¶: Current state of the endpoint

as_dict() → dict¶: Serializes the EndpointStatus into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) → EndpointStatus¶: Deserializes the EndpointStatus from a dictionary.

class databricks.sdk.service.vectorsearch.EndpointStatusState¶

Current state of the endpoint

OFFLINE = "OFFLINE"¶

ONLINE = "ONLINE"¶

PROVISIONING = "PROVISIONING"¶

class databricks.sdk.service.vectorsearch.EndpointType¶

Type of endpoint.

STANDARD = "STANDARD"¶

class databricks.sdk.service.vectorsearch.ListEndpointResponse¶

endpoints: List[EndpointInfo] | None = None¶: An array of Endpoint objects

next_page_token: str | None = None¶: A token that can be used to get the next page of results. If not present, there are no more results to show.

as_dict() → dict¶: Serializes the ListEndpointResponse into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) → ListEndpointResponse¶: Deserializes the ListEndpointResponse from a dictionary.

class databricks.sdk.service.vectorsearch.ListVectorIndexesResponse¶

next_page_token: str | None = None¶: A token that can be used to get the next page of results. If not present, there are no more results to show.

vector_indexes: List[MiniVectorIndex] | None = None¶

as_dict() → dict¶: Serializes the ListVectorIndexesResponse into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) → ListVectorIndexesResponse¶: Deserializes the ListVectorIndexesResponse from a dictionary.
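The next_page_token field implies a simple loop: request a page, consume its items, and stop once no token comes back. The sketch below shows that loop over dictionaries shaped like ListVectorIndexesResponse. fetch_page is a hypothetical callable standing in for whatever issues the list request, and the two fake pages are invented test data.

```python
from typing import Any, Callable, Dict, Iterator, Optional


def iter_vector_indexes(
    fetch_page: Callable[[Optional[str]], Dict[str, Any]],
) -> Iterator[Dict[str, Any]]:
    """Yield every index across pages until next_page_token is absent."""
    page_token: Optional[str] = None
    while True:
        page = fetch_page(page_token)
        yield from page.get("vector_indexes") or []
        page_token = page.get("next_page_token")
        if not page_token:
            # No token means there are no more results to show.
            return


# Invented pages keyed by the token that requests them.
_FAKE_PAGES: Dict[Optional[str], Dict[str, Any]] = {
    None: {"vector_indexes": [{"name": "idx_a"}], "next_page_token": "t1"},
    "t1": {"vector_indexes": [{"name": "idx_b"}]},
}


def fake_fetch_page(token: Optional[str]) -> Dict[str, Any]:
    return _FAKE_PAGES[token]


index_names = [index["name"] for index in iter_vector_indexes(fake_fetch_page)]
```

The same loop applies to ListEndpointResponse by reading the endpoints key instead of vector_indexes.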

class databricks.sdk.service.vectorsearch.MiniVectorIndex¶

creator: str | None = None¶: The user who created the index.

endpoint_name: str | None = None¶: Name of the endpoint associated with the index

index_type: VectorIndexType | None = None¶

There are 2 types of Vector Search indexes:

- DELTA_SYNC: An index that automatically syncs with a source Delta Table, automatically and incrementally updating the index as the underlying data in the Delta Table changes.
- DIRECT_ACCESS: An index that supports direct read and write of vectors and metadata through our REST and SDK APIs. With this model, the user manages index updates.

name: str | None = None¶: Name of the index

primary_key: str | None = None¶: Primary key of the index

as_dict() → dict¶: Serializes the MiniVectorIndex into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) → MiniVectorIndex¶: Deserializes the MiniVectorIndex from a dictionary.

class databricks.sdk.service.vectorsearch.PipelineType¶

Pipeline execution mode.

- TRIGGERED: If the pipeline uses the triggered execution mode, the system stops processing after successfully refreshing the source table in the pipeline once, ensuring the table is updated based on the data available when the update started.
- CONTINUOUS: If the pipeline uses continuous execution, the pipeline processes new data as it arrives in the source table to keep the vector index fresh.

CONTINUOUS = "CONTINUOUS"¶

TRIGGERED = "TRIGGERED"¶

class databricks.sdk.service.vectorsearch.QueryVectorIndexRequest¶

columns: List[str]¶: List of column names to include in the response.

filters_json: str | None = None¶

JSON string representing query filters.

Example filters:

- {"id <": 5}: Filter for id less than 5.
- {"id >": 5}: Filter for id greater than 5.
- {"id <=": 5}: Filter for id less than or equal to 5.
- {"id >=": 5}: Filter for id greater than or equal to 5.
- {"id": 5}: Filter for id equal to 5.
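Since filters_json is a string, json.dumps is a convenient way to produce it with correctly quoted keys. The sketch below covers only the five filter forms listed above. comparison_filter is a hypothetical helper, not part of the SDK.

```python
import json
from typing import Any, Optional

# Comparison operators that appear in the documented examples.
_OPERATORS = {"<", ">", "<=", ">="}


def comparison_filter(column: str, value: Any, operator: Optional[str] = None) -> str:
    """Build a filters_json string such as '{"id <": 5}'."""
    if operator is None:
        # No operator means an equality filter on the bare column name.
        key = column
    elif operator in _OPERATORS:
        key = f"{column} {operator}"
    else:
        raise ValueError(f"unsupported operator: {operator!r}")
    return json.dumps({key: value})


less_than_five = comparison_filter("id", 5, "<")
equal_to_five = comparison_filter("id", 5)
```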

index_name: str | None = None¶: Name of the vector index to query.

num_results: int | None = None¶: Number of results to return. Defaults to 10.

query_text: str | None = None¶: Query text. Required for Delta Sync Index using model endpoint.

query_vector: List[float] | None = None¶: Query vector. Required for Direct Vector Access Index and Delta Sync Index using self-managed vectors.

score_threshold: float | None = None¶: Threshold for the approximate nearest neighbor search. Defaults to 0.0.

as_dict() → dict¶: Serializes the QueryVectorIndexRequest into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) → QueryVectorIndexRequest¶: Deserializes the QueryVectorIndexRequest from a dictionary.
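Which of query_text and query_vector to send depends on the index, as described above. The following sketch assembles a query body from the documented field names and rejects a request that supplies neither. build_query_body is a hypothetical helper. Requiring at least one of the two fields is this sketch's own check, and leaving unset fields out so that server-side defaults apply is an assumption.

```python
from typing import Any, Dict, List, Optional


def build_query_body(
    columns: List[str],
    query_text: Optional[str] = None,
    query_vector: Optional[List[float]] = None,
    filters_json: Optional[str] = None,
    num_results: Optional[int] = None,
    score_threshold: Optional[float] = None,
) -> Dict[str, Any]:
    """Hypothetical helper that assembles a body from the documented fields."""
    if query_text is None and query_vector is None:
        raise ValueError("provide query_text or query_vector")

    optional_fields = {
        "query_text": query_text,
        "query_vector": query_vector,
        "filters_json": filters_json,
        "num_results": num_results,
        "score_threshold": score_threshold,
    }
    body: Dict[str, Any] = {"columns": columns}
    # Assumption: unset fields are left out so the documented defaults apply.
    body.update({key: value for key, value in optional_fields.items() if value is not None})
    return body


text_query = build_query_body(["id", "text"], query_text="hello", num_results=3)
vector_query = build_query_body(["id"], query_vector=[0.1, 0.2, 0.3])
```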

class databricks.sdk.service.vectorsearch.QueryVectorIndexResponse¶

manifest: ResultManifest | None = None¶: Metadata about the result set.

result: ResultData | None = None¶: Data returned in the query result.

as_dict() → dict¶: Serializes the QueryVectorIndexResponse into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) → QueryVectorIndexResponse¶: Deserializes the QueryVectorIndexResponse from a dictionary.

class databricks.sdk.service.vectorsearch.ResultData¶

Data returned in the query result.

data_array: List[List[str]] | None = None¶: Data rows returned in the query.

row_count: int | None = None¶: Number of rows in the result set.

as_dict() → dict¶: Serializes the ResultData into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) → ResultData¶: Deserializes the ResultData from a dictionary.

class databricks.sdk.service.vectorsearch.ResultManifest¶

Metadata about the result set.

column_count: int | None = None¶: Number of columns in the result set.

columns: List[ColumnInfo] | None = None¶: Information about each column in the result set.

as_dict() → dict¶: Serializes the ResultManifest into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) → ResultManifest¶: Deserializes the ResultManifest from a dictionary.
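ResultManifest carries the column names and ResultData carries the rows, so the two are typically combined into one record per row. This sketch does that on a dictionary shaped like QueryVectorIndexResponse. It assumes the values in each row appear in the same order as the manifest columns. rows_as_dicts is a hypothetical helper and the sample response is invented.

```python
from typing import Any, Dict, List


def rows_as_dicts(response: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Pair manifest column names with the values of each data_array row."""
    manifest = response.get("manifest") or {}
    result = response.get("result") or {}
    names = [column.get("name") for column in manifest.get("columns") or []]
    # Assumption: row values follow the manifest's column order.
    return [dict(zip(names, row)) for row in result.get("data_array") or []]


# Invented response for illustration only.
sample_response = {
    "manifest": {"column_count": 2, "columns": [{"name": "id"}, {"name": "text"}]},
    "result": {"row_count": 2, "data_array": [["1", "first"], ["2", "second"]]},
}
records = rows_as_dicts(sample_response)
```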

class databricks.sdk.service.vectorsearch.SyncIndexResponse¶

as_dict() → dict¶: Serializes the SyncIndexResponse into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) → SyncIndexResponse¶: Deserializes the SyncIndexResponse from a dictionary.

class databricks.sdk.service.vectorsearch.UpsertDataResult¶

Result of the upsert or delete operation.

failed_primary_keys: List[str] | None = None¶: List of primary keys for rows that failed to process.

success_row_count: int | None = None¶: Count of successfully processed rows.

as_dict() → dict¶: Serializes the UpsertDataResult into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) → UpsertDataResult¶: Deserializes the UpsertDataResult from a dictionary.

class databricks.sdk.service.vectorsearch.UpsertDataStatus¶

Status of the upsert operation.

FAILURE = "FAILURE"¶

PARTIAL_SUCCESS = "PARTIAL_SUCCESS"¶

SUCCESS = "SUCCESS"¶

class databricks.sdk.service.vectorsearch.UpsertDataVectorIndexRequest¶

Request payload for upserting data into a vector index.

inputs_json: str¶: JSON string representing the data to be upserted.

index_name: str | None = None¶: Name of the vector index where data is to be upserted. Must be a Direct Vector Access Index.

as_dict() → dict¶: Serializes the UpsertDataVectorIndexRequest into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) → UpsertDataVectorIndexRequest¶: Deserializes the UpsertDataVectorIndexRequest from a dictionary.
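As with the other JSON string fields, inputs_json can be produced with json.dumps. The sketch below assumes the payload is a JSON array of row objects and checks that each row carries the primary key before serializing. build_inputs_json, the column names, and the sample rows are all hypothetical.

```python
import json
from typing import Any, Dict, List


def build_inputs_json(rows: List[Dict[str, Any]], primary_key: str) -> str:
    """Hypothetical helper: check each row has the primary key, then serialize."""
    for position, row in enumerate(rows):
        if primary_key not in row:
            raise ValueError(f"row {position} is missing primary key {primary_key!r}")
    # Assumption: inputs_json holds a JSON array of row objects.
    return json.dumps(rows)


inputs_json = build_inputs_json(
    [
        {"id": "1", "text": "first", "text_vector": [0.1, 0.2, 0.3]},
        {"id": "2", "text": "second", "text_vector": [0.4, 0.5, 0.6]},
    ],
    primary_key="id",
)
```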

class databricks.sdk.service.vectorsearch.UpsertDataVectorIndexResponse¶

Response to an upsert data vector index request.

result: UpsertDataResult | None = None¶: Result of the upsert or delete operation.

status: UpsertDataStatus | None = None¶: Status of the upsert operation.

as_dict() → dict¶: Serializes the UpsertDataVectorIndexResponse into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) → UpsertDataVectorIndexResponse¶: Deserializes the UpsertDataVectorIndexResponse from a dictionary.
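The status and result fields together indicate which rows may need another attempt. Here is a minimal sketch that reads a dictionary shaped like UpsertDataVectorIndexResponse and returns the primary keys worth retrying. UpsertStatusSketch is a local stand-in for UpsertDataStatus, and keys_to_retry is a hypothetical helper.

```python
from enum import Enum
from typing import Any, Dict, List


class UpsertStatusSketch(Enum):
    """Local stand-in mirroring the documented UpsertDataStatus values."""

    FAILURE = "FAILURE"
    PARTIAL_SUCCESS = "PARTIAL_SUCCESS"
    SUCCESS = "SUCCESS"


def keys_to_retry(response: Dict[str, Any]) -> List[str]:
    """Return the failed primary keys unless the operation fully succeeded."""
    status = UpsertStatusSketch(response["status"])
    if status is UpsertStatusSketch.SUCCESS:
        return []
    result = response.get("result") or {}
    return list(result.get("failed_primary_keys") or [])


# Invented response for illustration only.
partial = {
    "status": "PARTIAL_SUCCESS",
    "result": {"failed_primary_keys": ["7"], "success_row_count": 9},
}
retry_keys = keys_to_retry(partial)
```

The same shape applies to DeleteDataVectorIndexResponse, whose status enum lists the same three values.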

class databricks.sdk.service.vectorsearch.VectorIndex¶

creator: str | None = None¶: The user who created the index.

delta_sync_index_spec: DeltaSyncVectorIndexSpecResponse | None = None¶

direct_access_index_spec: DirectAccessVectorIndexSpec | None = None¶

endpoint_name: str | None = None¶: Name of the endpoint associated with the index

index_type: VectorIndexType | None = None¶

There are 2 types of Vector Search indexes:

- DELTA_SYNC: An index that automatically syncs with a source Delta Table, automatically and incrementally updating the index as the underlying data in the Delta Table changes.
- DIRECT_ACCESS: An index that supports direct read and write of vectors and metadata through our REST and SDK APIs. With this model, the user manages index updates.

name: str | None = None¶: Name of the index

primary_key: str | None = None¶: Primary key of the index

status: VectorIndexStatus | None = None¶

as_dict() → dict¶: Serializes the VectorIndex into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) → VectorIndex¶: Deserializes the VectorIndex from a dictionary.

class databricks.sdk.service.vectorsearch.VectorIndexStatus¶

index_url: str | None = None¶: Index API URL to be used to perform operations on the index

indexed_row_count: int | None = None¶: Number of rows indexed

message: str | None = None¶: Message associated with the index status

ready: bool | None = None¶: Whether the index is ready for search

as_dict() → dict¶: Serializes the VectorIndexStatus into a dictionary suitable for use as a JSON request body.

classmethod from_dict(d: Dict[str, any]) → VectorIndexStatus¶: Deserializes the VectorIndexStatus from a dictionary.

class databricks.sdk.service.vectorsearch.VectorIndexType¶

There are 2 types of Vector Search indexes:

- DELTA_SYNC: An index that automatically syncs with a source Delta Table, automatically and incrementally updating the index as the underlying data in the Delta Table changes.
- DIRECT_ACCESS: An index that supports direct read and write of vectors and metadata through our REST and SDK APIs. With this model, the user manages index updates.

DELTA_SYNC = "DELTA_SYNC"¶

DIRECT_ACCESS = "DIRECT_ACCESS"¶
