# Long-running operations When you invoke a long-running operation, the SDK provides a high-level API to _trigger_ these operations and _wait_ for the related entities to reach the correct state or return the error message in case of failure. All long-running operations return generic `Wait` instance with `result()` method to get a result of long-running operation, once it's finished. Databricks SDK for Python picks the most reasonable default timeouts for every method, but sometimes you may find yourself in a situation, where you'd want to provide `datetime.timedelta()` as the value of `timeout` argument to `result()` method. There are a number of long-running operations in Databricks APIs such as managing: * Clusters, * Command execution * Jobs * Libraries * Delta Live Tables pipelines * Databricks SQL warehouses. For example, in the Clusters API, once you create a cluster, you receive a cluster ID, and the cluster is in the `PENDING` state Meanwhile Databricks takes care of provisioning virtual machines from the cloud provider in the background. The cluster is only usable in the `RUNNING` state and so you have to wait for that state to be reached. Another example is the API for running a job or repairing the run: right after the run starts, the run is in the `PENDING` state. The job is only considered to be finished when it is in either the `TERMINATED` or `SKIPPED` state. Also you would likely need the error message if the long-running operation times out and fails with an error code. Other times you may want to configure a custom timeout other than the default of 20 minutes. In the following example, `w.clusters.create` returns `ClusterInfo` only once the cluster is in the `RUNNING` state, otherwise it will timeout in 10 minutes: ```python import datetime import logging from databricks.sdk import WorkspaceClient w = WorkspaceClient() info = w.clusters.create(cluster_name='Created cluster', spark_version='12.0.x-scala2.12', node_type_id='m5d.large', autotermination_minutes=10, num_workers=1).result(timeout=datetime.timedelta(minutes=10)) logging.info(f'Created: {info}') ``` Please look at the `examples/starting_job_and_waiting.py` for a more advanced usage: ```python import datetime import logging import time from databricks.sdk import WorkspaceClient import databricks.sdk.service.jobs as j w = WorkspaceClient() # create a dummy file on DBFS that just sleeps for 10 seconds py_on_dbfs = f'/home/{w.current_user.me().user_name}/sample.py' with w.dbfs.open(py_on_dbfs, write=True, overwrite=True) as f: f.write(b'import time; time.sleep(10); print("Hello, World!")') # trigger one-time-run job and get waiter object waiter = w.jobs.submit(run_name=f'py-sdk-run-{time.time()}', tasks=[ j.RunSubmitTaskSettings( task_key='hello_world', new_cluster=j.BaseClusterInfo( spark_version=w.clusters.select_spark_version(long_term_support=True), node_type_id=w.clusters.select_node_type(local_disk=True), num_workers=1 ), spark_python_task=j.SparkPythonTask( python_file=f'dbfs:{py_on_dbfs}' ), ) ]) logging.info(f'starting to poll: {waiter.run_id}') # callback, that receives a polled entity between state updates def print_status(run: j.Run): statuses = [f'{t.task_key}: {t.state.life_cycle_state}' for t in run.tasks] logging.info(f'workflow intermediate status: {", ".join(statuses)}') # If you want to perform polling in a separate thread, process, or service, # you can use w.jobs.wait_get_run_job_terminated_or_skipped( # run_id=waiter.run_id, # timeout=datetime.timedelta(minutes=15), # callback=print_status) to achieve the same results. # # Waiter interface allows for `w.jobs.submit(..).result()` simplicity in # the scenarios, where you need to block the calling thread for the job to finish. run = waiter.result(timeout=datetime.timedelta(minutes=15), callback=print_status) logging.info(f'job finished: {run.run_page_url}') ```