Delta Live Tables API guide
The Delta Live Tables API allows you to create, edit, delete, start, and view details about pipelines.
Important
To access Databricks REST APIs, you must authenticate.
Create a pipeline
Endpoint | HTTP Method
---|---
`2.0/pipelines` | `POST`
Creates a new Delta Live Tables pipeline.
Example
This example creates a new triggered pipeline.
Request
```bash
curl --netrc --request POST \
https://<databricks-instance>/api/2.0/pipelines \
--data @pipeline-settings.json
```

`pipeline-settings.json`:

```json
{
  "name": "Wikipedia pipeline (SQL)",
  "storage": "/Users/username/data",
  "clusters": [
    {
      "label": "default",
      "autoscale": {
        "min_workers": 1,
        "max_workers": 5
      }
    }
  ],
  "libraries": [
    {
      "notebook": {
        "path": "/Users/username/DLT Notebooks/Delta Live Tables quickstart (SQL)"
      }
    }
  ],
  "continuous": false
}
```
Replace `<databricks-instance>` with the Databricks workspace instance name, for example `1234567890123456.7.gcp.databricks.com`.
This example uses a .netrc file.
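The `.netrc` file supplies the credentials that curl's `--netrc` (or `-n`) option reads. A minimal sketch, assuming authentication with a Databricks personal access token (the token value is a placeholder):

```
machine <databricks-instance>
login token
password <personal-access-token>
```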
Request structure
See PipelineSettings.
Edit a pipeline
Endpoint | HTTP Method
---|---
`2.0/pipelines/{pipeline_id}` | `PUT`
Updates the settings for an existing pipeline.
Example
This example adds a `target` parameter to the pipeline with ID `a12cd3e4-0ab1-1abc-1a2b-1a2bcd3e4fg5`:
Request
```bash
curl --netrc --request PUT \
https://<databricks-instance>/api/2.0/pipelines/a12cd3e4-0ab1-1abc-1a2b-1a2bcd3e4fg5 \
--data @pipeline-settings.json
```

`pipeline-settings.json`:

```json
{
  "id": "a12cd3e4-0ab1-1abc-1a2b-1a2bcd3e4fg5",
  "name": "Wikipedia pipeline (SQL)",
  "storage": "/Users/username/data",
  "clusters": [
    {
      "label": "default",
      "autoscale": {
        "min_workers": 1,
        "max_workers": 5
      }
    }
  ],
  "libraries": [
    {
      "notebook": {
        "path": "/Users/username/DLT Notebooks/Delta Live Tables quickstart (SQL)"
      }
    }
  ],
  "target": "wikipedia_quickstart_data",
  "continuous": false
}
```
Replace `<databricks-instance>` with the Databricks workspace instance name, for example `1234567890123456.7.gcp.databricks.com`.
This example uses a .netrc file.
Request structure
See PipelineSettings.
Delete a pipeline
Endpoint | HTTP Method
---|---
`2.0/pipelines/{pipeline_id}` | `DELETE`
Deletes a pipeline from the Delta Live Tables system.
Example
This example deletes the pipeline with ID `a12cd3e4-0ab1-1abc-1a2b-1a2bcd3e4fg5`:
Request
```bash
curl --netrc --request DELETE \
https://<databricks-instance>/api/2.0/pipelines/a12cd3e4-0ab1-1abc-1a2b-1a2bcd3e4fg5
```
Replace `<databricks-instance>` with the Databricks workspace instance name, for example `1234567890123456.7.gcp.databricks.com`.
This example uses a .netrc file.
Start a pipeline update
Endpoint | HTTP Method
---|---
`2.0/pipelines/{pipeline_id}/updates` | `POST`
Starts an update for a pipeline.
Example
This example starts an update with full refresh for the pipeline with ID `a12cd3e4-0ab1-1abc-1a2b-1a2bcd3e4fg5`:
Request
```bash
curl --netrc --request POST \
https://<databricks-instance>/api/2.0/pipelines/a12cd3e4-0ab1-1abc-1a2b-1a2bcd3e4fg5/updates \
--data '{ "full_refresh": true }'
```
Replace `<databricks-instance>` with the Databricks workspace instance name, for example `1234567890123456.7.gcp.databricks.com`.
This example uses a .netrc file.
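Starting an update is asynchronous: the request returns as soon as the update is created. To wait for the update to finish, you can poll Get update details until the state is terminal. The following is a minimal sketch, assuming the POST response contains an `update_id` field (not shown above) and that `jq` is installed:

```bash
# Start an update and capture its ID (assumes an "update_id" field in the response).
PIPELINE_ID="a12cd3e4-0ab1-1abc-1a2b-1a2bcd3e4fg5"
UPDATE_ID=$(curl --netrc --silent --request POST \
  "https://<databricks-instance>/api/2.0/pipelines/${PIPELINE_ID}/updates" \
  --data '{ "full_refresh": false }' | jq -r '.update_id')

# Poll Get update details until the update reaches a terminal state.
while true; do
  STATE=$(curl --netrc --silent --request GET \
    "https://<databricks-instance>/api/2.0/pipelines/${PIPELINE_ID}/updates/${UPDATE_ID}" \
    | jq -r '.update.state')
  echo "Update ${UPDATE_ID} is ${STATE}"
  case "${STATE}" in
    COMPLETED|FAILED|CANCELED) break ;;
  esac
  sleep 10
done
```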
Stop any active pipeline update
Endpoint | HTTP Method
---|---
`2.0/pipelines/{pipeline_id}/stop` | `POST`
Stops any active pipeline update. If no update is running, this request is a no-op.
Example
This example stops an update for the pipeline with ID `a12cd3e4-0ab1-1abc-1a2b-1a2bcd3e4fg5`:
Request
```bash
curl --netrc --request POST \
https://<databricks-instance>/api/2.0/pipelines/a12cd3e4-0ab1-1abc-1a2b-1a2bcd3e4fg5/stop
```
Replace `<databricks-instance>` with the Databricks workspace instance name, for example `1234567890123456.7.gcp.databricks.com`.
This example uses a .netrc file.
List pipeline events
Endpoint | HTTP Method
---|---
`2.0/pipelines/{pipeline_id}/events` | `GET`
Retrieves events for a pipeline.
Example
This example retrieves a maximum of 5 events for the pipeline with ID `a12cd3e4-0ab1-1abc-1a2b-1a2bcd3e4fg5`:
Request
```bash
curl -n -X GET \
https://<databricks-instance>/api/2.0/pipelines/a12cd3e4-0ab1-1abc-1a2b-1a2bcd3e4fg5/events \
--data '{"max_results": 5}'
```
Replace `<databricks-instance>` with the Databricks workspace instance name, for example `1234567890123456.7.gcp.databricks.com`.
This example uses a .netrc file.
Request structure
Field Name | Type | Description
---|---|---
`page_token` | `STRING` | Page token returned by previous call. This field is mutually exclusive with all fields in this request except `max_results`. An error is returned if any fields other than `max_results` are set when this field is set. This field is optional.
`max_results` | `INT32` | The maximum number of entries to return in a single page. The system may return fewer than `max_results` events in a response, even if there are more events available. This field is optional. The default value is 25. The maximum value is 100. An error is returned if the value of `max_results` is greater than 100.
`order_by` | `STRING` | A string indicating a sort order by timestamp for the results, for example, `["timestamp asc"]`. The sort order can be ascending or descending. By default, events are returned in descending order by timestamp. This field is optional.
`filter` | `STRING` | Criteria to select a subset of results, expressed using a SQL-like syntax. The supported filters are `level='INFO'` (or `WARN` or `ERROR`) and `timestamp > 'TIMESTAMP'` (or `>=`, `<`, `<=`, `=`). Composite expressions are supported, for example: `level in ('ERROR', 'WARN') AND timestamp > '2021-07-22T06:37:33.083Z'`. This field is optional.
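As a usage sketch, the following request combines `order_by` and `filter` to fetch error-level events in ascending timestamp order. The filter value is illustrative; note the shell double-quoting so the JSON string can contain single quotes:

```bash
# Fetch error-level events, oldest first.
curl --netrc --request GET \
  "https://<databricks-instance>/api/2.0/pipelines/a12cd3e4-0ab1-1abc-1a2b-1a2bcd3e4fg5/events" \
  --data "{\"max_results\": 25, \"order_by\": [\"timestamp asc\"], \"filter\": \"level='ERROR'\"}"
```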
Get pipeline details
Endpoint | HTTP Method
---|---
`2.0/pipelines/{pipeline_id}` | `GET`
Gets details about a pipeline, including the pipeline settings and recent updates.
Example
This example gets details for the pipeline with ID `a12cd3e4-0ab1-1abc-1a2b-1a2bcd3e4fg5`:
Request
```bash
curl -n -X GET \
https://<databricks-instance>/api/2.0/pipelines/a12cd3e4-0ab1-1abc-1a2b-1a2bcd3e4fg5
```
Replace `<databricks-instance>` with the Databricks workspace instance name, for example `1234567890123456.7.gcp.databricks.com`.
This example uses a .netrc file.
Response
```json
{
  "pipeline_id": "a12cd3e4-0ab1-1abc-1a2b-1a2bcd3e4fg5",
  "spec": {
    "id": "a12cd3e4-0ab1-1abc-1a2b-1a2bcd3e4fg5",
    "name": "Wikipedia pipeline (SQL)",
    "storage": "/Users/username/data",
    "clusters": [
      {
        "label": "default",
        "autoscale": {
          "min_workers": 1,
          "max_workers": 5
        }
      }
    ],
    "libraries": [
      {
        "notebook": {
          "path": "/Users/username/DLT Notebooks/Delta Live Tables quickstart (SQL)"
        }
      }
    ],
    "target": "wikipedia_quickstart_data",
    "continuous": false
  },
  "state": "IDLE",
  "cluster_id": "1234-567891-abcde123",
  "name": "Wikipedia pipeline (SQL)",
  "creator_user_name": "username",
  "latest_updates": [
    {
      "update_id": "8a0b6d02-fbd0-11eb-9a03-0242ac130003",
      "state": "COMPLETED",
      "creation_time": "2021-08-13T00:37:30.279Z"
    },
    {
      "update_id": "a72c08ba-fbd0-11eb-9a03-0242ac130003",
      "state": "CANCELED",
      "creation_time": "2021-08-13T00:35:51.902Z"
    },
    {
      "update_id": "ac37d924-fbd0-11eb-9a03-0242ac130003",
      "state": "FAILED",
      "creation_time": "2021-08-13T00:33:38.565Z"
    }
  ],
  "run_as_user_name": "username"
}
```
Response structure
Field Name | Type | Description
---|---|---
`pipeline_id` | `STRING` | The unique identifier of the pipeline.
`spec` | PipelineSettings | The pipeline settings.
`state` | `STRING` | The state of the pipeline. One of `IDLE` or `RUNNING`. If state = `RUNNING`, then there is at least one active update.
`cluster_id` | `STRING` | The identifier of the cluster running the pipeline.
`name` | `STRING` | The user-friendly name for this pipeline.
`creator_user_name` | `STRING` | The username of the pipeline creator.
`latest_updates` | An array of UpdateStateInfo | Status of the most recent updates for the pipeline, ordered with the newest update first.
`run_as_user_name` | `STRING` | The username that the pipeline runs as.
Get update details
Endpoint | HTTP Method
---|---
`2.0/pipelines/{pipeline_id}/updates/{update_id}` | `GET`
Gets details for a pipeline update.
Example
This example gets details for update `9a84f906-fc51-11eb-9a03-0242ac130003` for the pipeline with ID `a12cd3e4-0ab1-1abc-1a2b-1a2bcd3e4fg5`:
Request
```bash
curl -n -X GET \
https://<databricks-instance>/api/2.0/pipelines/a12cd3e4-0ab1-1abc-1a2b-1a2bcd3e4fg5/updates/9a84f906-fc51-11eb-9a03-0242ac130003
```
Replace `<databricks-instance>` with the Databricks workspace instance name, for example `1234567890123456.7.gcp.databricks.com`.
This example uses a .netrc file.
Response
```json
{
  "update": {
    "pipeline_id": "a12cd3e4-0ab1-1abc-1a2b-1a2bcd3e4fg5",
    "update_id": "9a84f906-fc51-11eb-9a03-0242ac130003",
    "config": {
      "id": "a12cd3e4-0ab1-1abc-1a2b-1a2bcd3e4fg5",
      "name": "Wikipedia pipeline (SQL)",
      "storage": "/Users/username/data",
      "configuration": {
        "pipelines.numStreamRetryAttempts": "5"
      },
      "clusters": [
        {
          "label": "default",
          "autoscale": {
            "min_workers": 1,
            "max_workers": 5
          }
        }
      ],
      "libraries": [
        {
          "notebook": {
            "path": "/Users/username/DLT Notebooks/Delta Live Tables quickstart (SQL)"
          }
        }
      ],
      "target": "wikipedia_quickstart_data",
      "filters": {},
      "email_notifications": {},
      "continuous": false,
      "development": false
    },
    "cause": "API_CALL",
    "state": "COMPLETED",
    "creation_time": 1628815050279,
    "full_refresh": true
  }
}
```
Response structure
Field Name | Type | Description
---|---|---
`pipeline_id` | `STRING` | The unique identifier of the pipeline.
`update_id` | `STRING` | The unique identifier of this update.
`config` | PipelineSettings | The pipeline settings.
`cause` | `STRING` | The trigger for the update. One of `API_CALL`, `RETRY_ON_FAILURE`, `SERVICE_UPGRADE`.
`state` | `STRING` | The state of the update. One of `QUEUED`, `CREATED`, `WAITING_FOR_RESOURCES`, `INITIALIZING`, `RESETTING`, `SETTING_UP_TABLES`, `RUNNING`, `STOPPING`, `COMPLETED`, `FAILED`, `CANCELED`.
`cluster_id` | `STRING` | The identifier of the cluster running the pipeline.
`creation_time` | `INT64` | The timestamp when the update was created.
`full_refresh` | `BOOLEAN` | Whether the update was triggered to perform a full refresh. If true, all pipeline tables were reset before running the update.
List pipelines
Endpoint | HTTP Method
---|---
`2.0/pipelines` | `GET`
Lists pipelines defined in the Delta Live Tables system.
Example
This example retrieves details for up to two pipelines, starting from a specified `page_token`:
Request
```bash
curl -n -X GET https://<databricks-instance>/api/2.0/pipelines \
--data '{ "page_token": "eyJ...==", "max_results": 2 }'
```
Replace `<databricks-instance>` with the Databricks workspace instance name, for example `1234567890123456.7.gcp.databricks.com`.
This example uses a .netrc file.
Response
```json
{
  "statuses": [
    {
      "pipeline_id": "e0f01758-fc61-11eb-9a03-0242ac130003",
      "state": "IDLE",
      "name": "dlt-pipeline-python",
      "latest_updates": [
        {
          "update_id": "ee9ae73e-fc61-11eb-9a03-0242ac130003",
          "state": "COMPLETED",
          "creation_time": "2021-08-13T00:34:21.871Z"
        }
      ],
      "creator_user_name": "username"
    },
    {
      "pipeline_id": "f4c82f5e-fc61-11eb-9a03-0242ac130003",
      "state": "IDLE",
      "name": "dlt-pipeline-python",
      "creator_user_name": "username"
    }
  ],
  "next_page_token": "eyJ...==",
  "prev_page_token": "eyJ..x9"
}
```
Request structure
Field Name | Type | Description
---|---|---
`page_token` | `STRING` | Page token returned by previous call. This field is optional.
`max_results` | `INT32` | The maximum number of entries to return in a single page. The system may return fewer than `max_results` entries in a response, even if there are more entries available. This field is optional. The default value is 25. The maximum value is 100. An error is returned if the value of `max_results` is greater than 100.
`order_by` | An array of `STRING` | A list of strings specifying the order of results, for example, `["name asc"]`. This field is optional.
`filter` | `STRING` | Select a subset of results based on the specified criteria. The supported filters are `notebook='<path>'` to select pipelines that reference the provided notebook path, and `name LIKE '[pattern]'` to select pipelines with a name that matches the pattern. Composite filters are not supported. This field is optional. A usage sketch follows this table.
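As a sketch of a filtered request (the name pattern here is illustrative):

```bash
# Select pipelines whose name matches a pattern.
curl --netrc --request GET \
  "https://<databricks-instance>/api/2.0/pipelines" \
  --data "{\"max_results\": 25, \"filter\": \"name LIKE 'wikipedia%'\"}"
```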
Response structure
Field Name | Type | Description
---|---|---
`statuses` | An array of PipelineStateInfo | The list of pipelines matching the request criteria.
`next_page_token` | `STRING` | If present, a token to fetch the next page of results.
`prev_page_token` | `STRING` | If present, a token to fetch the previous page of results.
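To walk all pages, pass `next_page_token` back in the next request until the response omits it. A minimal sketch, assuming `jq` is installed:

```bash
# Page through all pipelines, 100 at a time, printing pipeline IDs.
TOKEN=""
while :; do
  if [ -z "${TOKEN}" ]; then
    BODY='{ "max_results": 100 }'
  else
    BODY="{ \"max_results\": 100, \"page_token\": \"${TOKEN}\" }"
  fi
  RESPONSE=$(curl --netrc --silent --request GET \
    "https://<databricks-instance>/api/2.0/pipelines" --data "${BODY}")
  echo "${RESPONSE}" | jq -r '.statuses[]?.pipeline_id'
  TOKEN=$(echo "${RESPONSE}" | jq -r '.next_page_token // empty')
  [ -z "${TOKEN}" ] && break
done
```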
Data structures
In this section:
- NotebookLibrary
- PipelineLibrary
- PipelineSettings
- PipelineStateInfo
- PipelinesNewCluster
- UpdateStateInfo
NotebookLibrary
Field Name | Type | Description
---|---|---
`path` | `STRING` | The absolute path to the notebook. This field is required.
PipelineLibrary
Field Name | Type | Description
---|---|---
`notebook` | NotebookLibrary | The path to a notebook defining Delta Live Tables datasets. The path must be in the Databricks workspace, for example: `{ "notebook": { "path": "/Users/username/notebook_path" } }`
PipelineSettings
Specification for a pipeline deployment.
Field Name | Type | Description
---|---|---
`id` | `STRING` | The unique identifier for this pipeline. The identifier is created by the Delta Live Tables system, and must not be provided when creating a pipeline.
`name` | `STRING` | A user-friendly name for this pipeline. This field is optional. By default, the pipeline name must be unique. To use a duplicate name, set `allow_duplicate_names` to `true` in the pipeline configuration.
`storage` | `STRING` | A path to a DBFS directory for storing checkpoints and tables created by the pipeline. This field is optional. The system uses a default location if this field is empty.
`configuration` | A map of `STRING:STRING` | A list of key-value pairs to add to the Spark configuration of the cluster that will run the pipeline. This field is optional. Elements must be formatted as key:value pairs.
`clusters` | An array of PipelinesNewCluster | An array of specifications for the clusters to run the pipeline. This field is optional. If this is not specified, the system will select a default cluster configuration for the pipeline.
`libraries` | An array of PipelineLibrary | The notebooks containing the pipeline code and any dependencies required to run the pipeline.
`target` | `STRING` | A database name for persisting pipeline output data. See Delta Live Tables data publishing for more information.
`continuous` | `BOOLEAN` | Whether this is a continuous pipeline. This field is optional. The default value is `false`.
`development` | `BOOLEAN` | Whether to run the pipeline in development mode. This field is optional. The default value is `true`.
PipelineStateInfo
Field Name | Type | Description
---|---|---
`state` | `STRING` | The state of the pipeline. One of `IDLE` or `RUNNING`.
`pipeline_id` | `STRING` | The unique identifier of the pipeline.
`cluster_id` | `STRING` | The unique identifier of the cluster running the pipeline.
`name` | `STRING` | The user-friendly name of the pipeline.
`latest_updates` | An array of UpdateStateInfo | Status of the most recent updates for the pipeline, ordered with the newest update first.
`creator_user_name` | `STRING` | The username of the pipeline creator.
`run_as_user_name` | `STRING` | The username that the pipeline runs as. This is a read-only value derived from the pipeline owner.
PipelinesNewCluster
A pipeline cluster specification.
Field Name | Type | Description
---|---|---
`label` | `STRING` | A label for the cluster specification, either `default` to configure the default cluster, or `maintenance` to configure the maintenance cluster. This field is optional. The default value is `default`.
`attrs` | | Optional attributes to set during cluster creation. These attributes cannot be changed over the lifetime of a cluster. The Delta Live Tables system sets the following attributes, which cannot be configured by users: `spark_version`.
`size` | | Optional cluster size specification. Size can either be a constant number of workers or autoscaling parameters; see the sketch after this table.
`apply_policy_default_values` | `BOOLEAN` | Whether to use policy default values for missing cluster attributes.
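For illustration, a fixed-size alternative to the autoscaling `clusters` examples above sets a constant worker count. This is a minimal sketch: the `num_workers` field is a standard cluster-size attribute and is an assumption here, since the examples in this guide only show `autoscale`:

```json
{
  "clusters": [
    {
      "label": "default",
      "num_workers": 3
    }
  ]
}
```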
UpdateStateInfo
Field Name | Type | Description
---|---|---
`update_id` | `STRING` | The unique identifier for this update.
`state` | `STRING` | The state of the update. One of `QUEUED`, `CREATED`, `WAITING_FOR_RESOURCES`, `INITIALIZING`, `RESETTING`, `SETTING_UP_TABLES`, `RUNNING`, `STOPPING`, `COMPLETED`, `FAILED`, `CANCELED`.
`creation_time` | `STRING` | Timestamp when this update was created.