Databricks Asset Bundle project templates
This article describes the syntax for Databricks Asset Bundle templates. Bundles enable programmatic management of Databricks workflows. See What are Databricks Asset Bundles?
Bundle templates enable users to create bundles in a consistent, repeatable way, by establishing folder structures, build steps and tasks, tests, and other DevOps infrastructure-as-code (IaC) attributes common across a development environment deployment pipeline.
For example, if you routinely run Databricks jobs that require custom packages with a time-consuming compilation step upon installation, you can speed up your development loop by creating a bundle template that supports custom container environments.
Bundle templates define the directory structure of the bundle that will be created, and they include a databricks.yml.tmpl
configuration file template as well as a databricks_template_schema.json
file containing user-prompt variables.
Use a default bundle template
To use a Databricks default bundle template to create your bundle, use the Databricks CLI bundle init
command, specifying the name of the default template to use. For example, the following command creates a bundle using the default Python bundle template:
databricks bundle init default-python
If you do not specify a default template, the bundle init
command presents the set of available templates from which you can choose.
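For example, the following invocation prompts you to choose from the available templates:
databricks bundle init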
Databricks provides the following default bundle templates:
| Template | Description |
|---|---|
| default-python | A template for using Python with Databricks. This template creates a bundle with a job and Delta Live Tables pipeline. See default-python. |
| default-sql | A template for using SQL with Databricks. This template contains a configuration file that defines a job that runs SQL queries on a SQL warehouse. See default-sql. |
| dbt-sql | A template which leverages dbt-core for local development and bundles for deployment. This template contains the configuration that defines a job with a dbt task, as well as a configuration file that defines dbt profiles for deployed dbt jobs. See dbt-sql. |
| mlops-stacks | An advanced full stack template for starting new MLOps Stacks projects. See mlops-stacks and Databricks Asset Bundles for MLOps Stacks. |
Use a custom bundle template
To use a bundle template other than the Databricks default bundle templates, pass the local path or remote URL of the template to the Databricks CLI bundle init
command.
For example, the following command uses the dab-container-template
template created in the Custom Bundle Template Tutorial:
databricks bundle init /projects/my-custom-bundle-templates/dab-container-template
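You can also pass a remote Git URL instead of a local path. For example, assuming a hypothetical repository that contains the template at its root:
databricks bundle init https://github.com/my-org/dab-container-template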
Create a custom bundle template
Bundle templates use Go package templating syntax. See the Go package template documentation.
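For example, a .tmpl file can interpolate the value of a user-prompt variable such as project_name and use standard Go template conditionals. The cloud_type variable in this sketch is hypothetical and only illustrates the syntax:
# Interpolates the user-provided value of project_name
bundle:
  name: {{.project_name}}

# Emits a different node type depending on a hypothetical cloud_type variable
{{- if eq .cloud_type "aws" }}
node_type_id: i3.xlarge
{{- else }}
node_type_id: Standard_DS3_v2
{{- end }}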
At a minimum, a bundle template project must have:
- A databricks_template_schema.json file at the project root that defines one user-prompt variable for the bundle project name.
- A databricks.yml.tmpl file located in a template folder that defines configuration for any bundles created with the template. If your databricks.yml.tmpl file references any additional *.yml.tmpl configuration templates, specify the location of these in the include mapping.
You can optionally add sub-folders and files to the template
folder that you want mirrored in bundles created by the template.
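For example, the basic template project built in the remainder of this article has the following layout:
basic-bundle-template
├── databricks_template_schema.json
└── template
    ├── databricks.yml.tmpl
    └── src
        └── simple_notebook.ipynb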
Define user prompt variables
The first step in building a basic bundle template is to create a template project folder and a file named databricks_template_schema.json
in the project root. This file contains the variables that users provide input values for when they use the template to create a bundle using bundle init
. This file’s format follows the JSON Schema Specification.
mkdir basic-bundle-template
touch basic-bundle-template/databricks_template_schema.json
Add the following to the databricks_template_schema.json
file, and then save the file:
{
  "properties": {
    "project_name": {
      "type": "string",
      "default": "basic_bundle",
      "description": "What is the name of the bundle you want to create?",
      "order": 1
    }
  },
  "success_message": "\nYour bundle '{{.project_name}}' has been created."
}
In this file:
- project_name is the only input variable name.
- default is an optional default value if a value is not provided by the user with --config-file (see the example following this list) as part of the bundle init command, or overridden by the user at the command prompt.
- description is the user prompt associated with the input variable, if a value is not provided by the user with --config-file as part of the bundle init command.
- order is an optional order in which each user prompt appears if a value is not provided by the user with --config-file as part of the bundle init command. If order is not provided, then user prompts display in the order in which they are listed in the schema.
- success_message is an optional message that is displayed upon successful project creation.
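For example, to create a bundle from this template without being prompted, put the input values in a JSON file (the file name bundle-config.json below is arbitrary) and pass it to bundle init with --config-file:
{
  "project_name": "my_bundle"
}

databricks bundle init basic-bundle-template --config-file ./bundle-config.json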
Build the folder structure
Next, create the required template
folder and build the folder structure within it. This structure will be mirrored by bundles created with this template. Also, put any files that you want included into those folders. This basic bundle template stores files in a src
folder and includes one simple notebook.
mkdir -p basic-bundle-template/template/src
touch basic-bundle-template/template/src/simple_notebook.ipynb
Add the following code to a cell in the simple_notebook.ipynb file:
print("Hello World!")
Populate configuration template files
Now create the required databricks.yml.tmpl
file in the template
folder:
touch basic-bundle-template/template/databricks.yml.tmpl
Populate this file with the basic configuration template YAML. This configuration template establishes the bundle name, one job using the specified notebook file, and two target environments for bundles created using this template. It also takes advantage of bundle substitutions, which is highly recommended. See bundle substitutions.
# This is the configuration for the Databricks Asset Bundle {{.project_name}}.

bundle:
  name: {{.project_name}}

# The main job for {{.project_name}}
resources:
  jobs:
    {{.project_name}}_job:
      name: {{.project_name}}_job
      tasks:
        - task_key: notebook_task
          job_cluster_key: job_cluster
          notebook_task:
            notebook_path: ./src/simple_notebook.ipynb
      job_clusters:
        - job_cluster_key: job_cluster
          new_cluster:
            node_type_id: i3.xlarge
            spark_version: 13.3.x-scala2.12

targets:
  # The deployment targets. See https://docs.databricks.com/en/dev-tools/bundles/deployment-modes.html
  dev:
    mode: development
    default: true
    workspace:
      host: {{workspace_host}}

  prod:
    mode: production
    workspace:
      host: {{workspace_host}}
      root_path: /Shared/.bundle/prod/${bundle.name}
    {{- if not is_service_principal}}
    run_as:
      # This runs as {{user_name}} in production. Alternatively,
      # a service principal could be used here using service_principal_name
      user_name: {{user_name}}
    {{end -}}
Test the bundle template
Finally, test your template. Create a new bundle project folder, then use the Databricks CLI to initialize a new bundle using the template:
mkdir my-test-bundle
cd my-test-bundle
databricks bundle init ../basic-bundle-template
For the prompt What is the name of the bundle you want to create?, type my_test_bundle.
Once the test bundle is created, the success message from the schema file is output. If you examine the contents of the my-test-bundle
folder, you should see the following:
my-test-bundle
├── databricks.yml
└── src
    └── simple_notebook.ipynb
And the databricks.yml file is now customized:
# This is the configuration for the Databricks Asset Bundle my_test_bundle.

bundle:
  name: my_test_bundle

# The main job for my_test_bundle
resources:
  jobs:
    my_test_bundle_job:
      name: my_test_bundle_job
      tasks:
        - task_key: notebook_task
          job_cluster_key: job_cluster
          notebook_task:
            notebook_path: ./src/simple_notebook.ipynb
      job_clusters:
        - job_cluster_key: job_cluster
          new_cluster:
            node_type_id: i3.xlarge
            spark_version: 13.3.x-scala2.12

targets:
  # The deployment targets. See https://docs.databricks.com/en/dev-tools/bundles/deployment-modes.html
  dev:
    mode: development
    default: true
    workspace:
      host: https://my-host.cloud.databricks.com

  prod:
    mode: production
    workspace:
      host: https://my-host.cloud.databricks.com
      root_path: /Shared/.bundle/prod/${bundle.name}
    run_as:
      # This runs as someone@example.com in production. Alternatively,
      # a service principal could be used here using service_principal_name
      user_name: someone@example.com
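From the my-test-bundle folder, you can then check that the generated bundle is valid and, for example, deploy it to the dev target:
databricks bundle validate
databricks bundle deploy -t dev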
Next steps
Browse additional templates that are created and maintained by Databricks. See the bundle samples repository in GitHub.
To use MLOps Stacks with Databricks Asset Bundle templates, see Databricks Asset Bundles for MLOps Stacks.
Learn more about Go package templating. See the Go package template documentation.