Databricks Terraform provider

HashiCorp Terraform is a popular open source tool for creating safe and predictable cloud infrastructure across several cloud providers. You can use the Databricks Terraform provider to manage your Databricks workspaces and the associated cloud infrastructure using a flexible, powerful tool. The goal of the Databricks Terraform provider is to support all Databricks REST APIs, automating the most complicated aspects of deploying and managing your data platforms. Databricks customers use the Databricks Terraform provider to deploy and manage clusters and jobs, provision Databricks workspaces, and configure data access.


The Databricks Terraform provider is not formally supported by Databricks or Google. It is maintained by Databricks field engineering teams and provided as is. There is no service level agreement (SLA). Databricks and Google make no guarantees of any kind. If you discover an issue with the provider, file a GitHub Issue, and it will be reviewed by project maintainers as time permits.

Getting started

Complete the following steps to install and configure the tools that Terraform needs to operate. These tools include the Databricks CLI, the Terraform CLI, and the Google Cloud SDK. After setting up these tools, complete the steps to create a base Terraform configuration that you can use later to manage your Databricks workspaces and the associated Google Cloud infrastructure.


This procedure assumes that you have access to a deployed Databricks workspace as a Databricks admin, as well as access to the corresponding Google Cloud project with the permissions that Terraform needs to operate in that project.

  1. Create a Databricks personal access token to allow Terraform to call the Databricks APIs within the Databricks account. For details, see Authentication using Databricks personal access tokens.

  2. Install the Databricks command-line interface (CLI), and then configure the Databricks CLI with your Databricks personal access token by running the databricks configure --token --profile <profile name> command to create a connection profile for this Databricks personal access token. Replace <profile name> with a unique name for this connection profile. For details, see the “Set up authentication” and “Connection profiles” sections in Databricks CLI.

    databricks configure --token --profile <profile name>


    Each Databricks personal access token is associated with a specific user in a Databricks account. Run the databricks configure --token --profile <profile name> command (replacing <profile name> with a unique name) for each Databricks personal access token that you want to make available for Terraform to use.
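    The Databricks CLI stores each connection profile in the .databrickscfg file in your home directory. As a rough sketch only (the host and token values here are placeholders, not real credentials), a profile entry looks like this:

    [<profile name>]
    host  = https://<your-workspace-instance-name>
    token = <your-personal-access-token>

    Terraform later selects one of these entries by its profile name, which is why each profile needs a unique name.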

  3. Install the Terraform CLI. For details, see Download Terraform on the Terraform website.

  4. Install the Google Cloud SDK, and then configure the Google Cloud SDK by running the gcloud auth application-default login --project <project ID> command. For details, see Installing Google Cloud SDK and Initializing Cloud SDK on the Google Cloud website.

    gcloud auth application-default login --project <project ID>


    To have Terraform run within the context of a different project, run the gcloud auth application-default login --project <project ID> command again with that project's ID.

    This procedure uses the Google Cloud SDK CLI along with your own user credentials, stored on your local system, to authenticate. For alternative authentication options, see Authentication on the Terraform website.

  5. In your terminal, create an empty directory and then switch to it. (Each separate set of Terraform configuration files must be in its own directory.) For example:

    mkdir terraform_demo && cd terraform_demo
  6. In this empty directory, create a Terraform configuration file. Add the following content to the file, and then save the file:

    variable "databricks_connection_profile" {
      description = "The name of the Databricks connection profile to use."
      type        = string
      default     = "<Databricks connection profile name>"
    }

    terraform {
      required_providers {
        databricks = {
          source = "databrickslabs/databricks"
        }
        google = {
          source  = "hashicorp/google"
          version = "3.5.0"
        }
      }
    }

    provider "databricks" {
      profile = var.databricks_connection_profile
    }

    provider "google" {}
  7. Replace <Databricks connection profile name> with the name of the connection profile that you created earlier in step 2, and then save the file.

  8. Initialize the working directory that contains your configuration file by running the terraform init command. For more information, see Command: init on the Terraform website.

    terraform init

    Terraform downloads the databricks and google providers and installs them in a hidden subdirectory of your current working directory, named .terraform. The terraform init command prints out which versions of the providers were installed. Terraform also creates a lock file named .terraform.lock.hcl, which records the exact provider versions used, so that you can control when you want to update the providers used for your project.
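    As a rough illustration only, .terraform.lock.hcl contains one block per installed provider. The exact version numbers and hashes vary; the values shown here are placeholders:

    # .terraform.lock.hcl (generated by terraform init; do not edit by hand)
    provider "registry.terraform.io/databrickslabs/databricks" {
      version = "<installed provider version>"
      hashes = [
        "<provider package hash>",
      ]
    }
    provider "registry.terraform.io/hashicorp/google" {
      version     = "3.5.0"
      constraints = "3.5.0"
      hashes = [
        "<provider package hash>",
      ]
    }

    Committing this lock file to version control keeps every collaborator on the same provider versions until you deliberately upgrade.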

  9. Apply the changes required to reach the desired state of the configuration by running the terraform apply command. For more information, see Command: apply on the Terraform website.

    terraform apply

    Because no resources have yet been specified in the file, the output is Apply complete! Resources: 0 added, 0 changed, 0 destroyed. Terraform also writes data into a file called terraform.tfstate, where it stores the IDs and properties of the resources it manages so that it can update or destroy those resources going forward. To create resources, continue with Sample configuration, Next steps, or both to specify the desired resources to create, and then run the terraform apply command again.
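    For reference, the terraform.tfstate file that this first terraform apply writes is a small JSON document. With no resources defined, it looks approximately like the following; the exact fields vary by Terraform version, and the placeholder values here are illustrative:

    {
      "version": 4,
      "terraform_version": "<your Terraform version>",
      "serial": 1,
      "lineage": "<unique ID generated for this state file>",
      "outputs": {},
      "resources": []
    }

    Treat this file as Terraform's private record of your infrastructure: do not edit it by hand, and be aware that it can contain sensitive values.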

Sample configuration

Complete the following procedure to create a sample Terraform configuration that creates a notebook and a job to run that notebook, in an existing Databricks workspace.


The following sample Terraform configuration interacts only with an existing Databricks workspace. Because of this, to run this sample you do not need to configure the Google Cloud SDK or include the google provider in your configuration file.

  1. At the end of the file that you created in Getting started, add the following code:

    variable "resource_prefix" {
      description = "The prefix to use when naming the notebook and job"
      type        = string
      default     = "terraform-demo"
    }

    variable "email_notifier" {
      description = "The email address to send job status to"
      type        = list(string)
      default     = ["<Your email address>"]
    }

    // Get information about the Databricks user that is calling
    // the Databricks API (the one associated with "databricks_connection_profile").
    data "databricks_current_user" "me" {}

    // Create a simple, sample notebook. Store it in a subfolder within
    // the Databricks current user's folder. The notebook contains the
    // following basic Spark code in Python.
    resource "databricks_notebook" "this" {
      path     = "${data.databricks_current_user.me.home}/Terraform/${var.resource_prefix}-notebook.ipynb"
      language = "PYTHON"
      content_base64 = base64encode(<<-EOT
        # created from ${abspath(path.module)}
        display(spark.range(10))
        EOT
      )
    }

    // Create a job to run the sample notebook. The job will create
    // a cluster to run on. The cluster will use the smallest available
    // node type and run the latest version of Spark.

    // Get the smallest available node type to use for the cluster. Choose
    // only from among available node types with local storage.
    data "databricks_node_type" "smallest" {
      local_disk = true
    }

    // Get the latest Spark version to use for the cluster.
    data "databricks_spark_version" "latest" {}

    // Create the job, emailing notifiers about job success or failure.
    resource "databricks_job" "this" {
      name = "${var.resource_prefix}-job-${data.databricks_current_user.me.alphanumeric}"
      new_cluster {
        num_workers   = 1
        spark_version = data.databricks_spark_version.latest.id
        node_type_id  = data.databricks_node_type.smallest.id
      }
      notebook_task {
        notebook_path = databricks_notebook.this.path
      }
      email_notifications {
        on_success = var.email_notifier
        on_failure = var.email_notifier
      }
    }

    // Print the URL to the notebook.
    output "notebook_url" {
      value = databricks_notebook.this.url
    }

    // Print the URL to the job.
    output "job_url" {
      value = databricks_job.this.url
    }
  2. Replace <Your email address> with your email address, and then save the file.

  3. Run terraform apply.

  4. Verify that the notebook and job were created: in the output of the terraform apply command, find the URLs for notebook_url and job_url and go to them.

  5. Run the job: on the Jobs page, click Run Now. After the job finishes, check your email inbox.

  6. When you are done with this sample, delete the notebook and job from the Databricks workspace by running terraform destroy.

  7. Verify that the notebook and job were deleted: refresh the notebook and Jobs pages to display a message that the resources cannot be found.

Next steps

Manage workspace resources for a Databricks workspace.


For Terraform-specific support, see the Latest Terraform topics on the HashiCorp Discuss website. For issues specific to the Databricks Terraform Provider, see Issues in the databrickslabs/terraform-provider-databricks GitHub repository.

Additional resources