Provision a service principal for Databricks automation - Terraform

Note

To provision a Databricks service principal by using the Databricks user interface instead, see Manage service principals..

A service principal is an identity for automated tools and systems like scripts, apps, and CI/CD platforms. Databricks recommends using a service principal and its OAuth token or personal access token instead of your Databricks user account and personal access token. Benefits include:

  • Granting and restricting access to resources independently of a user.

  • Enabling users to better protect their own access tokens.

  • Disabling or deleting a service principal without affecting other users.

  • Removing a user when they leave the organization without impacting any service principal.

Follow these instructions to use the Databricks Terraform provider to create a Databricks service principal in your Databricks workspace and then create a Databricks personal access token for the Databricks service principal.

Requirements

  • A Databricks personal access token to allow the Databricks Terraform provider to call the Databricks APIs on behalf of your Databricks user account within the Databricks workspace. To create a personal access token, do the following:

    1. In your Databricks workspace, click your Databricks username in the top bar, and then select User Settings from the drop down.

    2. Click Developer.

    3. Next to Access tokens, click Manage.

    4. Click Generate new token.

    5. (Optional) Enter a comment that helps you to identify this token in the future, and change the token’s default lifetime of 90 days. To create a token with no lifetime (not recommended), leave the Lifetime (days) box empty (blank).

    6. Click Generate.

    7. Copy the displayed token to a secure location, and then click Done.

    Note

    Be sure to save the copied token in a secure location. Do not share your copied token with others. If you lose the copied token, you cannot regenerate that exact same token. Instead, you must repeat this procedure to create a new token. If you lose the copied token, or you believe that the token has been compromised, Databricks strongly recommends that you immediately delete that token from your workspace by clicking the trash can (Revoke) icon next to the token on the Access tokens page.

    If you are not able to create or use tokens in your workspace, this might be because your workspace administrator has disabled tokens or has not given you permission to create or use tokens. See your workspace administrator or the following:

  • Databricks CLI version 0.205 or above, configured with a Databricks authentication configuration profile that references the corresponding Databricks personal access token. To create this configuration profile, do the following:

    Note

    The following procedure uses the Databricks CLI to create a Databricks configuration profile with the name DEFAULT. If you already have a DEFAULT configuration profile, this procedure overwrites your existing DEFAULT configuration profile.

    To check whether you already have a DEFAULT configuration profile, and to view this profile’s settings if it exists, use the Databricks CLI to run the command databricks auth env --profile DEFAULT.

    To create a configuration profile with a name other than DEFAULT, replace the DEFAULT part of --profile DEFAULT in the following databricks configure command with a different name for the configuration profile.

    1. Use the Databricks CLI to create a Databricks configuration profile named DEFAULT that uses Databricks personal access token authentication. To do this, run the following command:

      databricks configure --profile DEFAULT
      
    2. For the prompt Databricks Host, enter your Databricks workspace instance URL, for example https://1234567890123456.7.gcp.databricks.com.

    3. For the prompt Personal Access Token, enter the Databricks personal access token for your workspace.

  • The Terraform CLI. See Download Terraform.

Create the Databricks service principal and Databricks personal access token

  1. In your terminal, create an empty directory and then switch to it. Each separate set of Terraform configuration files must be in its own directory. For example: mkdir terraform_service_principal_demo && cd terraform_service_principal_demo.

    mkdir terraform_service_principal_demo && cd terraform_service_principal_demo
    
  2. In this empty directory, create a file named main.tf. Add the following content to this file, and then save the file.

    Warning

    The following content contains the statement authorization = "tokens". There can be only one authorization = "tokens" permissions resource per Databricks workspace. After applying the following changes, users who previously had either CAN_USE or CAN_MANAGE permission will have their access to token-based authentication revoked. Their active tokens are also immediately deleted (revoked). Because of the potentially disruptive nature of this operation, the related configuration is commented out in the main.tf file.

    Alternatively, you can use the Databricks user interface to enable the Databricks service principal to use a Databricks personal access token. See Manage access to Databricks automation. You can also use the Databricks user interface to generate a Databricks personal access token for the Databricks service principal. See Manage tokens for a service principal.

    Note

    The following content creates a service principal at the Databricks workspace level. If your Databricks workspace is enabled for identity federation, then the following content also automatically synchronizes the service principal to the related Databricks account (see How do admins assign users to workspaces?). To create a service principal at the Databricks account level only instead of at the workspace level, see the “Creating service principal in AWS Databricks account” section of databricks_service_principal Resource in the Databricks Terraform provider documentation.

    If you choose to uncomment the following resources and output, a personal access token is also generated. This personal access token can be used by the service principal for automation only within the specified Databricks workspace.

    You cannot use personal access tokens with service principals for Databricks account-level automation. If you attempt to generate a personal access token for a service principal at the Databricks account level, the attempt will fail.

    variable "databricks_connection_profile" {
      description = "The name of the Databricks authentication configuration profile to use."
      type        = string
    }
    
    variable "service_principal_display_name" {
      description = "The display name for the service principal."
      type        = string
    }
    
    variable "service_principal_access_token_lifetime" {
      description = "The lifetime of the service principal's access token, in seconds."
      type        = number
      default     = 3600
    }
    
    terraform {
      required_providers {
        databricks = {
          source = "databricks/databricks"
        }
      }
    }
    
    provider "databricks" {
      profile = var.databricks_connection_profile
    }
    
    resource "databricks_service_principal" "sp" {
      provider     = databricks
      display_name = var.service_principal_display_name
    }
    
    # Uncomment the following "databricks_permissions" resource
    # if you want to enable the service principal to use
    # personal access tokens.
    #
    # Warning: uncommenting the following "databricks_permissions" resource
    # causes users who previously had either CAN_USE or CAN_MANAGE permission
    # to have their access to token-based authentication revoked.
    # Their active tokens are also immediately deleted (revoked).
    #
    # Alternatively, you can enable this later through the Databricks user interface.
    #
    # resource "databricks_permissions" "token_usage" {
    #   authorization    = "tokens"
    #   access_control {
    #     service_principal_name = databricks_service_principal.sp.application_id
    #     permission_level       = "CAN_USE"
    #   }
    # }
    #
    # Uncomment the following "databricks_obo_token" resource and
    # "service_principal_access_token" output if you want to generate
    # a personal access token for service principal and then see the
    # generated personal access token.
    #
    # If you uncomment the following "databricks_obo_token" resource and
    # "service_principal_access_token" output, you must also
    # uncomment the preceding "databricks_permissions" resource.
    #
    # Alternatively, you can generate a personal access token later through the
    # Databricks user interface.
    #
    # resource "databricks_obo_token" "this" {
    #   depends_on       = [databricks_permissions.token_usage]
    #   application_id   = databricks_service_principal.sp.application_id
    #   comment          = "Personal access token on behalf of ${databricks_service_principal.sp.display_name}"
    #   lifetime_seconds = var.service_principal_access_token_lifetime
    # }
    
    output "service_principal_name" {
      value = databricks_service_principal.sp.display_name
    }
    
    output "service_principal_id" {
      value = databricks_service_principal.sp.application_id
    }
    
    # Uncomment the following "service_principal_access_token" output if
    # you want to see the generated personal access token for the service principal.
    #
    # If you uncomment the following "service_principal_access_token" output, you must
    # also uncomment the preceding "service_principal_access_token" resource and
    # "databricks_obo_token" resource.
    #
    # output "service_principal_access_token" {
    #   value     = databricks_obo_token.this.token_value
    #   sensitive = true
    # }
    

    Note

    To add this service principal to Databricks workspace groups, and to add Databricks workspace entitlements to this service principal, see databricks_service_principal on the Terraform website.

  3. In the same directory, create a file named terraform.tfvars. Add the following content to this file, replacing the following values, and then save the file:

    • Replace the databricks_connection_profile value with the name of your authentication configuration profile from the requirements.

    • Replace the service_principal_display_name value with a display name for the service principal.

    • Replace the service_principal_access_token_lifetime value with the number of seconds for the lifetime of the access token for the service principal.

    databricks_connection_profile           = "<Databricks authentication configuration profile name>"
    service_principal_display_name          = "<Service principal display name>"
    service_principal_access_token_lifetime = 3600
    
  4. Initialize the working directory containing the main.tf file by running the terraform init command. For more information, see Command: init on the Terraform website.

    terraform init
    
  5. Check whether there are any syntax errors in the configuration by running the terraform validate command. For more information, see Command: validate on the Terraform website.

    terraform validate
    
  6. Apply the changes required to reach the desired state of the configuration by running the terraform apply command. For more information, see Command: apply on the Terraform website.

    terraform apply
    
  7. If you uncommented the databricks_permissions resource, the databricks_obo_token resource, and the service_principal_access_token output, then to get the service principal’s access token, see the value of outputs.service_principal_access_token.value in the terraform.tfstate file, which is in the working directory containing the main.tf file.