Authentication for Databricks Asset Bundles

This article describes how to configure authentication for Databricks Asset Bundles. For an overview of bundles, see What are Databricks Asset Bundles?

You deploy and run Databricks Asset Bundles in one of two types of authentication scenarios, attended or unattended:

  • Attended authentication scenarios are manual workflows, for example, using your web browser on your local machine to log in to your target Databricks workspace when prompted by the Databricks CLI.

  • Unattended authentication scenarios are automated workflows, such as CI/CD workflows that run in systems such as GitHub.

The following sections recommend the Databricks authentication types and settings to use for Databricks Asset Bundles, based on these two types of authentication scenarios.

Attended authentication

For attended authentication scenarios with Databricks Asset Bundles, Databricks recommends that you use OAuth user-to-machine (U2M) authentication for your Databricks user account in the target workspace.

You can also use a personal access token associated with your Databricks user account for the target workspace.

For more information about these Databricks authentication types, see Databricks authentication methods.

To store authentication settings for attended authentication scenarios, Databricks recommends that you use Databricks configuration profiles on your local development machine. Configuration profiles let you switch quickly among different Databricks authentication contexts, which supports rapid local development across multiple Databricks workspaces. With profiles, you can use the --profile or -p option to specify a particular profile when you run the bundle validate, deploy, run, and destroy commands with the Databricks CLI. See Databricks configuration profiles.
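As an illustration of the --profile option, the following minimal sketch builds the argument list for a bundle command that uses a named profile. The helper name bundle_command and the profile name DEV are hypothetical, and the sketch only assembles the command; it does not run the Databricks CLI.

```python
import shlex

# The bundle commands named in this article.
BUNDLE_ACTIONS = ("validate", "deploy", "run", "destroy")


def bundle_command(action, profile, *extra):
    """Build the argument list for a `databricks bundle` command that uses a named profile."""
    if action not in BUNDLE_ACTIONS:
        raise ValueError(f"unsupported bundle action: {action}")
    # Extra arguments (for example, a resource key for `run`) go before --profile.
    return ["databricks", "bundle", action, *extra, "--profile", profile]


# "DEV" is a hypothetical profile name.
print(shlex.join(bundle_command("validate", "DEV")))
# prints: databricks bundle validate --profile DEV
```

To execute the command rather than print it, the same list could be passed to subprocess.run with check=True.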

Databricks also supports the use of the profile mapping within the workspace mapping to specify the profile to use for each target workspace in your bundle configuration files. However, hard-coded mappings make your bundle configuration files less reusable across projects.

Unattended authentication

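For unattended authentication scenarios with Databricks Asset Bundles, Databricks recommends that you use the following Databricks authentication types, in the following order of preference:

  1. OAuth machine-to-machine (M2M) authentication
  2. Google Cloud ID authentication
  3. Databricks personal access token authentication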

For more information about these Databricks authentication types, see Databricks authentication methods.

For unattended authentication scenarios, Databricks recommends using environment variables to store Databricks authentication settings in your target CI/CD system, because CI/CD systems typically have built-in support for defining environment variables and passing them to jobs.

For Databricks Asset Bundles projects used in CI/CD systems designed to work with multiple Databricks workspaces (for example, three separate but related development, staging, and production workspaces), Databricks recommends that you use service principals for authentication and that you give one service principal access to all participating workspaces. This enables you to use the same environment variables across all of the project’s workspaces.
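A minimal sketch of that arrangement follows. It assumes hypothetical targets named dev, staging, and prod whose workspace URLs are defined in the bundle configuration, and it assumes the service principal's OAuth credentials are supplied in DATABRICKS_CLIENT_ID and DATABRICKS_CLIENT_SECRET (confirm the variable names against the linked authentication articles). The helper deploy_plan is hypothetical; it pairs one deploy command per target with the same credential set and does not run anything.

```python
# Hypothetical target names; use the targets defined in your bundle configuration.
TARGETS = ("dev", "staging", "prod")

# Assumed variable names for OAuth M2M credentials; verify them for your setup.
SHARED_VARS = ("DATABRICKS_CLIENT_ID", "DATABRICKS_CLIENT_SECRET")


def deploy_plan(env, targets=TARGETS):
    """Pair a deploy command for each target with one shared set of credentials."""
    # Raises KeyError if a shared variable is not present in env.
    shared = {name: env[name] for name in SHARED_VARS}
    return [
        (["databricks", "bundle", "deploy", "--target", target], shared)
        for target in targets
    ]
```

In a CI job, each command in the plan could be run with subprocess.run(command, env={**os.environ, **shared}, check=True), so every target deploys with the same service principal.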

Databricks also supports the use of hard-coded, authentication-related settings in the workspace mapping for target workspaces in your bundle configuration files. However, hard-coded settings make your bundle configuration less reusable across projects and risk unnecessarily exposing sensitive information such as Databricks service principal IDs.

For unattended authentication scenarios, you must also install the Databricks CLI on the associated compute resources.

OAuth machine-to-machine (M2M) authentication

To set up OAuth M2M authentication, see Authenticate access to Databricks with a service principal using OAuth (OAuth M2M).

The environment variables to set for unattended authentication are listed under workspace-level operations in the “Environment” section of Authenticate access to Databricks with a service principal using OAuth (OAuth M2M). To set environment variables, see the documentation for your operating system or CI/CD system provider.
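Before a CI/CD job calls the Databricks CLI, it can help to confirm that the expected variables are present. The following sketch assumes that OAuth M2M authentication for workspace-level operations uses DATABRICKS_HOST, DATABRICKS_CLIENT_ID, and DATABRICKS_CLIENT_SECRET; treat the linked article as the authoritative list. The helper missing_vars is hypothetical.

```python
import os

# Assumed variable names for OAuth M2M; confirm them in the linked article.
M2M_VARS = ("DATABRICKS_HOST", "DATABRICKS_CLIENT_ID", "DATABRICKS_CLIENT_SECRET")


def missing_vars(env, required=M2M_VARS):
    """Return the required variable names that are unset or empty in env."""
    return [name for name in required if not env.get(name)]


def require_auth_env(env=None):
    """Stop the job with a readable message if any required variable is missing."""
    missing = missing_vars(os.environ if env is None else env)
    if missing:
        raise SystemExit("Missing environment variables: " + ", ".join(missing))
```

Calling require_auth_env() at the start of a deployment script fails fast with the names of the missing variables, without printing any secret values.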

OAuth user-to-machine (U2M) authentication

To set up OAuth U2M authentication, see the “CLI” section in Authenticate access to Databricks with a user account using OAuth (OAuth U2M).

For attended authentication scenarios, completing the instructions in the “CLI” section of Authenticate access to Databricks with a user account using OAuth (OAuth U2M) automatically creates a Databricks configuration profile for you.
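To check which profiles exist after that step, you can inspect the configuration profile file, which is an INI-style file typically stored at ~/.databrickscfg. The following sketch parses profile text with Python's configparser module. The helper profile_names, the profile name DEV, and the host values are hypothetical placeholders.

```python
import configparser


def profile_names(cfg_text):
    """Return the profile names found in configuration profile text."""
    parser = configparser.ConfigParser()
    parser.read_string(cfg_text)
    names = list(parser.sections())
    # configparser treats [DEFAULT] specially and omits it from sections().
    if parser.defaults():
        names.insert(0, "DEFAULT")
    return names


# Hypothetical file contents for illustration only.
sample = """\
[DEFAULT]
host = https://default.example.com

[DEV]
host = https://dev.example.com
"""

print(profile_names(sample))
# prints: ['DEFAULT', 'DEV']
```

To inspect a real file, read its text first, for example with pathlib.Path.home().joinpath(".databrickscfg").read_text(), and pass the result to profile_names.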

Google Cloud ID authentication

To set up Google Cloud ID authentication, see Google Cloud ID authentication.

The environment variables to set for unattended authentication are listed under workspace-level operations in the “Environment” section of Google Cloud ID authentication. To set environment variables, see the documentation for your operating system or CI/CD system provider.

Databricks personal access token authentication

To create a Databricks personal access token, see Databricks personal access token authentication.

For attended authentication scenarios, to create a Databricks configuration profile, see the “CLI” section in Databricks personal access token authentication.

The environment variables to set for unattended authentication are listed under workspace-level operations in the “Environment” section of Databricks personal access token authentication. To set environment variables, see the documentation for your operating system or CI/CD system provider.