Authentication for the Databricks CLI

Note

This information applies to Databricks CLI versions 0.205 and above, which are in Public Preview. To find your version of the Databricks CLI, run databricks -v.

This article describes how to set up authentication between the Databricks CLI and your Databricks accounts and workspaces. See What is the Databricks CLI?.

This article assumes that you have already installed the Databricks CLI. See Install or update the Databricks CLI.

Before you can run Databricks CLI commands, you must set up authentication between the Databricks CLI and your Databricks accounts, workspaces, or a combination of these, depending on the types of CLI commands that you want to run.

You must authenticate the Databricks CLI to the relevant resources at run time in order to run Databricks automation commands within a Databricks account or workspace. Depending on whether you want to call Databricks workspace-level commands, Databricks account-level commands, or both, you must authenticate to the Databricks workspace, account, or both. For a list of Databricks workspace-level and account-level CLI command groups, run the command databricks -h. For a list of Databricks workspace-level and account-level REST API operations that the Databricks CLI commands cover, see the Databricks REST API.

Note

The Databricks CLI implements the Databricks client unified authentication standard, a consolidated and consistent architectural and programmatic approach to authentication. This approach helps make setting up and automating authentication with Databricks more centralized and predictable. It enables you to configure Databricks authentication once and then use that configuration across multiple Databricks tools and SDKs without further authentication configuration changes. For more information about this standard, see Databricks client unified authentication.

The following sections provide information about how to set up authentication between the Databricks CLI and Databricks:

Databricks personal access token authentication

Databricks personal access token authentication uses a Databricks personal access token to authenticate the target Databricks entity, such as a Databricks user account or a Databricks service principal. See also Databricks personal access token authentication.

Note

You cannot use Databricks personal access token authentication for authenticating with a Databricks account, as Databricks account-level commands do not use Databricks personal access tokens for authentication. To authenticate with a Databricks account, use one of the other authentication types described later in this article instead.

To create a personal access token, do the following:

  1. In your Databricks workspace, click your Databricks username in the top bar, and then select User Settings from the drop-down list.

  2. Click Developer.

  3. Next to Access tokens, click Manage.

  4. Click Generate new token.

  5. (Optional) Enter a comment that helps you to identify this token in the future, and change the token’s default lifetime of 90 days. To create a token that never expires (not recommended), leave the Lifetime (days) box empty.

  6. Click Generate.

  7. Copy the displayed token to a secure location, and then click Done.

Note

Save the copied token in a secure location and do not share it with others. If you lose the copied token, you cannot regenerate that same token; you must repeat this procedure to create a new one. If you lose the token, or you believe that it has been compromised, Databricks strongly recommends that you immediately delete it from your workspace by clicking the trash can (Revoke) icon next to the token on the Access tokens page.

If you are not able to create or use tokens in your workspace, this might be because your workspace administrator has disabled tokens or has not given you permission to create or use tokens. See your workspace administrator.

To configure and use Databricks personal access token authentication, do the following:

Note

The following procedure creates a Databricks configuration profile with the name DEFAULT. If you already have a DEFAULT configuration profile that you want to use, then skip this procedure. Otherwise, this procedure overwrites your existing DEFAULT configuration profile. To view the names and hosts of any existing configuration profiles, run the command databricks auth profiles.

To create a configuration profile with a name other than DEFAULT, add --profile <configuration-profile-name> or -p <configuration-profile-name> to the end of the following databricks configure command, replacing <configuration-profile-name> with the new configuration profile’s name.

  1. Use the Databricks CLI to run the following command:

    databricks configure
    
  2. For the prompt Databricks Host, enter your Databricks workspace instance URL, for example https://1234567890123456.7.gcp.databricks.com.

  3. For the prompt Personal Access Token, enter the Databricks personal access token for your workspace.

    After you enter your Databricks personal access token, a corresponding configuration profile is added to your .databrickscfg file. If the Databricks CLI cannot find this file in its default location, it creates this file for you first and then adds this configuration profile to the new file. The default location for this file is in your ~ (your user home) folder on Unix, Linux, or macOS, or your %USERPROFILE% (your user home) folder on Windows.

  4. You can now use the Databricks CLI’s --profile or -p option followed by the name of your configuration profile, as part of the Databricks CLI command call, for example databricks clusters list -p <configuration-profile-name>.
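
For reference, the profile that databricks configure adds is an INI-style section. The following sketch recreates that shape in a scratch file instead of in your real .databrickscfg file. The host and token field names match the profile fields referenced in this article, the host value is the example URL from step 2, and the token value is a placeholder. The scratch_cfg and profile_count names are illustrative only.

```shell
# Scratch file so that an existing ~/.databrickscfg is left untouched.
scratch_cfg="$(mktemp)"

# A DEFAULT profile with the two fields that personal access token
# authentication relies on. Replace the placeholder with a real token.
cat > "$scratch_cfg" <<'EOF'
[DEFAULT]
host  = https://1234567890123456.7.gcp.databricks.com
token = <personal-access-token>
EOF

# The file holds a secret, so restrict it to the current user.
chmod 600 "$scratch_cfg"

# Count the profile headers to confirm what was written.
profile_count="$(grep -c '^\[' "$scratch_cfg")"
echo "profiles written: $profile_count"
```

Restricting the file permissions is a general precaution for files that contain secrets, not a step that this procedure requires.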

OAuth machine-to-machine (M2M) authentication

Instead of authenticating with Databricks by using Databricks personal access token authentication, you can use OAuth authentication. OAuth provides tokens with shorter expiration times than Databricks personal access tokens, and offers better server-side session invalidation and scoping. Because OAuth access tokens expire in less than an hour, the risk associated with accidentally checking tokens into source control is reduced. See also OAuth machine-to-machine (M2M) authentication.

To configure and use OAuth M2M authentication, do the following:

  1. Complete the OAuth M2M authentication setup instructions. See OAuth machine-to-machine (M2M) authentication.

  2. Create or identify a Databricks configuration profile with the following fields in your .databrickscfg file. If you create the profile, replace the placeholders with the appropriate values.

    For account-level commands, set the following values in your .databrickscfg file:

    [<some-unique-configuration-profile-name>]
    host          = <account-console-url>
    account_id    = <account-id>
    client_id     = <service-principal-client-id>
    client_secret = <service-principal-oauth-secret>
    

    For workspace-level commands, set the following values in your .databrickscfg file:

    [<some-unique-configuration-profile-name>]
    host          = <workspace-url>
    client_id     = <service-principal-client-id>
    client_secret = <service-principal-oauth-secret>
    

    Note

    The default location for the .databrickscfg file is in the user’s home directory. This is ~ for Linux and macOS, and %USERPROFILE% for Windows.

  3. Use the Databricks CLI’s --profile or -p option followed by the name of your configuration profile as part of the Databricks CLI command call, for example, databricks account groups list -p <configuration-profile-name> or databricks clusters list -p <configuration-profile-name>.

    Tip

    Press Tab after --profile or -p to display a list of existing available configuration profiles to choose from, instead of entering the configuration profile name manually.
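
As an alternative to a configuration profile, the same settings can be supplied through environment variables, which the CLI evaluates before it reads .databrickscfg. The following is a minimal sketch for workspace-level commands. It assumes the variable names DATABRICKS_HOST, DATABRICKS_CLIENT_ID, and DATABRICKS_CLIENT_SECRET from the client unified authentication reference; confirm them there before relying on this. The require_vars helper and the m2m_ready variable are hypothetical names introduced only for illustration.

```shell
# Return 0 only if every named environment variable is set and non-empty.
# (Hypothetical helper, not part of the Databricks CLI.)
require_vars() {
  for name in "$@"; do
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing environment variable: $name" >&2
      return 1
    fi
  done
}

# Placeholders: replace with the values for your service principal.
export DATABRICKS_HOST="<workspace-url>"
export DATABRICKS_CLIENT_ID="<service-principal-client-id>"
export DATABRICKS_CLIENT_SECRET="<service-principal-oauth-secret>"

if require_vars DATABRICKS_HOST DATABRICKS_CLIENT_ID DATABRICKS_CLIENT_SECRET; then
  m2m_ready=yes
  # With real values in place, a workspace-level call would go here, e.g.:
  # databricks clusters list
else
  m2m_ready=no
fi
echo "M2M settings present: $m2m_ready"
```

Checking the variables up front makes a missing secret fail with a clear message instead of an authentication error from the CLI.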

OAuth user-to-machine (U2M) authentication

Instead of authenticating with Databricks by using token authentication, you can use OAuth authentication. OAuth provides tokens with shorter expiration times than Databricks personal access tokens, and offers better server-side session invalidation and scoping. Because OAuth access tokens expire in less than an hour, the risk associated with accidentally checking tokens into source control is reduced. See also OAuth user-to-machine (U2M) authentication.

To configure and use OAuth U2M authentication, do the following:

  1. Before calling any Databricks account-level commands, you must initiate OAuth token management locally by running the following command. This command must be run separately for each account that you want to run commands against. If you do not want to call any account-level operations, skip ahead to step 5.

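    In the following command, replace <account-console-url> with your Databricks account console URL, for example https://accounts.gcp.databricks.com, and replace <account-id> with your Databricks account ID.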

    databricks auth login --host <account-console-url> --account-id <account-id>
    
  2. The Databricks CLI prompts you to save the account console URL and account ID locally as a Databricks configuration profile. Press Enter to accept the suggested profile name, or enter the name of a new or existing profile. Any existing profile with the same name is overwritten with this account console URL and account ID.

    To get a list of any existing profiles, in a separate terminal or command prompt, run the command databricks auth profiles. To view a specific profile’s existing settings, run the command databricks auth env --profile <profile-name>.

  3. In your web browser, complete the on-screen instructions to log in to your Databricks account.

  4. To view the current OAuth token value and upcoming expiration timestamp, run the command databricks auth token --host <account-console-url> --account-id <account-id>.

  5. Before calling any Databricks workspace-level commands, you must initiate OAuth token management locally by running the following command. This command must be run separately for each workspace that you want to run commands against.

    In the following command, replace <workspace-url> with your Databricks workspace instance URL, for example https://1234567890123456.7.gcp.databricks.com.

    databricks auth login --host <workspace-url>
    
  6. The Databricks CLI prompts you to save the workspace URL locally as a Databricks configuration profile. Press Enter to accept the suggested profile name, or enter the name of a new or existing profile. Any existing profile with the same name is overwritten with this workspace URL.

    To get a list of any existing profiles, in a separate terminal or command prompt, run the command databricks auth profiles. To view a specific profile’s existing settings, run the command databricks auth env --profile <profile-name>.

  7. In your web browser, complete the on-screen instructions to log in to your Databricks workspace.

  8. To view the current OAuth token value and upcoming expiration timestamp, run the command databricks auth token --host <workspace-url>.

  9. Use the Databricks CLI’s --profile or -p option followed by the name of your configuration profile, as part of the Databricks CLI command call, for example databricks account groups list -p <configuration-profile-name> or databricks clusters list -p <configuration-profile-name>.

    Tip

    You can press Tab after --profile or -p to display a list of existing available configuration profiles to choose from, instead of entering the configuration profile name manually.
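
Because databricks auth login must be run once per workspace, it can help to generate the commands from a list of workspace URLs. The following sketch only prints the commands (a dry run), since each login opens a browser and cannot be completed unattended. The two workspace URLs are made-up examples, and login_count is an illustrative variable name.

```shell
# Print one login command per workspace URL read from the list below.
# Nothing is executed; copy and run each printed command interactively.
login_count=0
while IFS= read -r url; do
  # Skip blank lines in the list.
  [ -n "$url" ] || continue
  echo "databricks auth login --host $url"
  login_count=$((login_count + 1))
done <<'EOF'
https://1111111111111111.1.gcp.databricks.com
https://2222222222222222.2.gcp.databricks.com
EOF
echo "login commands printed: $login_count"
```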

Google Cloud credentials authentication

Google Cloud credentials authentication uses Google Cloud service account credentials to authenticate the target Google Cloud service account. See also Google Cloud credentials authentication.

To configure Google Cloud credentials authentication, you must have the Google Cloud CLI installed locally. You must also do the following:

  1. Create or identify a Databricks configuration profile with the following fields in your .databrickscfg file. If you create the profile, replace the placeholders with the appropriate values.

    For account-level commands, set the following values in your .databrickscfg file:

    [<some-unique-configuration-profile-name>]
    host               = <account-console-url>
    account_id         = <account-id>
    google_credentials = <path-to-google-service-account-credentials-file>
    

    For workspace-level commands, set the following values in your .databrickscfg file:

    [<some-unique-configuration-profile-name>]
    host               = <workspace-url>
    google_credentials = <path-to-google-service-account-credentials-file>
    

    Note

    The default location for the .databrickscfg file is in the user’s home directory. This is ~ for Linux and macOS, and %USERPROFILE% for Windows.

  2. Use the Databricks CLI’s --profile or -p option followed by the name of your configuration profile, as part of the Databricks CLI command call, for example databricks account groups list -p <configuration-profile-name> or databricks clusters list -p <configuration-profile-name>.

    Tip

    You can press Tab after --profile or -p to display a list of existing available configuration profiles to choose from, instead of entering the configuration profile name manually.
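
Before you run commands with a profile that sets google_credentials, you might want to confirm that the credentials file is readable and that the Google Cloud CLI is on your PATH. The following is a minimal preflight sketch, not something the Databricks CLI requires. The check_credentials_file function and the sample_key file are hypothetical and stand in for your own service account credentials file.

```shell
# Succeed only if the credentials file can be read. Warn, but do not fail,
# when the gcloud executable is not found on PATH.
check_credentials_file() {
  if [ ! -r "$1" ]; then
    echo "cannot read credentials file: $1" >&2
    return 1
  fi
  if ! command -v gcloud >/dev/null 2>&1; then
    echo "warning: gcloud was not found on PATH" >&2
  fi
  return 0
}

# Stand-in for <path-to-google-service-account-credentials-file>.
sample_key="$(mktemp)"

if check_credentials_file "$sample_key"; then readable_ok=yes; else readable_ok=no; fi
if check_credentials_file "$sample_key.missing" 2>/dev/null; then missing_ok=yes; else missing_ok=no; fi
echo "existing file: $readable_ok, missing file: $missing_ok"
```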

Google Cloud ID authentication

Google Cloud ID authentication authenticates the target Google Cloud service account. See Google Cloud ID authentication.

To configure Google Cloud ID authentication, you must have the Google Cloud CLI installed locally. You must also do the following:

  1. Create or identify a Databricks configuration profile with the following fields in your .databrickscfg file. If you create the profile, replace the placeholders with the appropriate values.

    For account-level commands, set the following values in your .databrickscfg file:

    [<some-unique-configuration-profile-name>]
    host                   = <account-console-url>
    account_id             = <account-id>
    google_service_account = <google-cloud-service-account-email-address>
    

    For workspace-level commands, set the following values in your .databrickscfg file:

    [<some-unique-configuration-profile-name>]
    host                   = <workspace-url>
    google_service_account = <google-cloud-service-account-email-address>
    

    Note

    The default location for the .databrickscfg file is in the user’s home directory. This is ~ for Linux and macOS, and %USERPROFILE% for Windows.

  2. Use the Databricks CLI’s --profile or -p option followed by the name of your configuration profile, as part of the Databricks CLI command call, for example databricks clusters list -p <configuration-profile-name>.

    Tip

    You can press Tab after --profile or -p to display a list of existing available configuration profiles to choose from, instead of entering the configuration profile name manually.

Authentication order of evaluation

Whenever the Databricks CLI needs to gather the settings that are required to attempt to authenticate with a Databricks workspace or account, it searches for these settings in the following locations, in the following order.

  1. For bundle commands, the values of fields within a project’s bundle settings files. (Bundle settings files do not support the direct inclusion of access credential values.)

  2. The values of environment variables, as listed within this article and in Environment variables and fields for client unified authentication.

  3. Configuration profile field values within the .databrickscfg file, as listed previously within this article.

As soon as the Databricks CLI finds the settings that it needs, it stops searching the remaining locations. For example:

  • The Databricks CLI needs the value of a Databricks personal access token. A DATABRICKS_TOKEN environment variable is set, and the .databrickscfg file also contains multiple personal access tokens. In this example, the Databricks CLI uses the value of the DATABRICKS_TOKEN environment variable and does not search the .databrickscfg file.

  • The databricks bundle deploy -e development command needs the value of a Databricks personal access token. A DATABRICKS_TOKEN environment variable is not set, and the .databrickscfg file contains multiple personal access tokens. The project’s bundle settings file contains a development environment declaration that references through its profile field a configuration profile named DEV. In this example, the Databricks CLI searches the .databrickscfg file for a profile named DEV and uses the value of that profile’s token field.

  • The databricks bundle run -e development hello-job command needs the value of a Databricks personal access token. A DATABRICKS_TOKEN environment variable is not set, and the .databrickscfg file contains multiple personal access tokens. The project’s bundle settings file contains a development environment declaration that references through its host field a specific Databricks workspace URL. In this example, the Databricks CLI searches through the configuration profiles within the .databrickscfg file for a profile that contains a host field with a matching workspace URL. The Databricks CLI finds a matching host field and then uses that profile’s token field value.
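
The first two examples above can be sketched as a small lookup: use DATABRICKS_TOKEN when it is set, and otherwise read the token field of a named profile. This is an illustration of the precedence described in this section, not the CLI's actual implementation. The profile_field and resolve_token functions, the demo file, and its host and token values are all hypothetical.

```shell
# Print the value of one field from one profile in an INI-style file.
# $1 = config file, $2 = profile name, $3 = field name
profile_field() {
  awk -v profile="$2" -v field="$3" '
    /^\[.*\]$/ { current = substr($0, 2, length($0) - 2); next }
    current == profile {
      split($0, kv, "=")
      key = kv[1]; gsub(/[ \t]/, "", key)
      if (key == field) {
        value = substr($0, index($0, "=") + 1)
        gsub(/^[ \t]+|[ \t]+$/, "", value)
        print value
        exit
      }
    }
  ' "$1"
}

# Environment variable first, then the named profile in the config file.
# $1 = config file, $2 = profile name
resolve_token() {
  if [ -n "${DATABRICKS_TOKEN:-}" ]; then
    printf '%s\n' "$DATABRICKS_TOKEN"
  else
    profile_field "$1" "$2" token
  fi
}

# Demo configuration file with two profiles and made-up values.
demo_cfg="$(mktemp)"
cat > "$demo_cfg" <<'EOF'
[DEFAULT]
host  = https://example-default.gcp.databricks.com
token = default-token

[DEV]
host  = https://example-dev.gcp.databricks.com
token = dev-token
EOF

unset DATABRICKS_TOKEN
from_profile="$(resolve_token "$demo_cfg" DEV)"   # falls back to the DEV profile
DATABRICKS_TOKEN="env-token"
from_env="$(resolve_token "$demo_cfg" DEV)"       # environment variable wins
unset DATABRICKS_TOKEN
echo "without env var: $from_profile; with env var: $from_env"
```

The third example, which matches a profile by its host field, would need one more lookup that scans every profile for a matching host before reading that profile's token.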