Authenticate access to Databricks resources

To access a Databricks resource with the Databricks CLI or REST APIs, clients must authenticate using a Databricks account that has the required authorization to access the resource. To securely run a Databricks CLI command or call a Databricks REST API that requires authorized access to an account or workspace, you must provide an access token based on valid Databricks account credentials. This article covers the authentication options for providing those credentials and authorizing access to a Databricks workspace or account.

The following table shows the authentication methods available to your Databricks account.

Databricks authentication methods

Because Databricks tools and SDKs work with one or more supported Databricks authentication methods, you can select the best authentication method for your use case. For details, see the tool or SDK documentation in Developer tools.

| Method | Description | Use case |
| --- | --- | --- |
| OAuth for service principals (OAuth M2M) | Short-lived OAuth tokens for service principals. | Unattended authentication scenarios, such as fully automated and CI/CD workflows. |
| OAuth for users (OAuth U2M) | Short-lived OAuth tokens for users. | Attended authentication scenarios, where you use your web browser to authenticate with Databricks in real time, when prompted. |
| Personal access tokens (PAT) | Short-lived or long-lived tokens for users or service principals. | Scenarios where your target tool does not support OAuth. |
| Google Cloud Platform credentials authentication | Uses Google Cloud service accounts, acting as Databricks users, with Google Cloud OAuth tokens. | Use to authenticate to Google Cloud resources and Databricks. |
| Google Cloud Platform ID authentication | Uses Google Cloud service accounts, acting as Databricks users, with Google Cloud OAuth tokens. | Use to authenticate to Google Cloud resources and Databricks using the Google Cloud CLI. |

What authentication approach should I choose?

You have two options to authenticate a Databricks CLI command or API call for access to your Databricks resources:

  • Use a Databricks user account (called “user-to-machine” authentication, or U2M). Choose this only when you are running a Databricks CLI command from your local client environment or calling a Databricks API request from code you own and run exclusively.

  • Use a Databricks service principal (called “machine-to-machine” authentication, or M2M). Choose this if others will be running your code (especially in the case of an app), or if you are building automation that will call Databricks CLI commands or API requests.

You must also have an access token linked to the account you will use to call the Databricks API. This token can be either an OAuth 2.0 access token or a personal access token (PAT). However, Databricks strongly recommends OAuth over PATs: OAuth tokens are refreshed automatically by default and do not require you to manage the access token directly, which improves your protection against token hijacking and unwanted access. Because OAuth creates and manages the access token for you, you provide an OAuth token endpoint URL, a client ID, and a secret that you generate from your Databricks workspace, instead of providing a token string directly. Long-lived PATs, by contrast, create a risk of unwanted access if they are not regularly audited and rotated or revoked, or if the token strings and passwords are not securely managed in your development environment.
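For illustration, here is a minimal sketch of the OAuth M2M client credentials exchange in Python. It assumes the workspace-level token endpoint path /oidc/v1/token and the all-apis scope; the workspace URL, client ID, and client secret are placeholders you would replace with your own values.

```python
import requests

# Placeholders: replace with your workspace URL and service principal credentials.
workspace_url = "https://my-workspace.cloud.databricks.com"
client_id = "my-service-principal-client-id"
client_secret = "my-service-principal-client-secret"

# Exchange the client ID and secret for a short-lived OAuth access token
# using the client credentials grant (assumes the /oidc/v1/token endpoint).
resp = requests.post(
    f"{workspace_url}/oidc/v1/token",
    auth=(client_id, client_secret),
    data={"grant_type": "client_credentials", "scope": "all-apis"},
)
resp.raise_for_status()
access_token = resp.json()["access_token"]

# Pass the token as a bearer token on subsequent REST API calls.
headers = {"Authorization": f"Bearer {access_token}"}
```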

How do I use OAuth to authenticate with Databricks?

Databricks provides unified client authentication, which uses a default set of environment variables that you can set to specific credential values. This helps you work more easily and securely, because these environment variables are specific to the environment that will run the Databricks CLI commands or call the Databricks APIs.

  • For user account (user-to-machine) authentication, Databricks OAuth is handled for you with Databricks client unified authentication, as long as the tools and SDKs you use implement its standard. If they don’t, you can manually generate an OAuth code verifier and challenge pair to use directly in your Databricks CLI commands and API requests (a minimal sketch of generating such a pair follows this list). See Step 1: Generate an OAuth code verifier and code challenge pair.

  • For service principal (machine-to-machine) authentication, Databricks OAuth requires that the caller provide client credentials along with a token endpoint URL where the request can be authorized. (This is handled for you if you use Databricks tools and SDKs that support Databricks unified client authentication.) The credentials include a unique client ID and client secret. The client, which is the Databricks service principal that will run your code, must be assigned to the Databricks workspaces it will access. After you assign the service principal to those workspaces, you are provided with a client ID and a client secret that you set using specific environment variables.
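If you do need to generate the code verifier and code challenge yourself, the pair follows the standard PKCE construction from RFC 7636. A minimal Python sketch, independent of any Databricks library:

```python
import base64
import hashlib
import secrets

# Generate a random, URL-safe code verifier (RFC 7636 allows 43-128 characters).
code_verifier = secrets.token_urlsafe(64)

# The code challenge is the base64url-encoded SHA-256 digest of the verifier,
# with the trailing '=' padding removed.
digest = hashlib.sha256(code_verifier.encode("ascii")).digest()
code_challenge = base64.urlsafe_b64encode(digest).decode("ascii").rstrip("=")

print("code_verifier: ", code_verifier)
print("code_challenge:", code_challenge)
```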

The environment variables used by unified client authentication are:

  • DATABRICKS_HOST: This environment variable is set to the URL of either your Databricks account console (https://accounts.cloud.databricks.com) or your Databricks workspace URL (https://{workspace-id}.cloud.databricks.com). Choose the host URL based on the type of operations you are performing in your code: if you are using Databricks account-level CLI commands or REST API requests, set this variable to your Databricks account URL; if you are using Databricks workspace-level CLI commands or REST API requests, use your Databricks workspace URL.

  • DATABRICKS_ACCOUNT_ID: Used for Databricks account operations. This is your Databricks account ID. To get it, see Locate your account ID.

  • DATABRICKS_CLIENT_ID: (M2M OAuth only) The client ID you were assigned when creating your service principal.

  • DATABRICKS_CLIENT_SECRET: (M2M OAuth only) The client secret you generated when creating your service principal.

You can set these environment variables directly, or by using a Databricks configuration profile (.databrickscfg) on your client machine.
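As an illustration, the Databricks SDK for Python implements unified client authentication and reads these environment variables automatically. A minimal sketch, assuming the databricks-sdk package is installed and that DATABRICKS_HOST, DATABRICKS_CLIENT_ID, and DATABRICKS_CLIENT_SECRET are already set for M2M OAuth:

```python
from databricks.sdk import WorkspaceClient

# With no arguments, the SDK resolves credentials through unified client
# authentication: environment variables first, then a configuration profile.
w = WorkspaceClient()

# Any workspace-level API call now uses the resolved OAuth credentials.
me = w.current_user.me()
print(me.user_name)
```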

To use an OAuth access token, your Databricks workspace or account administrator must have granted your user account or service principal the CAN USE privilege for the account and workspace features your code will access.

For more details on configuring OAuth authorization for your client and to review cloud provider-specific authorization options, see Unified client authentication.

Authentication for third-party services and tools

If you are writing code that accesses third-party services, tools, or SDKs, you must use the authentication and authorization mechanisms provided by that third party. However, if you must grant a third-party tool, SDK, or service access to your Databricks account or workspace resources, Databricks provides the following support:

Databricks configuration profiles

A Databricks configuration profile contains settings and other information that Databricks needs to authenticate. Databricks configuration profiles are stored in local client files for your tools, SDKs, scripts, and apps to use. The standard configuration profile file is named .databrickscfg. For more information, see Databricks configuration profiles.
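For example, a named profile can be selected when constructing a client. The sketch below assumes the Databricks SDK for Python and a profile named MY-PROFILE in ~/.databrickscfg; the profile name and field values are placeholders:

```python
# ~/.databrickscfg (illustrative contents; all values are placeholders):
#
# [MY-PROFILE]
# host          = https://my-workspace.cloud.databricks.com
# client_id     = my-service-principal-client-id
# client_secret = my-service-principal-client-secret

from databricks.sdk import WorkspaceClient

# Select the named profile instead of relying on environment variables.
w = WorkspaceClient(profile="MY-PROFILE")
print(w.current_user.me().user_name)
```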