Authenticate access to Databricks with a user account using OAuth (OAuth U2M)
Databricks uses OAuth user-to-machine (U2M) authentication to enable CLI and API access to Databricks account and workspace resources on behalf of a user. After a user initially signs in and consents to the OAuth authentication request, an OAuth token is given to the participating tool or SDK to perform token-based authentication on the user’s behalf from that time forward. The OAuth token has a lifespan of one hour, following which the tool or SDK involved will make an automatic background attempt to obtain a new token that is also valid for one hour.
Databricks supports two ways to authenticate access for a user account with OAuth:
Mostly automatically, using the Databricks unified client authentication support. Use this simplified approach if you are using specific Databricks SDKs (such as the Databricks Terraform SDK) and tools. Supported tools and SDKs are listed in Databricks unified client authentication.
Manually, by directly generating an OAuth code verifier/challenge pair and an authorization code, and using them to create the initial OAuth token you will provide in your configuration. Use this approach when you are not using an API supported by Databricks unified client authentication. For more details see: Manually generate and use access tokens for OAuth user-to-machine (U2M) authentication.
U2M authentication with Databricks unified client authentication
Note
Before you start configuring your authentication, review the ACL permissions for a specific category of operations on workspace objects and determine whether your account has the access level you require. For more details, see Access control lists.
To perform OAuth U2M authentication with Databricks SDKs and tools that support unified client authentication, integrate the following within your code:
To use environment variables for a specific Databricks authentication type with a tool or SDK, see Authenticate access to Databricks resources or the tool’s or SDK’s documentation. See also Environment variables and fields for client unified authentication and the Default methods for client unified authentication.
For account-level operations, set the following environment variables:
DATABRICKS_HOST, set to the value of your Databricks account console URL, https://accounts.gcp.databricks.com.
DATABRICKS_ACCOUNT_ID
For workspace-level operations, set the following environment variables:
DATABRICKS_HOST, set to the value of your Databricks workspace URL, for example https://1234567890123456.7.gcp.databricks.com.
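As a rough illustration of the two variable sets above, the following Python sketch reports which required variables are missing before you start a tool or SDK. The helper name missing_auth_vars is hypothetical and not part of any Databricks library; it only checks that the variables are non-empty.

```python
import os


def missing_auth_vars(env, account_level):
    """Return the names of required variables that are unset or empty.

    Hypothetical helper: workspace-level operations need DATABRICKS_HOST,
    account-level operations also need DATABRICKS_ACCOUNT_ID.
    """
    required = ["DATABRICKS_HOST"]
    if account_level:
        required.append("DATABRICKS_ACCOUNT_ID")
    return [name for name in required if not env.get(name)]
```

For example, missing_auth_vars(os.environ, account_level=True) returns an empty list when both variables are set in the current process.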
Create or identify a Databricks configuration profile with the following fields in your .databrickscfg file. If you create the profile, replace the placeholders with the appropriate values. To use the profile with a tool or SDK, see Authenticate access to Databricks resources or the tool’s or SDK’s documentation. See also Environment variables and fields for client unified authentication and the Default methods for client unified authentication.
For account-level operations, set the following values in your .databrickscfg file. In this case, the Databricks account console URL is https://accounts.gcp.databricks.com:
[<some-unique-configuration-profile-name>]
host = <account-console-url>
account_id = <account-id>
For workspace-level operations, set the following values in your .databrickscfg file. In this case, the host is the Databricks workspace URL, for example https://1234567890123456.7.gcp.databricks.com:
[<some-unique-configuration-profile-name>]
host = <workspace-url>
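Because the profiles above use an INI-style layout, they can be rendered and read back with Python's standard configparser module. The following is a minimal sketch under that assumption; the function names and the profile name DEV are hypothetical, and the text is kept in memory rather than written to your real .databrickscfg file.

```python
import configparser
import io


def render_profile(name, host, account_id=None):
    """Render one configuration profile as INI text (illustrative only)."""
    parser = configparser.ConfigParser()
    parser[name] = {"host": host}
    if account_id is not None:
        # Only account-level profiles carry an account_id field.
        parser[name]["account_id"] = account_id
    buffer = io.StringIO()
    parser.write(buffer)
    return buffer.getvalue()


def read_profile(text, name):
    """Parse INI text and return the fields of the named profile."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return dict(parser[name])
```

Calling render_profile("DEV", "https://accounts.gcp.databricks.com", "<account-id>") produces a section with the same host and account_id fields as the account-level profile shown above.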
For the Databricks CLI, run the databricks auth login command with the following options:
For Databricks account-level operations, --host <account-console-url> --account-id <account-id>.
For Databricks workspace-level operations, --host <workspace-url>.
After you run this command, follow the instructions in your web browser to log in to your Databricks account or workspace.
For more details, see OAuth U2M authentication with the Databricks CLI.
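If you launch the login from a script, the two option sets above can be assembled as an argument list. This is only a sketch: the helper name auth_login_args is hypothetical, and it uses just the --host and --account-id options described above.

```python
def auth_login_args(host, account_id=None):
    """Build the argument list for `databricks auth login`.

    Pass account_id for account-level operations; omit it for
    workspace-level operations.
    """
    args = ["databricks", "auth", "login", "--host", host]
    if account_id is not None:
        args += ["--account-id", account_id]
    return args
```

You could pass the result to subprocess.run(args, check=True), which assumes the Databricks CLI is installed and on your PATH. The command then continues in your web browser as described above.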
Note
OAuth U2M authentication is supported in the following Databricks Connect versions:
For Python, Databricks Connect for Databricks Runtime 13.1 and above.
For Scala, Databricks Connect for Databricks Runtime 13.3 LTS and above.
For Databricks Connect, you can do one of the following:
Set the values in your .databrickscfg file for Databricks workspace-level operations as specified in this article’s “Profile” section. Also set the cluster_id field in your profile to the ID of your cluster.
Set the environment variables for Databricks workspace-level operations as specified in this article’s “Environment” section. Also set the DATABRICKS_CLUSTER_ID environment variable to the ID of your cluster.
Values in your .databrickscfg file always take precedence over environment variables.
To initialize the Databricks Connect client with these environment variables or values in your .databrickscfg file, see one of the following:
For Python, see Configure connection properties for Python.
For Scala, see Configure connection properties for Scala.
For the Databricks extension for Visual Studio Code, do the following:
In the Configuration pane, click Configure Databricks.
In the Command Palette, for Databricks Host, enter your workspace URL, for example https://1234567890123456.7.gcp.databricks.com, and then press Enter.
Select OAuth (user to machine).
Complete the on-screen instructions within your web browser to finish authenticating with your Databricks account and allowing all-apis access.
For more details, see OAuth U2M authentication with the Databricks CLI.
Note
OAuth U2M authentication is not yet supported.
For both account-level and workspace-level operations, you must use the Databricks CLI to run the following command before you run your Python code. This command instructs the Databricks CLI to generate and cache the necessary OAuth token in the path .databricks/token-cache.json within your user’s home folder on your machine:
Configuring for Databricks account-level operations
databricks auth login --host <account-console-url> --account-id <account-id>
Replace the following placeholders:
Replace <account-console-url> with the value https://accounts.gcp.databricks.com. (Do not set this to the value of your Databricks workspace URL.)
Replace <account-id> with your Databricks account ID. See Locate your account ID.
Note
If you have an existing Databricks configuration profile with the host and account_id fields already set, you can replace --host <account-console-url> --account-id <account-id> with --profile <profile-name>.
After you run the auth login command, you are prompted to save the account login URL and account ID as a Databricks configuration profile. When prompted, enter the name of a new or existing profile in your .databrickscfg file. Any existing profile with the same name in your .databrickscfg file is overwritten.
If prompted, complete your web browser’s on-screen instructions to complete the login. Then use Python code similar to one of the following snippets:
from databricks.sdk import AccountClient
a = AccountClient()
# ...
For direct configuration (replace the retrieve placeholders with your own implementation to retrieve the values from the console or some other configuration store, such as Google Cloud Secret Manager). In this case, the Databricks account console URL is https://accounts.gcp.databricks.com:
from databricks.sdk import AccountClient
a = AccountClient(
    host=retrieveAccountConsoleUrl(),
    account_id=retrieveAccountId(),
)
# ...
Configuring for Databricks workspace-level operations
databricks auth login --host <workspace-url>
Replace the placeholder <workspace-url> with the target Databricks workspace URL, for example https://1234567890123456.7.gcp.databricks.com.
Note
If you have an existing Databricks configuration profile with the host field already set, you can replace --host <workspace-url> with --profile <profile-name>.
After you run the auth login command, you are prompted to save the workspace URL as a Databricks configuration profile. When prompted, enter the name of a new or existing profile in your .databrickscfg file. Any existing profile with the same name in your .databrickscfg file is overwritten.
If prompted, complete your web browser’s on-screen instructions to complete the login. Then use Python code similar to one of the following snippets:
from databricks.sdk import WorkspaceClient
w = WorkspaceClient()
# ...
For direct configuration (replace the retrieve placeholders with your own implementation to retrieve the values from the console or some other configuration store, such as Google Cloud Secret Manager). In this case, the host is the Databricks workspace URL, for example https://1234567890123456.7.gcp.databricks.com:
from databricks.sdk import WorkspaceClient
w = WorkspaceClient(host = retrieveWorkspaceUrl())
# ...
For more information about authenticating with Databricks tools and SDKs that use Python and that implement Databricks client unified authentication, see:
For both account-level and workspace-level operations, you must use the Databricks CLI to run the following command before you run your Java code. This command instructs the Databricks CLI to generate and cache the necessary OAuth token in the path .databricks/token-cache.json within your user’s home folder on your machine:
Configuring for Databricks account-level operations
databricks auth login --host <account-console-url> --account-id <account-id>
Replace the following placeholders:
Replace <account-console-url> with the value https://accounts.gcp.databricks.com. (Do not set this to the value of your Databricks workspace URL.)
Replace <account-id> with your Databricks account ID. See Locate your account ID.
Note
If you have an existing Databricks configuration profile with the host and account_id fields already set, you can replace --host <account-console-url> --account-id <account-id> with --profile <profile-name>.
After you run the auth login command, you are prompted to save the account login URL and account ID as a Databricks configuration profile. When prompted, enter the name of a new or existing profile in your .databrickscfg file. Any existing profile with the same name in your .databrickscfg file is overwritten.
If prompted, complete your web browser’s on-screen instructions to complete the login. Then use Java code similar to one of the following snippets:
import com.databricks.sdk.AccountClient;
// ...
AccountClient a = new AccountClient();
// ...
For direct configuration (replace the retrieve placeholders with your own implementation to retrieve the values from the console or some other configuration store, such as Google Cloud Secret Manager). In this case, the Databricks account console URL is https://accounts.gcp.databricks.com:
import com.databricks.sdk.AccountClient;
import com.databricks.sdk.core.DatabricksConfig;
// ...
DatabricksConfig cfg = new DatabricksConfig()
.setHost(retrieveAccountConsoleUrl())
.setAccountId(retrieveAccountId());
AccountClient a = new AccountClient(cfg);
// ...
Configuring for Databricks workspace-level operations
For workspace-level operations, first use the Databricks CLI to run the following command before you run your Java code. This command instructs the Databricks CLI to generate and cache the necessary OAuth token in the path .databricks/token-cache.json within your user’s home folder on your machine:
databricks auth login --host <workspace-url>
Replace the placeholder <workspace-url> with the target Databricks workspace URL, for example https://1234567890123456.7.gcp.databricks.com.
Note
If you have an existing Databricks configuration profile with the host field already set, you can replace --host <workspace-url> with --profile <profile-name>.
After you run the auth login command, you are prompted to save the workspace URL as a Databricks configuration profile. When prompted, enter the name of a new or existing profile in your .databrickscfg file. Any existing profile with the same name in your .databrickscfg file is overwritten.
If prompted, complete your web browser’s on-screen instructions to complete the login. Then use Java code similar to one of the following snippets:
import com.databricks.sdk.WorkspaceClient;
// ...
WorkspaceClient w = new WorkspaceClient();
// ...
For direct configuration (replace the retrieve placeholders with your own implementation to retrieve the values from the console or some other configuration store, such as Google Cloud Secret Manager). In this case, the host is the Databricks workspace URL, for example https://1234567890123456.7.gcp.databricks.com:
import com.databricks.sdk.WorkspaceClient;
import com.databricks.sdk.core.DatabricksConfig;
// ...
DatabricksConfig cfg = new DatabricksConfig()
  .setHost(retrieveWorkspaceUrl());
WorkspaceClient w = new WorkspaceClient(cfg);
// ...
For more information about authenticating with Databricks tools and SDKs that use Java and that implement Databricks client unified authentication, see:
Set up the Databricks Connect client for Scala (the Databricks Connect client for Scala uses the included Databricks SDK for Java for authentication)
Authenticate the Databricks SDK for Java with your Databricks account or workspace
For both account-level and workspace-level operations, you must use the Databricks CLI to run the following command before you run your Go code. This command instructs the Databricks CLI to generate and cache the necessary OAuth token in the path .databricks/token-cache.json within your user’s home folder on your machine:
Configuring for Databricks account-level operations
databricks auth login --host <account-console-url> --account-id <account-id>
Replace the following placeholders:
Replace <account-console-url> with the value https://accounts.gcp.databricks.com. (Do not set this to the value of your Databricks workspace URL.)
Replace <account-id> with your Databricks account ID. See Locate your account ID.
Note
If you have an existing Databricks configuration profile with the host and account_id fields already set, you can replace --host <account-console-url> --account-id <account-id> with --profile <profile-name>.
After you run the auth login command, you are prompted to save the account login URL and account ID as a Databricks configuration profile. When prompted, enter the name of a new or existing profile in your .databrickscfg file. Any existing profile with the same name in your .databrickscfg file is overwritten.
If prompted, complete your web browser’s on-screen instructions to complete the login. Then use Go code similar to one of the following snippets:
import (
"github.com/databricks/databricks-sdk-go"
)
// ...
a := databricks.Must(databricks.NewAccountClient())
// ...
For direct configuration (replace the retrieve placeholders with your own implementation to retrieve the values from the console or some other configuration store, such as Google Cloud Secret Manager). In this case, the Databricks account console URL is https://accounts.gcp.databricks.com:
import (
"github.com/databricks/databricks-sdk-go"
)
// ...
a := databricks.Must(databricks.NewAccountClient(&databricks.Config{
Host: retrieveAccountConsoleUrl(),
AccountId: retrieveAccountId(),
}))
// ...
Configuring for Databricks workspace-level operations
For workspace-level operations, first use the Databricks CLI to run the following command before you run your Go code. This command instructs the Databricks CLI to generate and cache the necessary OAuth token in the path .databricks/token-cache.json within your user’s home folder on your machine:
databricks auth login --host <workspace-url>
Replace the placeholder <workspace-url> with the target Databricks workspace URL, for example https://1234567890123456.7.gcp.databricks.com.
Note
If you have an existing Databricks configuration profile with the host field already set, you can replace --host <workspace-url> with --profile <profile-name>.
After you run the auth login command, you are prompted to save the workspace URL as a Databricks configuration profile. When prompted, enter the name of a new or existing profile in your .databrickscfg file. Any existing profile with the same name in your .databrickscfg file is overwritten.
If prompted, complete your web browser’s on-screen instructions to complete the login. Then use Go code similar to one of the following snippets:
import (
"github.com/databricks/databricks-sdk-go"
)
// ...
w := databricks.Must(databricks.NewWorkspaceClient())
// ...
For direct configuration (replace the retrieve placeholders with your own implementation to retrieve the values from the console or some other configuration store, such as Google Cloud Secret Manager). In this case, the host is the Databricks workspace URL, for example https://1234567890123456.7.gcp.databricks.com:
import (
"github.com/databricks/databricks-sdk-go"
)
// ...
w := databricks.Must(databricks.NewWorkspaceClient(&databricks.Config{
Host: retrieveWorkspaceUrl(),
}))
// ...
For more information about authenticating with Databricks tools and SDKs that use Go and that implement Databricks client unified authentication, see Authenticate the Databricks SDK for Go with your Databricks account or workspace.
Manually generate and use access tokens for OAuth user-to-machine (U2M) authentication
Databricks tools and SDKs that implement the Databricks client unified authentication standard will automatically generate, refresh, and use Databricks OAuth access tokens on your behalf as needed for OAuth U2M authentication.
If for some reason you must manually generate, refresh, or use Databricks OAuth access tokens for OAuth U2M authentication, follow the instructions in this section.
Step 1: Generate an OAuth code verifier and code challenge pair
To manually generate and use access tokens for OAuth U2M authentication, you must first have an OAuth code verifier and an OAuth code challenge that is derived from the code verifier. You use the code challenge in Step 2 to generate an OAuth authorization code. You use the code verifier and the authorization code in Step 3 to generate the OAuth access token.
Note
While it is technically possible to use unencoded plain-text strings for the code verifier and code challenge, Databricks strongly encourages following the OAuth standard for generating the code verifier and code challenge instead.
Specifically, the code verifier should be a cryptographically random string, between 43 and 128 characters long, using characters from the sets A-Z, a-z, 0-9, and the punctuation characters -._~ (hyphen, period, underscore, and tilde). The code challenge should be a Base64-URL-encoded string of the SHA256 hash of the code verifier. For more information, see Authorization Request.
You can run the following Python script to quickly generate a unique code verifier and code challenge pair. While you can reuse this generated code verifier and code challenge pair multiple times, Databricks recommends that you generate a new code verifier and code challenge pair each time that you manually generate access tokens for OAuth U2M authentication.
import uuid, hashlib, base64
# Generate a UUID.
uuid1 = uuid.uuid4()
# Convert the UUID to a string.
uuid_str1 = str(uuid1).upper()
# Create the code verifier.
code_verifier = uuid_str1 + "-" + uuid_str1
# Create the code challenge based on the code verifier.
code_challenge = base64.urlsafe_b64encode(hashlib.sha256(code_verifier.encode()).digest()).decode('utf-8')
# Remove all padding from the code challenge.
code_challenge = code_challenge.replace('=', '')
# Print the code verifier and the code challenge.
# Use these in your calls to manually generate
# access tokens for OAuth U2M authentication.
print(f"code_verifier: {code_verifier}")
print(f"code_challenge: {code_challenge}")
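As an alternative sketch, the verifier can be drawn from Python's secrets module, which is intended for security-sensitive random values. The function names below are hypothetical. With the default of 64 random bytes the verifier is 86 characters long, which falls inside the 43 to 128 character range described above, and it uses only letters, digits, hyphens, and underscores.

```python
import base64
import hashlib
import secrets


def code_challenge_for(code_verifier):
    """Return the unpadded Base64-URL-encoded SHA256 hash of the verifier."""
    digest = hashlib.sha256(code_verifier.encode("ascii")).digest()
    return base64.urlsafe_b64encode(digest).decode("ascii").rstrip("=")


def make_pkce_pair(num_bytes=64):
    """Generate a (code_verifier, code_challenge) pair.

    Keep num_bytes between 32 and 96 so that the verifier stays
    between 43 and 128 characters long.
    """
    code_verifier = secrets.token_urlsafe(num_bytes)
    return code_verifier, code_challenge_for(code_verifier)
```

Call make_pkce_pair() once per manual token request, use the challenge in Step 2, and keep the verifier for Step 3.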
Step 2: Generate an authorization code
You use an OAuth authorization code to generate a Databricks OAuth access token. The authorization code expires immediately after you use it to generate a Databricks OAuth access token. The scope of the authorization code depends on the level that you generate it from. You can generate an authorization code at either the Databricks account level or workspace level, as follows:
To call account-level and workspace-level REST APIs within accounts and workspaces that your Databricks user account has access to, generate an authorization code at the account level.
To call REST APIs within only one workspace that your user account has access to, you can generate an authorization code at the workspace level for only that workspace.
Generate an account-level authorization code
As an account admin, log in to the account console.
Click the down arrow next to your username in the upper right corner.
Copy your Account ID.
In your web browser’s address bar, browse to the following URL. Line breaks have been added for readability. Your URL must not contain these line breaks.
In the following URL, replace the following:
Replace <account-id> with the Account ID that you copied.
Replace <redirect-url> with a redirect URL to your local machine, for example http://localhost:8020.
Replace <state> with some plain-text string that you can use to verify the integrity of the authorization code.
Replace <code-challenge> with the code challenge that you generated in Step 1.
https://accounts.gcp.databricks.com/oidc/accounts/<account-id>/v1/authorize
?client_id=databricks-cli
&redirect_uri=<redirect-url>
&response_type=code
&state=<state>
&code_challenge=<code-challenge>
&code_challenge_method=S256
&scope=all-apis+offline_access
When prompted, follow the on-screen directions to log in to your Databricks account.
In your web browser’s address bar, copy the authorization code. The authorization code is the full string of characters between code= and the & character in the URL. For example, the authorization code in the following URL is dcod...7fe6:
http://localhost:8020/?code=dcod...7fe6&state=<state>
You should verify the integrity of this authorization code by visually confirming that the <state> value in this response URL matches the state value that you provided in your request URL. If the values are different, you should not use this authorization code, as it could be compromised.
Skip ahead to Generate an account-level access token.
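To avoid assembling the account-level authorization URL by hand, you might build it programmatically. The sketch below uses only the query parameters shown in this procedure; the function name account_authorize_url is hypothetical. Note that urlencode percent-encodes the redirect URL and writes the space in the scope value as +.

```python
from urllib.parse import urlencode

ACCOUNT_CONSOLE_URL = "https://accounts.gcp.databricks.com"


def account_authorize_url(account_id, redirect_url, state, code_challenge):
    """Build the account-level authorization URL as a single line."""
    query = urlencode({
        "client_id": "databricks-cli",
        "redirect_uri": redirect_url,
        "response_type": "code",
        "state": state,
        "code_challenge": code_challenge,
        "code_challenge_method": "S256",
        "scope": "all-apis offline_access",
    })
    return f"{ACCOUNT_CONSOLE_URL}/oidc/accounts/{account_id}/v1/authorize?{query}"
```

Paste the returned string into your web browser’s address bar. Because it contains no line breaks, it satisfies the requirement stated above.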
Generate a workspace-level authorization code
In your web browser’s address bar, browse to the following URL. Line breaks have been added for readability. Your URL must not contain these line breaks.
In the following URL, replace the following:
Replace <databricks-instance> with the Databricks workspace instance name, for example 1234567890123456.7.gcp.databricks.com.
Replace <redirect-url> with a redirect URL to your local machine, for example http://localhost:8020.
Replace <state> with some plain-text string that you can use to verify the integrity of the authorization code.
Replace <code-challenge> with the code challenge that you generated in Step 1.
https://<databricks-instance>/oidc/v1/authorize
?client_id=databricks-cli
&redirect_uri=<redirect-url>
&response_type=code
&state=<state>
&code_challenge=<code-challenge>
&code_challenge_method=S256
&scope=all-apis+offline_access
When prompted, follow the on-screen directions to log in to your Databricks workspace.
In your web browser’s address bar, copy the authorization code. The authorization code is the full string of characters between code= and the & character in the URL. For example, the authorization code in the following URL is dcod...7fe6:
http://localhost:8020/?code=dcod...7fe6&state=<state>
You should verify the integrity of this authorization code by visually confirming that the <state> value in this response URL matches the state value that you provided in your request URL. If the values are different, you should not use this authorization code, as it could be compromised.
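The copy-and-compare step can also be done in code for either the account-level or the workspace-level redirect. The following sketch parses the redirected URL, checks the state value, and returns the authorization code; the function name extract_authorization_code is hypothetical.

```python
from urllib.parse import parse_qs, urlparse


def extract_authorization_code(redirected_url, expected_state):
    """Return the code parameter after checking that state matches."""
    params = parse_qs(urlparse(redirected_url).query)
    returned_state = params.get("state", [""])[0]
    if returned_state != expected_state:
        # A mismatch means the code could be compromised, so reject it.
        raise ValueError("state mismatch; do not use this authorization code")
    codes = params.get("code")
    if not codes:
        raise ValueError("no code parameter found in the redirected URL")
    return codes[0]
```

Pass the full URL from your browser’s address bar together with the state string that you sent in the request.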
Step 3: Use the authorization code to generate an OAuth access token
You use the OAuth authorization code from the previous step to generate a Databricks OAuth access token, as follows:
To call account-level and workspace-level REST APIs within accounts and workspaces that your Databricks user account has access to, use the account-level authorization code to generate an access token at the account level.
To call REST APIs within only one workspace that your user account has access to, you can use the workspace-level authorization code to generate an access token at the workspace level for only that workspace.
Generate an account-level access token
Use a client such as curl along with the account-level authorization code to generate the account-level OAuth access token. In the following curl call, replace the following placeholders:
Replace <account-id> with the Account ID from Step 2.
Replace <redirect-url> with the redirect URL from Step 2.
Replace <code-verifier> with the code verifier that you generated in Step 1.
Replace <authorization-code> with the account-level authorization code that you generated in Step 2.
curl --request POST \
https://accounts.gcp.databricks.com/oidc/accounts/<account-id>/v1/token \
--data "client_id=databricks-cli" \
--data "grant_type=authorization_code" \
--data "scope=all-apis offline_access" \
--data "redirect_uri=<redirect-url>" \
--data "code_verifier=<code-verifier>" \
--data "code=<authorization-code>"
In the response, copy the account-level OAuth access token. The access token is the full string of characters in the access_token object. For example, the access token in the following response is eyJr...Dkag:
{
  "access_token": "eyJr...Dkag",
  "refresh_token": "doau...f26e",
  "scope": "all-apis offline_access",
  "token_type": "Bearer",
  "expires_in": 3600
}
This access token expires in one hour. To generate a new access token, repeat this procedure from Step 1.
Skip ahead to Step 4: Call a Databricks REST API.
Generate a workspace-level access token
Use a client such as curl along with the workspace-level authorization code to generate the workspace-level OAuth access token. In the following curl call, replace the following placeholders:
Replace <databricks-instance> with the Databricks workspace instance name, for example 1234567890123456.7.gcp.databricks.com.
Replace <redirect-url> with the redirect URL from Step 2.
Replace <code-verifier> with the code verifier that you generated in Step 1.
Replace <authorization-code> with the workspace-level authorization code that you generated in Step 2.
curl --request POST \
https://<databricks-instance>/oidc/v1/token \
--data "client_id=databricks-cli" \
--data "grant_type=authorization_code" \
--data "scope=all-apis offline_access" \
--data "redirect_uri=<redirect-url>" \
--data "code_verifier=<code-verifier>" \
--data "code=<authorization-code>"
In the response, copy the workspace-level OAuth access token. The access token is the full string of characters in the access_token object. For example, the access token in the following response is eyJr...Dkag:
{
  "access_token": "eyJr...Dkag",
  "refresh_token": "doau...f26e",
  "scope": "all-apis offline_access",
  "token_type": "Bearer",
  "expires_in": 3600
}
This access token expires in one hour. To generate a new access token, repeat this procedure from Step 1.
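The token exchange in this step can be sketched in Python with the standard library instead of curl. The helper names below are hypothetical. The request is only built here, not sent, and the parser assumes a response with the access_token and expires_in fields shown in the sample responses above.

```python
import json
import time
from urllib.parse import urlencode
from urllib.request import Request


def build_token_request(token_endpoint, redirect_url, code_verifier, authorization_code):
    """Build the POST request with the same form fields as the curl call."""
    body = urlencode({
        "client_id": "databricks-cli",
        "grant_type": "authorization_code",
        "scope": "all-apis offline_access",
        "redirect_uri": redirect_url,
        "code_verifier": code_verifier,
        "code": authorization_code,
    }).encode("ascii")
    headers = {"Content-Type": "application/x-www-form-urlencoded"}
    return Request(token_endpoint, data=body, headers=headers, method="POST")


def parse_token_response(raw_json, now=None):
    """Return (access_token, expiry_timestamp) from the token response."""
    payload = json.loads(raw_json)
    issued_at = time.time() if now is None else now
    return payload["access_token"], issued_at + payload["expires_in"]
```

To send the request, pass it to urllib.request.urlopen and hand the response body to parse_token_response. For token_endpoint, use the account-level or workspace-level token URL from the curl calls above. The returned timestamp tells you when the one-hour lifetime ends.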
Step 4: Call a Databricks REST API
You use the account-level or workspace-level OAuth access token to authenticate to Databricks account-level REST APIs and workspace-level REST APIs, depending on the access token’s scope. Your Databricks user account must be an account admin to call account-level REST APIs.
Example account-level REST API request
This example uses curl along with Bearer authentication to get a list of all workspaces associated with an account.
Replace <oauth-access-token> with the account-level OAuth access token.
Replace <account-id> with your account ID.
export OAUTH_TOKEN=<oauth-access-token>
curl --request GET --header "Authorization: Bearer $OAUTH_TOKEN" \
"https://accounts.gcp.databricks.com/api/2.0/accounts/<account-id>/workspaces"
Example workspace-level REST API request
This example uses curl along with Bearer authentication to list all available clusters in the specified workspace.
Replace <oauth-access-token> with the account-level or workspace-level OAuth access token.
Replace <databricks-instance> with the Databricks workspace instance name, for example 1234567890123456.7.gcp.databricks.com.
export OAUTH_TOKEN=<oauth-access-token>
curl --request GET --header "Authorization: Bearer $OAUTH_TOKEN" \
"https://<databricks-instance>/api/2.0/clusters/list"
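The same Bearer-authenticated requests can be prepared in Python. This is a minimal sketch using the standard library; the helper names are hypothetical, and the endpoints are the two shown in the curl examples above.

```python
from urllib.request import Request

ACCOUNT_CONSOLE_URL = "https://accounts.gcp.databricks.com"


def bearer_get(url, oauth_token):
    """Build a GET request that carries the OAuth access token."""
    headers = {"Authorization": f"Bearer {oauth_token}"}
    return Request(url, headers=headers, method="GET")


def list_workspaces_request(account_id, oauth_token):
    """Account-level example: list the workspaces in an account."""
    url = f"{ACCOUNT_CONSOLE_URL}/api/2.0/accounts/{account_id}/workspaces"
    return bearer_get(url, oauth_token)


def list_clusters_request(databricks_instance, oauth_token):
    """Workspace-level example: list the clusters in a workspace."""
    url = f"https://{databricks_instance}/api/2.0/clusters/list"
    return bearer_get(url, oauth_token)
```

Pass either request to urllib.request.urlopen to send it. Reading the token from an environment variable such as OAUTH_TOKEN, as the curl examples do, keeps it out of your source code.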