Authentication for Databricks automation
In Databricks, authentication refers to verifying a Databricks identity (such as a user, service principal, or group). Databricks uses credentials (such as an access token or a username and password) to verify the identity.
After Databricks verifies the caller’s identity, Databricks then uses a process called authorization to determine whether the verified identity has sufficient access permissions to perform the specified action on the resource at the given location. This article includes details only about authentication. It does not include details about authorization or access permissions; see Access control.
When a tool makes an automation or API request, it includes credentials that authenticate an identity with Databricks. This article describes typical ways to create, store, and pass credentials and related information that Databricks needs to authenticate and authorize requests. To learn which credential types, related information, and storage mechanisms are supported by your tools, scripts, and apps, see your provider’s documentation.
Databricks personal access tokens
Databricks personal access tokens are one of the most well-supported types of credentials for resources and operations at the Databricks workspace level. Many storage mechanisms for credentials and related information, such as environment variables and configuration profiles, provide support for Databricks personal access tokens. Although a Databricks workspace can have multiple personal access tokens, each personal access token works for only a single Databricks workspace.
Note
Databricks supports Google ID tokens in addition to Databricks personal access tokens. To learn whether Google ID tokens are supported by your tools, scripts, and apps, see your provider’s documentation.
You use Databricks personal access tokens or workspace-level Google ID tokens as credentials when automating Databricks workspace-level functionality. You cannot use either of these token types to automate Databricks account-level functionality. Instead, you use the account-level Google ID tokens of Databricks account-level admins, which are account-level Google service accounts acting as account-level admin users. For more information, see Authentication with Google ID tokens and the Account API 2.0.
To create a Databricks personal access token for a Databricks user, do the following:

1. In your Databricks workspace, click your Databricks username in the top bar, and then select User Settings from the drop-down menu.
2. On the Access tokens tab, click Generate new token.
3. (Optional) Enter a comment that helps you identify this token in the future, and change the token’s default lifetime of 90 days. To create a token with no lifetime (not recommended), leave the Lifetime (days) box empty (blank).
4. Click Generate.
5. Copy the displayed token, and then click Done.
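A token created this way is typically passed as an HTTP Bearer credential on REST API calls. The following is a minimal sketch using only the Python standard library; the workspace URL and token values are placeholders, and the Clusters API endpoint is just one example of a workspace-level API:

```python
import urllib.request


def api_request(workspace_url: str, token: str, endpoint: str) -> urllib.request.Request:
    """Build a Databricks REST API request authenticated with a personal access token.

    The token is sent as an HTTP Bearer credential in the Authorization header.
    """
    return urllib.request.Request(
        url=f"{workspace_url}{endpoint}",
        headers={"Authorization": f"Bearer {token}"},
    )


# Placeholder values; pass the request to urllib.request.urlopen to actually send it.
req = api_request(
    "https://1234567890123456.7.gcp.databricks.com",
    "dapi12345678901234567890123456789012",
    "/api/2.0/clusters/list",
)
```

In practice, avoid hard-coding the token as shown here; read it from an environment variable or a configuration profile as described later in this article.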
Important
Be sure to save the copied token in a secure location. If you lose it, you cannot regenerate that exact token; you must repeat this procedure to create a new one. If you do lose a copied token, Databricks recommends that you immediately delete that token from your workspace by clicking the X next to it on the Access tokens tab.
Personal access tokens for service principals
To create a Databricks personal access token for a Databricks service principal instead of a Databricks user, see Manage access tokens for a service principal.
Managing personal access tokens
For information about enabling and disabling all Databricks personal access tokens for a workspace, controlling who can use tokens in a workspace, setting a maximum lifetime for tokens in a workspace, and other token management operations for a workspace, see Manage personal access tokens.
Environment variables
Databricks-supported products, and a few third-party products that work with Databricks, support some of the following environment variables. To learn which of these environment variables are supported by your tools, scripts, and apps, see your provider’s documentation. To create, change, and delete environment variables, see your operating system’s documentation.
| Environment variable | Description |
|---|---|
| `DATABRICKS_ACCOUNT_ID` | The ID of a Databricks account. Applies only to the Databricks Terraform provider. |
| `DATABRICKS_ADDRESS` | The URL of a Databricks workspace. For operations at the Databricks account level, the URL of the Databricks account console. Example: `https://1234567890123456.7.gcp.databricks.com`. Applies to Databricks Connect only. |
| `DATABRICKS_API_TOKEN` | The value of a Databricks personal access token. Applies to Databricks Connect only. |
| `DATABRICKS_CLUSTER_ID` | The ID of a Databricks cluster. Applies to Databricks Connect only. |
| `DATABRICKS_CONFIG_FILE` | The full path to a Databricks configuration profiles file. Default: `~/.databrickscfg`. |
| `DATABRICKS_CONFIG_PROFILE` | The name of a Databricks configuration profile. Default: `DEFAULT`. |
| `DATABRICKS_DEBUG_HEADERS` | Whether debug HTTP headers of requests made by the provider are output. Default: `false`. Applies to the Databricks Terraform provider only. |
| `DATABRICKS_DEBUG_TRUNCATE_BYTES` | Truncate the length of JSON fields in HTTP requests and responses above this limit. Default: `96`. Applies to the Databricks Terraform provider only. |
| `DATABRICKS_DSN` | The data source name (DSN) connection string to a Databricks compute resource. Applies to the Databricks SQL Driver for Go only. |
| `DATABRICKS_HOST` | The URL of a Databricks workspace. For operations at the Databricks account level, the URL of the Databricks account console. Example: `https://1234567890123456.7.gcp.databricks.com`. |
| `DATABRICKS_ORG_ID` | The organization ID of a Databricks workspace. Applies to Databricks Connect only. |
| `DATABRICKS_PASSWORD` | The password of a Databricks workspace user. |
| `DATABRICKS_PORT` | The port number to communicate with a Databricks cluster. Applies to Databricks Connect only. |
| `DATABRICKS_RATE_LIMIT` | The maximum number of requests per second. Default: `15`. Applies to the Databricks Terraform provider only. |
| `DATABRICKS_TOKEN` | The value of a Databricks personal access token. |
| `DATABRICKS_USERNAME` | The username of a Databricks workspace user. |
| `DBSQLCLI_ACCESS_TOKEN` | The value of a Databricks personal access token. Applies to the Databricks SQL CLI only. |
| `DBSQLCLI_HOST_NAME` | The value of the Server hostname field for a Databricks SQL warehouse, for example `1234567890123456.7.gcp.databricks.com`. Applies to the Databricks SQL CLI only. |
| `DBSQLCLI_HTTP_PATH` | The value of the HTTP path field for a Databricks SQL warehouse. Applies to the Databricks SQL CLI only. |
|  | The value of a Databricks personal access token. Applies to the Apache Airflow integration with Databricks only. |
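Scripts typically read these variables at run time rather than hard-coding credentials. The following is a minimal sketch in Python, assuming the two most widely supported variables, `DATABRICKS_HOST` and `DATABRICKS_TOKEN`, have been set in your shell or CI environment:

```python
import os


def credentials_from_env() -> tuple[str, str]:
    """Read the workspace URL and personal access token from the environment.

    Raises KeyError with a pointer to the missing variable if either is unset.
    """
    try:
        return os.environ["DATABRICKS_HOST"], os.environ["DATABRICKS_TOKEN"]
    except KeyError as exc:
        raise KeyError(f"Set the {exc.args[0]} environment variable first") from exc
```

Reading credentials this way keeps tokens out of source code and lets the same script run against different workspaces by changing only the environment.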
Configuration profiles
A Databricks configuration profile contains settings and other information that Databricks needs to authenticate. Databricks configuration profiles are stored in Databricks configuration profiles files for your tools, scripts, and apps to use. To learn whether Databricks configuration profiles are supported by your tools, scripts, and apps, see your provider’s documentation.
Use your favorite text editor to create a file named `.databrickscfg` in your `~` (your user home) folder on Unix, Linux, or macOS, or your `%USERPROFILE%` (your user home) folder on Windows. Do not forget the dot (`.`) at the beginning of the file name. Add the following contents to this file:

```
[<DEFAULT>]
host = <your-workspace-url>
token = <your-personal-access-token>
```
In the preceding contents, replace the following values, and then save the file:

- `<DEFAULT>` with a unique name for the configuration profile, such as `DEFAULT`, `DEV`, `PROD`, or similar.
- `<your-workspace-url>` with your workspace instance URL, for example `https://1234567890123456.7.gcp.databricks.com`.
- `<your-personal-access-token>` with your Databricks personal access token.
For example, the `.databrickscfg` file might look like this:

```
[DEFAULT]
host = https://1234567890123456.7.gcp.databricks.com
token = dapi12345678901234567890123456789012
```
Tip
You can create additional configuration profiles by specifying different profile names within the same `.databrickscfg` file, for example:

```
[DEFAULT]
host = https://1234567890123456.7.gcp.databricks.com
token = dapi12345678901234567890123456789012

[DEV]
host = https://2345678901234567.8.gcp.databricks.com
token = dapi23456789012345678901234567890123
```
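Tools that support configuration profiles parse this file themselves, but the format is plain INI, so you can also read it in your own scripts. The following is a sketch using Python’s built-in `configparser`; note that `configparser` treats `[DEFAULT]` as a special defaults section whose keys are inherited by every other profile unless overridden:

```python
import configparser


def read_profile(path: str, profile: str = "DEFAULT") -> dict:
    """Return the settings (e.g., host and token) for one profile in a .databrickscfg file."""
    parser = configparser.ConfigParser()
    parser.read(path)
    # Named profiles such as [DEV] inherit any keys from [DEFAULT] that they
    # do not set themselves; values the profile sets take precedence.
    return dict(parser[profile])
```

For example, `read_profile("~/.databrickscfg", "DEV")["host"]` (after expanding `~`) would return the `DEV` profile’s workspace URL.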
ODBC DSNs
In ODBC, a data source name (DSN) is a symbolic name that tools, scripts, and apps use to request a connection to an ODBC data source. A DSN stores connection details such as the path to an ODBC driver, networking details, authentication credentials, and database details. To learn whether ODBC DSNs are supported by your tools, scripts, and apps, see your provider’s documentation.
To install and configure the Databricks ODBC Driver and create an ODBC DSN for Databricks, see ODBC driver.
JDBC connection URLs
In JDBC, a connection URL is a symbolic URL that tools, scripts, and apps use to request a connection to a JDBC data source. A connection URL stores connection details such as networking details, authentication credentials, database details, and JDBC driver capabilities. To learn whether JDBC connection URLs are supported by your tools, scripts, and apps, see your provider’s documentation.
To install and configure the Databricks JDBC Driver and create a JDBC connection URL for Databricks, see JDBC driver.
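As a rough illustration of how credentials travel inside a connection URL, a personal access token is commonly embedded via the driver’s username/password mechanism (`AuthMech=3`), where the username is the literal word `token` and the password is the token value. The exact parameter names and defaults depend on your JDBC driver version, so treat this as a sketch and confirm against the JDBC driver documentation:

```python
def jdbc_url(hostname: str, http_path: str, token: str) -> str:
    """Assemble a Databricks JDBC connection URL that authenticates with a
    personal access token (AuthMech=3: username is 'token', password is the PAT)."""
    return (
        f"jdbc:databricks://{hostname}:443/default;"
        "transportMode=http;ssl=1;"
        f"httpPath={http_path};"
        f"AuthMech=3;UID=token;PWD={token}"
    )
```

Because the token appears in plain text in the URL, build the URL at run time from a securely stored token rather than committing it to source control.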
Google Cloud CLI
The Google Cloud CLI enables you to authenticate with Databricks on Google Cloud through your terminal for Linux or macOS, or through PowerShell or your Command Prompt for Windows. To learn whether the Google Cloud CLI is supported by your tools, scripts, and apps, see your provider’s documentation.
To use the Google Cloud CLI to authenticate with Databricks on Google Cloud, run the `gcloud init` command, and then follow the on-screen prompts:

```
gcloud init
```
For more detailed authentication options, see Initializing the gcloud CLI.