Set up and use Google Cloud ID authentication
Follow this article’s steps to authenticate Google Cloud service accounts to automate your Databricks accounts and workspaces.
Google Cloud service accounts are a special kind of Google Cloud account typically used by an application, rather than a person. A service account is identified by its email address, which is unique to the account. See Service accounts overview.
Note
Google Cloud service accounts are different from Databricks service principals. Whether you use a Google Cloud service account or a Databricks service principal might depend on your organization’s security preferences or policies. To learn how to use Databricks service principals for Databricks authentication instead of Google Cloud service accounts, see Manage service principals.
Databricks provides two approaches to authenticating Google Cloud service accounts with Databricks:
Google Cloud ID authentication, which uses a Google Cloud service account’s email address for authentication. This article describes this approach. See also Google Cloud ID authentication.
Google Cloud credentials authentication, which uses Google-managed key pairs for authentication. For more information, see Set up and use Google Cloud credentials authentication. See also Service account credentials and Google Cloud credentials authentication.
This article demonstrates how to set up and use Google Cloud ID authentication as follows:
Create a Google Cloud service account.
Assign your Google Cloud service account to your Databricks account and to a Databricks workspace in that account.
Install the Google Cloud command-line interface (Google Cloud CLI) and then authorize the Google Cloud CLI to use your login to impersonate the Google Cloud service account.
Install the Databricks CLI on your local development machine and then configure the Databricks CLI for Google Cloud ID authentication.
Run commands with the Databricks CLI to automate your Databricks account, your Databricks workspace, or both by using Google Cloud ID authentication.
Requirements
To create a Google Cloud service account, you must have the Create Service Accounts IAM role for your Google project. See Required roles.
To assign a Google Cloud service account to your Databricks account, you must be an admin of that account. See Assign account admin roles to a user.
To assign a Google Cloud service account to your Databricks workspace, you must be an admin of that workspace. See Assign the workspace admin role to a user using the workspace admin settings page.
Step 1: Create a Google Cloud service account
In this step, you create a Google Cloud service account for your target Google project in the Google Cloud console.
Sign in to the Google Cloud console.
If you have access to multiple projects, switch to the target project. To do this, in the top navigation bar, next to the Google Cloud logo, click the project selector. Then select the project’s name in the list.
In Search (/) for resources, docs, products, and more, search for and select Service Accounts.
Click + Create Service Account.
In the Service account details section, for Service account name, enter a unique name for the service account that’s easy for you to remember.
Make a note of the Email address below the Service account ID box, as you will need it in Steps 2, 3, 5, and 7. It will look something like the following:
<your-service-account-name>@<your-project-name>.iam.gserviceaccount.com
Optionally, for Service account description, enter a meaningful description of the service account.
Click Create and continue.
Click Done.
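The email address format shown in this step can be checked before you paste it into later steps. The following is a minimal sketch, not an official validator: the function names are hypothetical, and the naming rules in the pattern (a 6 to 30 character service account ID of lowercase letters, digits, and hyphens that starts with a letter) are assumptions based on Google Cloud's documented service account ID rules. The project part of the pattern is deliberately loose.

```python
import re

# Assumption: the service account ID is 6-30 characters, starts with a lowercase
# letter, ends with a letter or digit, and contains only lowercase letters,
# digits, and hyphens. The project part is matched loosely on purpose.
_SERVICE_ACCOUNT_EMAIL = re.compile(
    r"^[a-z][a-z0-9-]{4,28}[a-z0-9]@[a-z0-9][a-z0-9.-]*\.iam\.gserviceaccount\.com$"
)


def is_service_account_email(email: str) -> bool:
    """Return True if the value looks like a Google Cloud service account email."""
    return _SERVICE_ACCOUNT_EMAIL.match(email) is not None


def service_account_email(name: str, project: str) -> str:
    """Build the email address from a service account name and a project identifier."""
    email = f"{name}@{project}.iam.gserviceaccount.com"
    if not is_service_account_email(email):
        raise ValueError(f"Not a plausible service account email: {email}")
    return email


if __name__ == "__main__":
    # Hypothetical values for illustration only.
    print(service_account_email("databricks-automation", "my-project-123"))
```

A check like this only catches typos such as a personal email address pasted by mistake. It does not confirm that the service account actually exists.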
Step 2: Assign your Google Cloud service account to your Databricks account
In this step, you give your Google Cloud service account access to your Databricks account. If you do not want to give your service account access to your Databricks account, skip ahead to Step 3.
In your Databricks workspace, click your username in the top bar and click Manage account.
Alternatively, go directly to your Databricks account console, at https://accounts.gcp.databricks.com.
Sign in to your Databricks account, if prompted.
On the sidebar, click User management.
Click the Users tab.
Note
Although this tab is labeled Users, this tab works with service accounts as well. Databricks treats service accounts as users in your Databricks account.
Click Add user.
For Email, enter the Email address that you copied from Step 1 for your service account.
For First name and Last name, enter some meaningful text to help you search for the service account later. For example, for First name you could enter the Service account name from Step 1. For Last name, you could enter Google Cloud Service Account.
Click Add user. Databricks adds the service account as a user to your Databricks account.
Assign any account-level permissions that you want the user to have:
On the Users tab, click the name of the user. If the username is not visible, use Filter users to find it.
On the Roles tab, toggle to enable or disable each target role that you want this user to have. See Assign account admin roles to a user.
Step 3: Assign your Google Cloud service account to your Databricks workspace
In this step, you give your Google Cloud service account access to your Databricks workspace.
If your workspace is enabled for identity federation:
In your Databricks workspace, click your username in the top bar and click Settings.
Click Users.
Note
Although this tab is labeled Users, this tab works with service accounts as well. Databricks treats service accounts as users in your Databricks workspace.
Click Add user.
Select the user from Step 2 and click Add. The service account is added as a user in your Databricks workspace.
Assign any workspace-level permissions that you want the user to have:
On the Users tab, click the name of the user.
On the Entitlements tab, select or clear to grant or revoke each target status or entitlement that you want this user to have.
Skip ahead to Step 4.
If your workspace is not enabled for identity federation:
In your Databricks workspace, click your username in the top bar and click Settings.
Click Users.
Note
Although this tab is labeled Users, this tab works with service accounts as well. Databricks treats service accounts as users in your Databricks workspace.
Click Add new.
For New user email, enter the Email address that you copied from Step 1 for your service account.
Click Add. The service account is added as a user in your Databricks workspace.
Assign any workspace-level permissions that you want the user to have:
On the Users tab, click the name of the user.
On the Entitlements tab, select or clear to grant or revoke each target status or entitlement that you want this user to have.
Step 4: Install the Google Cloud CLI on your local development machine
Install the Google Cloud CLI by following the instructions in Install the gcloud CLI.
Step 5: Impersonate the Google Cloud service account
In this step, you use your Google Cloud login to automate Databricks through your Google Cloud service account by using a technique called impersonation. For more information, see Service account impersonation.
To impersonate the service account, you must give your Google Cloud user permissions to impersonate service accounts. You then initiate the impersonation through the Google Cloud CLI.
Give your Google Cloud user permissions to impersonate service accounts: in the Google Cloud console that you signed in to from Step 1, in Search (/) for resources, docs, products, and more, search for and select IAM.
On the Permissions tab, in the View By Principals tab, click Grant Access.
For New Principals, enter and select your Google Cloud username. (Do not enter your Google Cloud service account’s name here.)
Click Select a role, and enter and select Service Account Token Creator.
Click Add Another Role.
Click Select a role, and enter and select Service Account User.
Click Service Account Token Creator.
Click Save.
Initiate the impersonation: use the Google Cloud CLI to run the following command, replacing <your-service-account-name>@<your-project-name>.iam.gserviceaccount.com with the Email address that you copied from Step 1 for your service account.
gcloud auth login --impersonate-service-account=<your-service-account-name>@<your-project-name>.iam.gserviceaccount.com
In your web browser, sign in with your Google Cloud user account by following the on-screen sign-in instructions.
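If you prefer to script this step, the role grants and the login command can be assembled programmatically. The sketch below only builds and prints the command lines and does not run them. It assumes that the two console roles correspond to the role IDs roles/iam.serviceAccountTokenCreator and roles/iam.serviceAccountUser, and that granting them at the project level with gcloud projects add-iam-policy-binding is acceptable in your environment. The function names, project ID, and email addresses are hypothetical.

```python
import shlex

# Assumption: these role IDs correspond to the console roles
# "Service Account Token Creator" and "Service Account User".
TOKEN_CREATOR_ROLE = "roles/iam.serviceAccountTokenCreator"
SERVICE_ACCOUNT_USER_ROLE = "roles/iam.serviceAccountUser"


def grant_role_command(project_id: str, user_email: str, role: str) -> list[str]:
    """Build a gcloud command that grants a project-level role to a user."""
    return [
        "gcloud", "projects", "add-iam-policy-binding", project_id,
        f"--member=user:{user_email}",
        f"--role={role}",
    ]


def impersonation_login_command(service_account_email: str) -> list[str]:
    """Build the login command from this step for the given service account."""
    return [
        "gcloud", "auth", "login",
        f"--impersonate-service-account={service_account_email}",
    ]


if __name__ == "__main__":
    # Hypothetical values for illustration only.
    project = "my-project-123"
    user = "someone@example.com"
    sa_email = "databricks-automation@my-project-123.iam.gserviceaccount.com"
    for role in (TOKEN_CREATOR_ROLE, SERVICE_ACCOUNT_USER_ROLE):
        print(shlex.join(grant_role_command(project, user, role)))
    print(shlex.join(impersonation_login_command(sa_email)))
```

Printing the commands first lets you review them before running anything that changes IAM policy.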
Step 6: Install the Databricks CLI on your local development machine
In this step, you install the Databricks CLI so that you can use it to run commands that automate your Databricks accounts and workspaces.
Tip
You can also use the Databricks Terraform provider or the Databricks SDK for Go along with Google Cloud ID authentication to automate your Databricks accounts and workspaces by running HCL or Go code. See the Databricks SDK for Go and Google Cloud ID authentication.
If it is not already installed, install the Databricks CLI as follows:
On Linux or macOS, use Homebrew to install the Databricks CLI by running the following two commands:
brew tap databricks/tap
brew install databricks
On Windows, you can use winget, Chocolatey, or Windows Subsystem for Linux (WSL) to install the Databricks CLI. If you cannot use winget, Chocolatey, or WSL, skip this procedure and use the Command Prompt or PowerShell to install the Databricks CLI from source instead.
Note
Installing the Databricks CLI with Chocolatey is Experimental.
To use winget to install the Databricks CLI, run the following two commands, and then restart your Command Prompt:
winget search databricks
winget install Databricks.DatabricksCLI
To use Chocolatey to install the Databricks CLI, run the following command:
choco install databricks-cli
To use WSL to install the Databricks CLI:
Install curl and zip through WSL. For more information, see your operating system’s documentation.
Use WSL to install the Databricks CLI by running the following command:
curl -fsSL https://raw.githubusercontent.com/databricks/setup-cli/main/install.sh | sh
Confirm that the Databricks CLI is installed by running the following command, which displays the current version of the installed Databricks CLI. This version should be 0.205.0 or above:
databricks -v
Note
If you run databricks but get an error such as command not found: databricks, or if you run databricks -v and a version number of 0.18 or below is listed, this means that your machine cannot find the correct version of the Databricks CLI executable. To fix this, see Verify your CLI installation.
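The version requirement in this step can also be checked in a script. The following is a minimal sketch that assumes the output of databricks -v contains a dotted version number somewhere in the text. The exact output wording is an assumption, so the parser only looks for the first number of that shape, and the function names are hypothetical.

```python
import re

# Minimum version named in this step.
MINIMUM_VERSION = (0, 205, 0)


def parse_cli_version(output: str) -> tuple[int, int, int]:
    """Extract the first dotted version number from the version command output."""
    match = re.search(r"(\d+)\.(\d+)(?:\.(\d+))?", output)
    if match is None:
        raise ValueError(f"No version number found in: {output!r}")
    major, minor, patch = match.groups()
    # A missing patch component is treated as 0.
    return (int(major), int(minor), int(patch or 0))


def is_supported(output: str) -> bool:
    """Return True if the reported version is at least the minimum version."""
    return parse_cli_version(output) >= MINIMUM_VERSION


if __name__ == "__main__":
    # Sample strings for illustration; your CLI's exact wording may differ.
    for sample in ("Databricks CLI v0.205.0", "Version 0.18.0"):
        print(sample, "->", "ok" if is_supported(sample) else "too old")
```

Comparing tuples of integers avoids the mistake of comparing version strings alphabetically, where 0.18 would sort after 0.205.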
Step 7: Configure the Databricks CLI for Google Cloud ID authentication
In this step, you set up the Databricks CLI to use Google Cloud ID authentication for Databricks by using your Google Cloud service account’s email address. To do this, you create a file with the default filename and in the default location where the Databricks CLI expects to find the authentication settings that it needs.
With your favorite text editor, create a local file named .databrickscfg in your user’s home directory, if it does not already exist. For Linux and macOS, your user home directory is ~. For Windows, your user home directory is %USERPROFILE%.
Enter the following content into the .databrickscfg file. In this content, replace the following values:
Replace <account-console-url> with your Databricks account console URL, such as https://accounts.gcp.databricks.com.
Replace <account-id> with your Databricks account ID. See Locate your account ID.
Replace <google-cloud-service-account-email-address> with the Email address that you copied from Step 1 for your service account.
Replace <workspace-url> with your workspace instance URL, for example https://1234567890123456.7.gcp.databricks.com.
You can replace the suggested configuration profile names GCP_ID_ACCOUNT and GCP_ID_WORKSPACE with different configuration profile names if desired. These specific names are not required.
If you do not want to run account-level operations, you can omit the [GCP_ID_ACCOUNT] section in the following content.
[GCP_ID_ACCOUNT]
host = <account-console-url>
account_id = <account-id>
google_service_account = <google-cloud-service-account-email-address>

[GCP_ID_WORKSPACE]
host = <workspace-url>
google_service_account = <google-cloud-service-account-email-address>
Step 8: Run an account-level command with the Databricks CLI
In this step, you use the Databricks CLI and Google Cloud ID authentication to run a command that automates the Databricks account that was configured in Step 7. This step assumes that your Google Cloud user account is currently impersonating the service account as described previously in Step 5.
If you do not want to run account-level commands, skip ahead to Step 9.
With the terminal or command prompt still open from Step 6, run the following command to list all available users in your Databricks account. If you renamed GCP_ID_ACCOUNT in Step 7, be sure to replace it here.
databricks account users list -p GCP_ID_ACCOUNT
Step 9: Run a workspace-level command with the Databricks CLI
In this step, you use the Databricks CLI and Google Cloud ID authentication to run a command that automates the Databricks workspace that was configured in Step 7. This step assumes that your Google Cloud user account is currently impersonating the service account as described previously in Step 5.
With the terminal or command prompt still open from Step 6, run the following command to list all available users in your Databricks workspace. If you renamed GCP_ID_WORKSPACE in Step 7, be sure to replace it here.
databricks users list -p GCP_ID_WORKSPACE
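The account-level and workspace-level commands from Steps 8 and 9 differ only in the account subcommand and the profile name, so a script can build both from one helper. The following is a minimal sketch with hypothetical function names. It assumes that the Databricks CLI is on your PATH and that the profiles from Step 7 exist, and it runs the command only when you call run_command explicitly.

```python
import shutil
import subprocess


def users_list_command(profile: str, account_level: bool = False) -> list[str]:
    """Build the users list command for the given configuration profile."""
    command = ["databricks"]
    if account_level:
        # Account-level commands sit under the "account" subcommand.
        command.append("account")
    command += ["users", "list", "-p", profile]
    return command


def run_command(command: list[str]) -> str:
    """Run a command and return its standard output, failing early if it is missing."""
    if shutil.which(command[0]) is None:
        raise FileNotFoundError(f"{command[0]} was not found on PATH")
    result = subprocess.run(command, check=True, capture_output=True, text=True)
    return result.stdout


if __name__ == "__main__":
    # Print the commands only; call run_command(...) to execute them.
    print(" ".join(users_list_command("GCP_ID_ACCOUNT", account_level=True)))
    print(" ".join(users_list_command("GCP_ID_WORKSPACE")))
```

Passing the command as a list instead of a single shell string avoids quoting problems if a profile name contains unusual characters.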
Step 10: Clean up
This step is optional. If you no longer want to keep using the Google Cloud service account that you created for this article, this step describes how to delete the service account from your Google project and your Databricks account and workspace.
Delete the service account from your Google project
In the Google Cloud console that you signed in to from Step 1, in Search (/) for resources, docs, products, and more, search for and select Service Accounts.
In the row for your service account’s name, click the ellipses. If your service account’s name is not visible, use Enter property name or value to find it.
Click Delete.
In the confirmation dialog, click Delete.
Delete the service account from your Databricks account
In your Databricks account, on the sidebar, click User management.
Click the Users tab.
Click the name of the service account that you added in Step 2. If the service account’s name is not visible, use Filter users to find it.
Click the ellipses button, and then click Delete user.
Click Confirm delete.
Delete the service account from your Databricks workspace
In your Databricks workspace, click your username in the top bar and click Settings.
Click the Users tab.
Click the name of the service account that you added in Step 3. If the service account’s name is not visible, use Filter users to find it.
Click Remove user.
In the confirmation dialog, click Delete.