Databricks CLI tutorial

Note

This information applies to Databricks CLI versions 0.205 and above, which are in Public Preview. To find your version of the Databricks CLI, run databricks -v.

The Databricks command-line interface (also known as the Databricks CLI) provides an easy-to-use interface to automate the Databricks platform from your terminal, command prompt, or automation scripts.

This article demonstrates how to use your local development machine to get started quickly with the Databricks CLI. For an overview, see What is the Databricks CLI?

The following hands-on tutorial assumes:

Complete the following steps:

  1. If it is not already installed, install the Databricks CLI as follows:

    On Linux or macOS, use Homebrew to install the Databricks CLI by running the following two commands:

    brew tap databricks/tap
    brew install databricks
    

    On Windows, you can use winget, Chocolatey, or Windows Subsystem for Linux (WSL) to install the Databricks CLI. If you cannot use winget, Chocolatey, or WSL, skip this procedure and use the Command Prompt or PowerShell to install the Databricks CLI from source instead.

    Note

    Installing the Databricks CLI with Chocolatey is Experimental.

    To use winget to install the Databricks CLI, run the following two commands, and then restart your Command Prompt:

    winget search databricks
    winget install Databricks.DatabricksCLI
    

    To use Chocolatey to install the Databricks CLI, run the following command:

    choco install databricks-cli
    

    To use WSL to install the Databricks CLI:

    1. Install curl and zip through WSL. For more information, see your operating system’s documentation.

    2. Use WSL to install the Databricks CLI by running the following command:

      curl -fsSL https://raw.githubusercontent.com/databricks/setup-cli/main/install.sh | sh
      
  2. Confirm that the Databricks CLI is installed by running the following command, which displays the current version of the installed Databricks CLI. This version should be 0.205.0 or above:

    databricks -v
    

    Note

    If you run databricks but get an error such as command not found: databricks, or if you run databricks -v and a version number of 0.18 or below is listed, this means that your machine cannot find the correct version of the Databricks CLI executable. To fix this, see Verify your CLI installation.
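
The version check in step 2 can also be scripted. The following is a minimal sketch, not part of the Databricks CLI itself: the function names extract_version, version_at_least, and check_databricks_cli are hypothetical, and the sketch assumes that databricks -v prints a version number in MAJOR.MINOR.PATCH form somewhere in its output.

```shell
# Print the first MAJOR.MINOR.PATCH version number found in the text passed as $1.
extract_version() {
  printf '%s\n' "$1" | grep -oE '[0-9]+\.[0-9]+\.[0-9]+' | head -n 1
}

# Succeed if version $1 is greater than or equal to version $2.
# Fields are compared numerically, so 0.205.0 ranks above 0.18.0.
version_at_least() {
  awk -v have="$1" -v want="$2" 'BEGIN {
    split(have, h, "."); split(want, w, ".")
    for (i = 1; i <= 3; i++) {
      if (h[i] + 0 > w[i] + 0) exit 0
      if (h[i] + 0 < w[i] + 0) exit 1
    }
    exit 0
  }'
}

# Check the installed CLI against the 0.205.0 minimum that this tutorial requires.
check_databricks_cli() {
  if ! command -v databricks >/dev/null 2>&1; then
    echo "databricks was not found on PATH" >&2
    return 1
  fi
  installed=$(extract_version "$(databricks -v 2>&1)")
  if [ -z "$installed" ]; then
    echo "could not read a version number from databricks -v" >&2
    return 1
  fi
  if version_at_least "$installed" 0.205.0; then
    echo "Databricks CLI $installed meets the 0.205.0 minimum"
  else
    echo "Databricks CLI $installed is older than 0.205.0" >&2
    return 1
  fi
}
```

Running check_databricks_cli after installation reports one of the two failure cases that the preceding note describes (executable not found, or a version that is too old) without requiring you to read the version string by eye.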

After you install the Databricks CLI, complete the following steps:

Note

This tutorial assumes that you want to use OAuth user-to-machine (U2M) authentication to authenticate the CLI using your Databricks user account. To configure the CLI to use other Databricks authentication types, see Authentication for the Databricks CLI.

  1. Use the Databricks CLI to initiate OAuth token management locally by running the following command for each target account or workspace.

    For account-level operations, in the following command, replace <account-console-url> with your Databricks account console URL and <account-id> with your Databricks account ID.

    databricks auth login --host <account-console-url> --account-id <account-id>
    

    For workspace-level operations, in the following command, replace <workspace-url> with your Databricks workspace instance URL, for example https://1234567890123456.7.gcp.databricks.com.

    databricks auth login --host <workspace-url>
    
  2. The Databricks CLI prompts you to save the information that you entered as a Databricks configuration profile. Press Enter to accept the suggested profile name, or enter the name of a new or existing profile. Any existing profile with the same name is overwritten with the information that you entered. You can use profiles to quickly switch your authentication context among multiple accounts or workspaces.

    To get a list of any existing profiles, in a separate terminal or command prompt, use the Databricks CLI to run the command databricks auth profiles. To view a specific profile’s existing settings, run the command databricks auth env --profile <profile-name>.

  3. In your web browser, complete the on-screen instructions to log in to your Databricks account or workspace.

  4. To view a profile’s current OAuth token value and the token’s upcoming expiration timestamp, run one of the following commands:

    For account-level operations, run the following commands:

    • databricks auth token -p <profile-name>

    • databricks auth token --host <account-console-url> --account-id <account-id>

    • databricks auth token --host <account-console-url> --account-id <account-id> -p <profile-name>

    If you have multiple profiles with the same --host and --account-id values, you might need to specify the --host, --account-id, and -p options together to help the Databricks CLI find the correct matching OAuth token information.

    For workspace-level operations, run the following commands:

    • databricks auth token -p <profile-name>

    • databricks auth token --host <workspace-url>

    • databricks auth token --host <workspace-url> -p <profile-name>

    If you have multiple profiles with the same --host values, you might need to specify the --host and -p options together to help the Databricks CLI find the correct matching OAuth token information.
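
To read only the expiration timestamp from the token output in step 4, a small filter can help. This is a sketch under a stated assumption: it assumes that databricks auth token prints a JSON object that contains a string field named "expiry". The function names token_expiry and show_token_expiry are hypothetical. If the output on your CLI version differs, adjust the field name accordingly.

```shell
# Read JSON on stdin and print the value of its "expiry" string field, if present.
# Assumption: the field and its value appear on a single line of the input.
token_expiry() {
  sed -n 's/.*"expiry"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' | head -n 1
}

# Print the token expiration timestamp for the profile named in $1.
show_token_expiry() {
  databricks auth token -p "$1" | token_expiry
}
```

For example, show_token_expiry <profile-name> prints only the timestamp, which is convenient in scripts that need to decide whether to run databricks auth login again. The filter never prints the token value itself.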

Next steps

After you set up the Databricks CLI:

Run the Databricks CLI on a cluster

If you want to install, configure, and run the Databricks CLI on a Databricks cluster instead of on your local machine, the fastest way to do this is with a Databricks personal access token, as follows:

Create a Databricks personal access token for your Databricks user account for the target Databricks workspace as follows:

  1. In your Databricks workspace, click your Databricks username in the top bar, and then select User Settings from the drop-down menu.

  2. Click Developer.

  3. Next to Access tokens, click Manage.

  4. Click Generate new token.

  5. (Optional) Enter a comment that helps you to identify this token in the future, and change the token’s default lifetime of 90 days. To create a token with no lifetime (not recommended), leave the Lifetime (days) box empty (blank).

  6. Click Generate.

  7. Copy the displayed token to a secure location, and then click Done.

Note

Save the copied token in a secure location, and do not share it with others. If you lose the copied token, you cannot regenerate that exact token; you must repeat this procedure to create a new one. If you lose the token, or you believe that it has been compromised, Databricks strongly recommends that you immediately delete it from your workspace by clicking the trash can (Revoke) icon next to the token on the Access tokens page.

If you are not able to create or use tokens in your workspace, this might be because your workspace administrator has disabled tokens or has not given you permission to create or use tokens. For help, contact your workspace administrator.

After you create the personal access token, do the following:

  1. In the Databricks workspace user interface, on the sidebar, click Compute.

  2. Click the name of the existing cluster that you want to install the Databricks CLI on.

  3. Click Start, if the cluster is not already running.

  4. After the cluster is running, on the Apps tab, click Web Terminal. A Bash-style terminal appears, and curl is already installed.

  5. Run the following command to use curl to install the CLI:

    curl -fsSL https://raw.githubusercontent.com/databricks/setup-cli/main/install.sh | sh
    

    Note

    To view the script’s contents before you run it, see the install.sh file in the Databricks CLI Setup repository in GitHub.

  6. Run the CLI’s configure command to configure authentication between the CLI and your workspace:

    databricks configure
    
  7. At the first prompt, Databricks Host: https://, enter your workspace URL and press Enter.

  8. At the second prompt, Personal Access Token, enter your personal access token value and press Enter.

Note that whenever you click Start to start the cluster, you must then reinstall and reconfigure the CLI on the cluster. This is because a new virtual machine is provisioned each time you click Start, and the new virtual machine does not include the CLI by default.
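
Because the CLI must be reinstalled and reconfigured after each cluster start, it can be convenient to script both steps. The following is a hedged sketch, not an official procedure: instead of answering the databricks configure prompts, it writes a DEFAULT configuration profile directly to the .databrickscfg file in your home directory. The function names write_databrickscfg and reinstall_and_configure, and the variable names WORKSPACE_URL and PAT, are hypothetical. Note that write_databrickscfg overwrites the target file, so any other profiles stored there are lost.

```shell
# Write a DEFAULT configuration profile with the given host ($1) and token ($2)
# to the file named in $3 (default: ~/.databrickscfg). Overwrites that file.
write_databrickscfg() {
  cfg=${3:-"$HOME/.databrickscfg"}
  # Create the file with owner-only permissions, because it holds a secret.
  ( umask 077 && printf '[DEFAULT]\nhost = %s\ntoken = %s\n' "$1" "$2" > "$cfg" ) || return 1
  chmod 600 "$cfg"
}

# Reinstall the CLI with the install command from this tutorial, then restore
# the profile. Set WORKSPACE_URL and PAT yourself before you call this function.
reinstall_and_configure() {
  curl -fsSL https://raw.githubusercontent.com/databricks/setup-cli/main/install.sh | sh || return 1
  write_databrickscfg "$WORKSPACE_URL" "$PAT"
}
```

In the web terminal, you would set WORKSPACE_URL and PAT for the current session and then run reinstall_and_configure. Avoid saving the token value in a script file on the cluster, for the same reasons given in the note about token security.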