Configure access to cloud storage

This article describes how Databricks SQL administrators configure a new workspace for access to data objects.

Note

  • If you use Databricks managed tables, you do not need to configure access to cloud storage.

  • All Databricks SQL warehouses share the same cloud storage access credentials.

To configure data access for Databricks SQL, follow the steps in this section:

Requirements

  • Databricks account on the Premium plan

  • A Databricks SQL warehouse

  • Groups representing the users to whom you will grant access to data

Step 1: Create or reuse a service account for GCS buckets

Databricks recommends setting up a new service account with access to all GCS buckets that should be accessed from Databricks SQL.

A Databricks administrator performs one of the following steps in the Google Cloud console:

  • (Optional) Create a service account to access GCS buckets. If you want to reuse an existing service account, you can skip this step.

  • If you are reusing a service account, get the service account email address from the Google Cloud console (or from your Databricks cluster configuration).
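Before pasting the email address into Databricks SQL, a quick sanity check can catch copy-and-paste mistakes. The sketch below assumes only that Google service account email addresses end in a gserviceaccount.com subdomain; the function name and the sample addresses are hypothetical, and the check does not confirm that the account actually exists.

```python
def looks_like_service_account_email(email: str) -> bool:
    """Rough format check only; it does not query Google Cloud."""
    local, sep, domain = email.strip().partition("@")
    # Require a non-empty local part, exactly one "@", and a
    # domain that ends in ".gserviceaccount.com".
    return (
        bool(local)
        and sep == "@"
        and "@" not in domain
        and domain.endswith(".gserviceaccount.com")
    )


# Hypothetical addresses, used for illustration only.
print(looks_like_service_account_email("dbsql-access@my-project.iam.gserviceaccount.com"))
print(looks_like_service_account_email("someone@example.com"))
```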

Step 2: Give the service account access to GCS buckets

A Databricks administrator performs the following steps in the Google Cloud console.

  1. If you don’t have a bucket, create a bucket.

  2. Configure the bucket so that your service account has permission to access the data in the bucket. Repeat this step for each bucket you want to access from Databricks SQL.
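The permission change in step 2 amounts to adding the service account as a member of a role in the bucket's IAM policy. The following is a minimal sketch of that edit on a policy document in the standard IAM JSON shape (a "bindings" list of role and members entries). The helper name, the sample email address, and the choice of roles/storage.objectViewer (a read-only role) are assumptions for illustration; pick the role that matches the access your workloads need. The sketch ignores conditional bindings and does not call Google Cloud, so the resulting policy would still have to be applied through the console or another tool.

```python
import copy


def grant_bucket_role(policy: dict, sa_email: str, role: str) -> dict:
    """Return a copy of an IAM policy with the service account added to `role`."""
    member = f"serviceAccount:{sa_email}"
    updated = copy.deepcopy(policy)  # leave the caller's policy untouched
    bindings = updated.setdefault("bindings", [])
    for binding in bindings:
        if binding.get("role") == role:
            members = binding.setdefault("members", [])
            if member not in members:  # avoid duplicate entries
                members.append(member)
            return updated
    # No binding for this role yet, so create one.
    bindings.append({"role": role, "members": [member]})
    return updated


# Hypothetical values, used for illustration only.
current_policy = {"bindings": []}
new_policy = grant_bucket_role(
    current_policy,
    "dbsql-access@my-project.iam.gserviceaccount.com",
    "roles/storage.objectViewer",
)
print(new_policy)
```

Because the function returns a modified copy, it can be run once per bucket policy without side effects, which matches the instruction to repeat the step for each bucket.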

Step 3: Configure Databricks SQL to use the service account for data access

A Databricks administrator performs this step in the SQL admin console:

  1. In the sidebar, use the persona selector to select SQL.

  2. Click Settings at the bottom of the sidebar and select SQL Admin Console.

  3. Click the SQL Warehouse Settings tab.

  4. Add the Google Service Account email to configure data access.

  5. Click Save.

Step 4: (Optional) Set owner

A Databricks administrator performs this step in a notebook in a Data Science & Engineering workspace.

Administrators set owners using ALTER TABLE statements. The simplest option is to set the owner to a group of admins. Alternatively, to enable a delegated security model, you can assign a different owner to each database, giving each owner the ability to manage permissions on the objects in that database.
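As an illustration of the delegated model, the sketch below generates one ALTER TABLE ... OWNER TO statement per table, using a different owning group for each database. The database, table, and group names are hypothetical, and the helper functions are not part of any Databricks API; they only build SQL strings.

```python
def quote(name: str) -> str:
    """Backtick-quote an identifier, doubling any embedded backticks."""
    return "`" + name.replace("`", "``") + "`"


def owner_statements(tables_by_db: dict, owner_by_db: dict) -> list:
    """Build one ALTER TABLE ... OWNER TO statement per table."""
    statements = []
    for db, tables in tables_by_db.items():
        owner = quote(owner_by_db[db])  # each database has its own owner
        for table in tables:
            statements.append(
                f"ALTER TABLE {quote(db)}.{quote(table)} OWNER TO {owner}"
            )
    return statements


# Hypothetical databases, tables, and groups, used for illustration only.
tables = {"sales": ["orders", "customers"], "finance": ["ledger"]}
owners = {"sales": "sales-admins", "finance": "finance-admins"}

for stmt in owner_statements(tables, owners):
    print(stmt)
    # In a Databricks notebook, each statement could be run with spark.sql(stmt).
```

For the simpler model, map every database to the same admin group.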