Create a storage credential for connecting to Google Cloud Storage

This article describes how to create a storage credential in Unity Catalog to connect to Google Cloud Storage.

To manage access to the underlying cloud storage that holds tables and volumes, Unity Catalog uses the following object types:

  • Storage credentials encapsulate a long-term cloud credential that provides access to cloud storage.

  • External locations contain a reference to a storage credential and a cloud storage path.

For more information, see Connect to cloud object storage using Unity Catalog.
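
For example, once these objects exist, you can list them from a notebook or the SQL editor. A small sketch (both statements run against the current metastore):

    -- List the storage credentials defined in the metastore
    SHOW STORAGE CREDENTIALS;

    -- List the external locations that reference those credentials
    SHOW EXTERNAL LOCATIONS;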

Unity Catalog supports two cloud storage options for Databricks on Google Cloud: Google Cloud Storage (GCS) buckets and Cloudflare R2 buckets. Cloudflare R2 is intended primarily for Delta Sharing use cases in which you want to avoid data egress fees. GCS is appropriate for most other use cases. This article focuses on creating storage credentials for GCS. For Cloudflare R2, see Create a storage credential for connecting to Cloudflare R2.

To create a storage credential for access to a GCS bucket, you give Unity Catalog the ability to read and write to the bucket by assigning IAM roles on that bucket to a Databricks-generated Google Cloud service account.

Requirements

In Databricks:

  • A Databricks workspace that is enabled for Unity Catalog.

  • The CREATE STORAGE CREDENTIAL privilege on the Unity Catalog metastore attached to the workspace. Account admins and metastore admins have this privilege by default. If you need it granted, see the SQL sketch at the end of these requirements.

In your Google Cloud account:

  • A GCS bucket in the same region as the workspaces from which you want to access the data.

  • Permission to modify the access policy for that bucket.
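
If you don’t have the CREATE STORAGE CREDENTIAL privilege, a metastore admin can grant it to you. A minimal SQL sketch, using a hypothetical user name:

    -- Run as a metastore admin or account admin
    GRANT CREATE STORAGE CREDENTIAL ON METASTORE TO `someone@example.com`;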

Generate a Google Cloud service account using Catalog Explorer

  1. Log in to your Unity Catalog-enabled Databricks workspace as a user who has the CREATE STORAGE CREDENTIAL privilege on the metastore.

    The metastore admin and account admin roles both include this privilege.

  2. In the sidebar, click Catalog.

  3. At the bottom of the screen, click Storage Credentials.

  4. Click the +Add button and select Add a storage credential from the menu.

    This option does not appear if you don’t have the CREATE STORAGE CREDENTIAL privilege.

  5. On the Create a new storage credential dialog, select a Credential Type of Google Cloud Storage.

  6. Enter a Storage credential name and an optional comment.

  7. (Optional) If you want users to have read-only access to the external locations that use this storage credential, select Read only. For more information, see Mark a storage credential as read-only.

  8. Click Save.

    Databricks creates the storage credential and generates a Google Cloud service account.

  9. On the Storage credential created dialog, make a note of the service account ID, which is in the form of an email address, and click Done.
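
If you prefer to script credential creation instead of using Catalog Explorer, the Databricks CLI wraps the same Unity Catalog API. The following is a sketch rather than a verified invocation; it assumes the CLI’s storage-credentials command group and uses a hypothetical credential name. Passing an empty databricks_gcp_service_account object asks Databricks to generate the service account, whose email is returned in the response:

    # Sketch: create a GCS storage credential and let Databricks
    # generate the Google Cloud service account
    databricks storage-credentials create --json '{
      "name": "my_gcs_credential",
      "comment": "Credential for my GCS bucket",
      "databricks_gcp_service_account": {}
    }'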

Configure permissions for the service account

  1. Go to the Google Cloud console and open the GCS bucket that you want to access from your Databricks workspace.

    The bucket should be in the same region as your Databricks workspace.

  2. On the Permissions tab, click + Grant access and assign the service account the following roles:

    • Storage Legacy Bucket Reader

    • Storage Object Admin

    Use the service account’s email address as the principal identifier, or script the grants as shown in the sketch below.
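
You can assign the same roles with the gcloud CLI instead of the console. A minimal sketch, using a hypothetical bucket name and service account email:

    # Allow the Databricks-generated service account to read bucket metadata
    gcloud storage buckets add-iam-policy-binding gs://my-bucket \
      --member="serviceAccount:db-uc-sa@my-project.iam.gserviceaccount.com" \
      --role="roles/storage.legacyBucketReader"

    # Allow full control over the objects in the bucket
    gcloud storage buckets add-iam-policy-binding gs://my-bucket \
      --member="serviceAccount:db-uc-sa@my-project.iam.gserviceaccount.com" \
      --role="roles/storage.objectAdmin"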

You can now create an external location that references this storage credential.
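
For example, in SQL (the location name, bucket path, and credential name below are placeholders):

    -- Pair the storage credential with a GCS path to create an external location
    CREATE EXTERNAL LOCATION IF NOT EXISTS my_external_location
      URL 'gs://my-bucket/my-path'
      WITH (STORAGE CREDENTIAL my_gcs_credential)
      COMMENT 'External location backed by my_gcs_credential';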

Next steps

You can view, update, delete, and grant other users permission to use storage credentials. See Manage storage credentials.

You can define external locations using storage credentials. See Create an external location to connect cloud storage to Databricks.