Configure data access for ingestion
This article describes how admin users can configure access to data in a bucket in Google Cloud Storage (GCS) so that Databricks users can load data from GCS into a table in Databricks.
This article describes the following ways to configure secure access to source data:
(Recommended) Create a Unity Catalog volume.
Create a Unity Catalog external location with a storage credential.
Before you begin
Before you configure access to data in GCS, make sure you have the following:
Data in a GCS bucket in your Google Cloud account.
To access data using a Unity Catalog volume (recommended), the READ VOLUME privilege on the volume. For more information, see What are Unity Catalog volumes? and Unity Catalog privileges and securable objects.
To access data using a Unity Catalog external location, the READ FILES privilege on the external location. For more information, see Create an external location to connect cloud storage to Databricks.
A Databricks SQL warehouse. To create a SQL warehouse, see Create a SQL warehouse.
Familiarity with the Databricks SQL user interface.
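If you need to grant these privileges yourself, a metastore admin or the object owner can do so in SQL. A minimal sketch, where the volume name `main.staging.raw_files`, the external location name `gcs_landing`, and the group name `data_loaders` are all placeholder assumptions:

```sql
-- Grant read access on a Unity Catalog volume (placeholder names)
GRANT READ VOLUME ON VOLUME main.staging.raw_files TO `data_loaders`;

-- Or, grant read access on an external location instead
GRANT READ FILES ON EXTERNAL LOCATION gcs_landing TO `data_loaders`;
```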
Configure access to cloud storage
Use one of the following methods to configure access to GCS:
(Recommended) Create a Unity Catalog volume. For more information, see What are Unity Catalog volumes?.
Configure a Unity Catalog external location with a storage credential. For more information about external locations, see Create an external location to connect cloud storage to Databricks.
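As a sketch, the two options can look like the following in SQL. The bucket path, credential name, and object names are placeholders, and the storage credential (`gcs_cred`) must already exist:

```sql
-- Option: an external location that points at the GCS bucket,
-- using an existing storage credential
CREATE EXTERNAL LOCATION gcs_landing
  URL 'gs://my-bucket/landing'
  WITH (STORAGE CREDENTIAL gcs_cred);

-- Option (recommended): an external volume backed by the bucket path
CREATE EXTERNAL VOLUME main.staging.raw_files
  LOCATION 'gs://my-bucket/landing';
```

Note that an external volume's path must be governed by an existing external location, so in practice the volume option builds on an external location that an admin has already configured.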
Clean up
If you no longer want to keep the associated resources, you can delete them from your cloud account and from Databricks.
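For the Unity Catalog objects, cleanup is a pair of DROP statements; the object names below are the same placeholder assumptions used elsewhere, and the GCS bucket itself is deleted separately in Google Cloud:

```sql
-- Remove the Unity Catalog objects (placeholder names);
-- dropping them does not delete the data in the GCS bucket
DROP VOLUME IF EXISTS main.staging.raw_files;
DROP EXTERNAL LOCATION IF EXISTS gcs_landing;
```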
Next steps
After you complete the steps in this article, users can run the COPY INTO command to load data from the GCS bucket into your Databricks workspace.
To load data using a Unity Catalog volume or external location, see Load data using COPY INTO with Unity Catalog volumes or external locations.
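For example, a load from a volume might look like the following sketch, where the target table `main.staging.sales`, the volume path, and the CSV format options are placeholder assumptions:

```sql
-- Load CSV files from a Unity Catalog volume into a table
COPY INTO main.staging.sales
  FROM '/Volumes/main/staging/raw_files/'
  FILEFORMAT = CSV
  FORMAT_OPTIONS ('header' = 'true', 'inferSchema' = 'true')
  COPY_OPTIONS ('mergeSchema' = 'true');
```

COPY INTO is idempotent: files that were already loaded are skipped on subsequent runs, which makes it safe to rerun as new files land in the bucket.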