Enable data access configuration

This article describes how Databricks workspace admins configure data access for all SQL warehouses using the UI.

Note

If your workspace is enabled for Unity Catalog, you don’t need to perform the steps in this article. Unity Catalog supports SQL warehouses by default.

Databricks recommends using Unity Catalog volumes or external locations to connect to cloud object storage instead of instance profiles. Unity Catalog simplifies the security and governance of your data by providing a central place to administer and audit data access across multiple workspaces in your account. See What is Unity Catalog? and Recommendations for using external locations.

To configure all SQL warehouses using the REST API, see SQL Warehouses API.
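
For example, here is a minimal Python sketch of updating the workspace-level warehouse configuration through the REST API. It assumes the /api/2.0/sql/config/warehouses endpoint and a data_access_config list of key-value pairs; verify the request and response fields against the SQL Warehouses API reference before relying on them.

    # Sketch: update the data access configuration for all SQL warehouses.
    # Assumes the endpoint /api/2.0/sql/config/warehouses and a
    # data_access_config field holding key-value pairs; verify both against
    # the SQL Warehouses API reference. Note that saving this configuration
    # restarts all running SQL warehouses.
    import requests

    HOST = "https://<workspace-url>"    # your workspace URL
    TOKEN = "<personal-access-token>"   # a Databricks personal access token
    headers = {"Authorization": f"Bearer {TOKEN}"}

    # Read the current configuration first so unrelated fields are preserved.
    resp = requests.get(f"{HOST}/api/2.0/sql/config/warehouses", headers=headers)
    resp.raise_for_status()
    body = resp.json()

    # Add or replace metastore properties.
    body["data_access_config"] = [
        {"key": "spark.sql.hive.metastore.version", "value": "2.3.9"},
    ]

    resp = requests.put(f"{HOST}/api/2.0/sql/config/warehouses",
                        headers=headers, json=body)
    resp.raise_for_status()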

Important

Changing these settings restarts all running SQL warehouses.

For a general overview of how to enable access to data, see Access control lists.

Requirements

  • You must be a Databricks workspace admin to configure settings for all SQL warehouses.

Configure a Google Cloud service account

To configure all warehouses to use a Google Cloud service account when accessing Google Cloud Storage (GCS):

  1. In Google Cloud Platform, create a service account that has permissions on the underlying Google Cloud Platform services required to access your Google Cloud Storage assets (example commands are shown after these steps).

  2. Click your username in the top bar of the workspace and select Admin Settings from the drop-down.

  3. Click the Compute tab.

  4. Click Manage next to SQL warehouses.

  5. In the Google Service Account field, enter the email address of the service account whose identity will be used to launch all SQL warehouses.

    All queries running on these warehouses will have access to underlying Google Cloud Platform services scoped to the permissions granted to this service account in Google Cloud Platform.

  6. Click Save.
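
As an illustration of step 1, the following gcloud sketch creates a service account and grants it access to a bucket. The account, project, and bucket names are placeholders, and roles/storage.objectAdmin is just one possible choice; grant the narrowest role that fits your workload.

    # Create a service account for SQL warehouse data access
    # (names and project are placeholders).
    gcloud iam service-accounts create sql-warehouse-sa \
        --display-name="SQL warehouse data access"

    # Grant the service account access to a GCS bucket; choose the
    # narrowest role that fits (objectViewer for read-only workloads).
    gcloud storage buckets add-iam-policy-binding gs://my-bucket \
        --member="serviceAccount:sql-warehouse-sa@my-project.iam.gserviceaccount.com" \
        --role="roles/storage.objectAdmin"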

Configure data access properties for SQL warehouses

  1. Click your username in the top bar of the workspace and select Admin Settings from the drop-down.

  2. Click the Compute tab.

  3. Click Manage next to SQL warehouses.

  4. In the Data Access Configuration textbox, specify key-value pairs containing metastore properties (example entries are shown after these steps).

    Important

    To set a Spark configuration property to the value of a secret without exposing the secret value to Spark, set the value to {{secrets/<secret-scope>/<secret-name>}}. Replace <secret-scope> with the secret scope and <secret-name> with the secret name. The value must start with {{secrets/ and end with }}. For more information about this syntax, see Syntax for referencing secrets in a Spark configuration property or environment variable.

  5. Click Save.
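
For example, the following hypothetical entries (one property per line, with the key and value separated by a space) set a metastore version and read a JDBC password from a secret. The scope and secret names are placeholders.

    spark.sql.hive.metastore.version 2.3.9
    spark.hadoop.javax.jdo.option.ConnectionPassword {{secrets/hive-scope/metastore-password}}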

You can also configure data access properties using the Databricks Terraform provider and databricks_sql_global_config.
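
For instance, here is a minimal Terraform sketch. The google_service_account and data_access_config arguments are taken from the provider documentation, but verify them against the databricks_sql_global_config reference for your provider version.

    resource "databricks_sql_global_config" "this" {
      # Identity all SQL warehouses use to access Google Cloud Storage.
      google_service_account = "sql-warehouse-sa@my-project.iam.gserviceaccount.com"

      # Equivalent of the Data Access Configuration textbox entries.
      data_access_config = {
        "spark.sql.hive.metastore.version" = "2.3.9"
      }
    }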

Supported properties

  • For an entry that ends with *, all properties with that prefix are supported.

    For example, spark.sql.hive.metastore.* indicates that both spark.sql.hive.metastore.jars and spark.sql.hive.metastore.version are supported, as well as any other property that starts with spark.sql.hive.metastore.

  • For properties whose values contain sensitive information, you can store the sensitive information in a secret and set the property’s value to a secret reference using the following syntax: {{secrets/<secret-scope>/<secret-name>}}.

The following properties are supported for SQL warehouses:

  • spark.databricks.hive.metastore.glueCatalog.enabled

  • spark.sql.hive.metastore.*

  • spark.sql.warehouse.dir

  • spark.hadoop.datanucleus.*

  • spark.hadoop.fs.*

  • spark.hadoop.hive.*

  • spark.hadoop.javax.jdo.option.*

  • spark.hive.*

For more information about how to set these properties, see External Hive metastore.
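
To illustrate, a hypothetical Data Access Configuration for an external Hive metastore might combine several of the properties above. The host, database, driver, credentials, and versions are placeholders; External Hive metastore remains the authoritative reference for these settings.

    spark.sql.hive.metastore.version 2.3.9
    spark.sql.hive.metastore.jars builtin
    spark.hadoop.javax.jdo.option.ConnectionURL jdbc:mysql://<metastore-host>:3306/metastore
    spark.hadoop.javax.jdo.option.ConnectionDriverName org.mariadb.jdbc.Driver
    spark.hadoop.javax.jdo.option.ConnectionUserName <metastore-user>
    spark.hadoop.javax.jdo.option.ConnectionPassword {{secrets/<secret-scope>/<secret-name>}}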