Automate Unity Catalog setup using Terraform
You can automate Unity Catalog setup by using the Databricks Terraform provider. This article provides links to the Terraform provider Unity Catalog deployment guide and resource reference documentation, along with requirements (“Before you begin”) and validation and deployment tips.
Before you begin
To automate Unity Catalog setup using Terraform, you must meet the following requirements:
Your Databricks account must be on the Premium plan.
In Google Cloud, you must have the ability to create GCS buckets and assign permissions to the GCS buckets you create.
You must have at least one Databricks workspace that you want to use with Unity Catalog. See Create a workspace using the account console.
To use the Databricks Terraform provider to configure a metastore for Unity Catalog, storage for the metastore, any external storage, and all of their related access credentials, you must have the following:
A Google Cloud account.
A Google Cloud project in the account.
The Databricks Terraform provider, version 1.8.0 or higher. Always use the latest version of the provider.
A Databricks on Google Cloud account in the project.
A Google Account and a Google service account with the required permissions.
On your local development machine, you must have:
The Terraform CLI. See Download Terraform on the Terraform website.
The Google Cloud SDK, signed in through the gcloud auth application-default login --project=<project-id> command, where <project-id> is the ID of the target Google Cloud project. For more details, see Installing Google Cloud SDK and Authorize the gcloud CLI on the Google Cloud website.
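One way to enforce the provider version requirement listed above is a version constraint in the Terraform configuration itself. The following shell sketch writes a minimal versions.tf file. The file name, the CONFIG_DIR variable, and the choice to generate the file from the shell are assumptions made for illustration, and the command overwrites any existing versions.tf in that directory.

```shell
# CONFIG_DIR is a hypothetical variable naming the Terraform configuration directory.
CONFIG_DIR="${CONFIG_DIR:-.}"

# Write a version constraint so that Terraform rejects provider releases older than 1.8.0.
cat > "$CONFIG_DIR/versions.tf" <<'EOF'
terraform {
  required_providers {
    databricks = {
      source  = "databricks/databricks"
      version = ">= 1.8.0"
    }
  }
}
EOF
```

Running terraform init in that directory then downloads a provider release that satisfies the constraint.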
To use the Databricks Terraform provider to configure all other Unity Catalog infrastructure components, you must have the following:
A Databricks workspace.
A Databricks personal access token, to allow Terraform to call the Databricks APIs within your Databricks workspace. See also Monitor and manage access to personal access tokens.
On your local development machine, you must have:
The Terraform CLI. See Download Terraform on the Terraform website.
One of the following:
Databricks CLI version 0.205 or above, configured with your Databricks personal access token by running databricks configure --host <workspace-url> --profile <some-unique-profile-name>. See Install or update the Databricks CLI and Databricks personal access token authentication.
The following two Databricks environment variables:
DATABRICKS_HOST, set to the value of your Databricks workspace instance URL, for example https://1234567890123456.7.gcp.databricks.com
DATABRICKS_TOKEN, set to the value of your Databricks personal access token. See also Monitor and manage access to personal access tokens.
To set these environment variables, see your operating system’s documentation.
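As a minimal sketch, assuming a POSIX-compatible shell such as bash or zsh on Linux or macOS, the two variables could be set for the current session as follows. The token value is a placeholder, and check_databricks_env is a hypothetical helper that is not part of any Databricks tool.

```shell
# Placeholder values for illustration; substitute your own workspace URL and token.
export DATABRICKS_HOST="https://1234567890123456.7.gcp.databricks.com"
export DATABRICKS_TOKEN="<personal-access-token>"

# Confirm that both variables are set, without printing the token itself.
check_databricks_env() {
  [ -n "${DATABRICKS_HOST:-}" ] || { echo "DATABRICKS_HOST is not set" >&2; return 1; }
  [ -n "${DATABRICKS_TOKEN:-}" ] || { echo "DATABRICKS_TOKEN is not set" >&2; return 1; }
  echo "Databricks environment variables are set"
}
```

Variables exported this way last only for the current shell session. Avoid committing a real token to a script or to version control.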
Note
As a security best practice, when you authenticate with automated tools, systems, scripts, and apps, Databricks recommends that you use personal access tokens belonging to service principals instead of workspace users. To create tokens for service principals, see Manage tokens for a service principal.
Terraform provider Unity Catalog deployment guide and resource reference documentation
To learn how to deploy all prerequisites and enable Unity Catalog for a workspace, see Deploying pre-requisite resources and enabling Unity Catalog in the Databricks Terraform provider documentation.
If you already have some Unity Catalog infrastructure components in place, you can use Terraform to deploy additional Unity Catalog infrastructure components as needed. See each section of the guide referenced in the previous paragraph and the Unity Catalog section of the Databricks Terraform provider documentation.
Validate, plan, deploy, or destroy the resources
To validate the syntax of the Terraform configurations without deploying them, run the terraform validate command.
To show the actions that Terraform would take to deploy the configurations, run the terraform plan command. This command does not actually deploy the configurations.
To deploy the configurations, run the terraform apply command.
To delete the deployed resources, run the terraform destroy command.
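The commands in this section can be chained into one script. The following is a sketch under stated assumptions, not a prescribed workflow: the run and deploy_unity_catalog functions and the DRY_RUN switch are hypothetical names, and the terraform init step is added because Terraform must download the declared providers before it can validate a configuration. Deployment uses terraform apply, the Terraform CLI command that applies a configuration.

```shell
# Run a command, or only print it when DRY_RUN=1 (a hypothetical switch for this sketch).
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '+ %s\n' "$*"
  else
    "$@"
  fi
}

# Validate, preview, and then deploy the configurations in the current directory.
# Each step must succeed before the next one runs.
deploy_unity_catalog() {
  run terraform init || return 1
  run terraform validate || return 1
  # Save the plan to a file so that apply executes exactly what was previewed.
  run terraform plan -out=tfplan || return 1
  run terraform apply tfplan || return 1
}
```

Calling DRY_RUN=1 deploy_unity_catalog prints the four commands without running them, which is a cheap way to review the sequence. terraform destroy is deliberately left out of the function so that deleting resources stays a separate, explicit step.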