This article explains how to configure and manage SQL warehouses using the Databricks SQL UI, including how to create, edit, and monitor existing SQL warehouses. You will also learn how to convert existing classic warehouses into serverless warehouses. You can also create SQL warehouses using the SQL Warehouse API or Terraform.
A SQL warehouse is a compute resource that lets you run SQL commands on data objects within Databricks SQL. Compute resources are infrastructure resources that provide processing capabilities in the cloud.
To navigate to the SQL warehouse dashboard, click SQL Warehouses in the sidebar. By default, warehouses are sorted by state (running warehouses first), then in alphabetical order.
To help you get started, Databricks creates a small SQL warehouse called Starter Warehouse automatically. You can edit or delete this SQL warehouse.
SQL warehouses have the following requirements:
To create a SQL warehouse, you must be a workspace admin or a user with unrestricted cluster creation permissions.
To manage a SQL warehouse, you must be a workspace admin or have the Can Manage permission on the SQL warehouse.
Create warehouses using the SQL Warehouses page in the web UI, the SQL Warehouse API, or Terraform. The default warehouse settings create an efficient and high-performing SQL warehouse. You can edit some of the settings to fit your workload needs.
To create a SQL warehouse using the web UI:
Click SQL Warehouses in the sidebar.
Click Create SQL Warehouse.
Enter a Name for the warehouse.
Accept the default warehouse settings or edit them. See warehouse settings.
(Optional) Configure advanced options. See Advanced options.
(Optional) Configure warehouse permissions.
Your SQL warehouse is now created and started.
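The same creation flow is available programmatically through the SQL Warehouse API. The sketch below builds a create request with the settings discussed in this article (name, cluster size, auto stop, scaling). It is a minimal illustration, not a definitive client: the endpoint path and field names reflect the 2.0 SQL Warehouse API as commonly documented, and the `DATABRICKS_HOST`/`DATABRICKS_TOKEN` environment variables are placeholder names for this example; verify both against the current API reference before relying on them.

```python
import json
import os
import urllib.request  # stdlib only; a library like `requests` would also work


def build_create_payload(name, cluster_size="X-Large", auto_stop_mins=45,
                         min_num_clusters=1, max_num_clusters=1):
    """Assemble the JSON body for POST /api/2.0/sql/warehouses.

    Defaults mirror the UI defaults described in this article.
    """
    return {
        "name": name,
        "cluster_size": cluster_size,
        "auto_stop_mins": auto_stop_mins,
        "min_num_clusters": min_num_clusters,
        "max_num_clusters": max_num_clusters,
    }


def create_warehouse(host, token, payload):
    """Send the create request; the response JSON includes the warehouse id."""
    req = urllib.request.Request(
        f"{host}/api/2.0/sql/warehouses",
        data=json.dumps(payload).encode(),
        headers={"Authorization": f"Bearer {token}",
                 "Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


if __name__ == "__main__":
    payload = build_create_payload("analytics-wh")
    # Placeholder env var names for this sketch; only send if both are set.
    host, token = os.getenv("DATABRICKS_HOST"), os.getenv("DATABRICKS_TOKEN")
    if host and token:
        print(create_warehouse(host, token, payload))
    else:
        print(json.dumps(payload, indent=2))
```

Separating payload construction from the HTTP call keeps the request body easy to inspect and test before anything is sent to the workspace.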
Creating a SQL warehouse in the UI allows you to configure the following settings:
Cluster Size represents the number of cluster workers and the size of compute resources available to run your queries and dashboards. The default is X-Large. To reduce query latency, increase the size. See Cluster size.
Auto Stop determines whether the warehouse stops if it’s idle for the specified number of minutes. The default is 45 minutes, which is recommended for typical use. The minimum is 10 minutes. Idle SQL warehouses continue to accumulate DBU and cloud instance charges until they are stopped.
Scaling sets the minimum and maximum number of clusters over which queries sent to the warehouse are distributed. The default is a minimum and a maximum of one cluster. Increase the maximum number of clusters to handle more concurrent users. Databricks recommends one cluster for every 10 concurrent queries.
To maintain optimal performance, Databricks periodically recycles clusters. During a recycle period, you may temporarily see a cluster count that exceeds the maximum as Databricks transitions new workloads to the new cluster and waits to recycle the old cluster until all open workloads have completed.
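The scaling guideline above (roughly one cluster per 10 concurrent queries) can be expressed as a small helper. This is just a worked example of the rule of thumb stated in this article, not an official sizing formula:

```python
import math


def recommended_max_clusters(concurrent_queries, queries_per_cluster=10):
    """Rule of thumb from the docs: about one cluster per 10 concurrent queries.

    Always returns at least 1, since a warehouse needs at least one cluster.
    """
    return max(1, math.ceil(concurrent_queries / queries_per_cluster))
```

For example, a workload expected to peak at 25 concurrent queries would suggest setting the maximum to 3 clusters.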
Configure the following advanced options by expanding the Advanced options area when you create a new SQL warehouse or edit an existing SQL warehouse. You can also configure these options using the SQL Warehouse API.
Tags: Tags allow you to monitor the cost of cloud resources used by users and groups in your organization. You specify tags as key-value pairs.
Unity Catalog: If Unity Catalog is enabled for the workspace, it is the default for all new warehouses in the workspace. If Unity Catalog is not enabled for your workspace, you do not see this option. For more information about Unity Catalog, see Unity Catalog.
Spot instance policy: The spot instance policy determines whether workers use only on-demand instances or a combination of on-demand and spot instances. Cost Optimized (the default) uses mostly spot instances and one on-demand instance. Reliability Optimized uses only on-demand instances.
Channel: Use the Preview channel to test new functionality, including your queries and dashboards, before it becomes the Databricks SQL standard.
The release notes list what's in the latest preview version.
Databricks recommends against using a preview version for production workloads. Because only workspace admins can view a warehouse's properties, including its channel, consider indicating in the warehouse's name that it uses a preview version, so that users don't rely on it for production workloads.
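When configuring these advanced options through the SQL Warehouse API rather than the UI, they appear as additional fields in the create or edit request body. The sketch below shows one plausible shape for those fields (a `custom_tags` list of key-value pairs, a spot instance policy enum, and a channel object); the exact field names and enum values are assumptions to verify against the current API reference:

```python
def advanced_options(tags, spot_policy="COST_OPTIMIZED",
                     channel="CHANNEL_NAME_CURRENT"):
    """Build the advanced-options portion of a warehouse create/edit payload.

    tags:         dict of key-value pairs for cost monitoring.
    spot_policy:  "COST_OPTIMIZED" (default) or "RELIABILITY_OPTIMIZED".
    channel:      "CHANNEL_NAME_CURRENT" or "CHANNEL_NAME_PREVIEW".
    Field names here are assumptions; check the SQL Warehouse API docs.
    """
    return {
        "tags": {
            "custom_tags": [{"key": k, "value": v} for k, v in tags.items()],
        },
        "spot_instance_policy": spot_policy,
        "channel": {"name": channel},
    }
```

These fields would be merged into the same request body used to create or edit the warehouse.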
To manually start a stopped SQL warehouse, click SQL Warehouses in the sidebar then click the start icon next to the warehouse.
If a SQL warehouse is stopped and you attempt to run a job or query that uses it, Databricks starts the warehouse automatically. A warehouse also restarts automatically if you open a saved query in the SQL editor that is assigned to a stopped warehouse, or if you open a dashboard with a dashboard-level warehouse assigned to it.
Manage SQL warehouses using the web UI or the SQL Warehouse API.
To stop a running warehouse, click the stop icon next to the warehouse.
To start a stopped warehouse, click the start icon next to the warehouse.
To delete a warehouse, click the kebab menu, then click Delete. Note: Warehouses deleted within the past 14 days can be restored by contacting Support.
To edit a warehouse, click the kebab menu, then click Edit.
To add and edit permissions, click the kebab menu, then click Permissions. To learn about permission levels, see SQL warehouse access control.
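The lifecycle actions above (start, stop, delete) also map onto the SQL Warehouse API. The helper below sketches the routing, assuming the commonly documented 2.0 paths (`POST .../{id}/start`, `POST .../{id}/stop`, `DELETE .../{id}`); treat the exact paths as assumptions to confirm against the API reference:

```python
def warehouse_action_url(host, warehouse_id, action):
    """Return (HTTP method, URL) for a warehouse lifecycle action.

    start/stop are POSTs to .../{id}/start and .../{id}/stop;
    delete is a DELETE on the warehouse resource itself.
    """
    base = f"{host}/api/2.0/sql/warehouses/{warehouse_id}"
    if action == "delete":
        return ("DELETE", base)
    if action in ("start", "stop"):
        return ("POST", f"{base}/{action}")
    raise ValueError(f"unsupported action: {action}")
```

A caller would pair this with an authenticated HTTP client, sending the returned method against the returned URL.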
To monitor a SQL warehouse, click the name of a SQL warehouse and then the Monitoring tab. On the Monitoring tab, you see the following monitoring elements:
Live statistics: Live statistics show the currently running and queued queries, active SQL sessions, the warehouse status, and the current cluster count.
Time scale filter: The monitoring time scale filter sets the time range for the query count chart, running cluster chart, and the query history and event log table. The default time range is 8 hours, but you can specify 24 hours, 7 days, or 14 days. You can also click and drag on the bar chart to change the time range.
Query count chart: The query count chart shows the number of queries running or queued on the warehouse during the selected time frame.
Running clusters chart: The running clusters chart shows the number of clusters allocated to the warehouse during the selected time frame. During a cluster recycle, this count may temporarily exceed the configured maximum.
Query history table: The query history table shows all of the queries active during the selected time frame, their start time and duration, and the user that executed the query. You can filter the queries by user, query duration, query status, and query type.
The cluster count can be greater than one only if scaling is enabled and configured.
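The query history shown on the Monitoring tab can also be retrieved programmatically via the Query History API. The sketch below builds a filter mirroring the UI's default 8-hour window, scoped to one warehouse. The endpoint (`GET /api/2.0/sql/history/queries`) and the `filter_by` field names are assumptions based on the commonly documented Query History API; verify them before use:

```python
import time


def query_history_filter(warehouse_id, hours=8):
    """Build a filter body for the Query History API.

    Mirrors the Monitoring tab's default 8-hour time range. Field names
    (filter_by, warehouse_ids, query_start_time_range) are assumptions;
    check the Query History API reference.
    """
    now_ms = int(time.time() * 1000)
    return {
        "filter_by": {
            "warehouse_ids": [warehouse_id],
            "query_start_time_range": {
                "start_time_ms": now_ms - hours * 3600 * 1000,
                "end_time_ms": now_ms,
            },
        },
        "max_results": 100,
    }
```

Passing `hours=24`, `hours=168`, or `hours=336` would correspond to the other time-scale filter options (24 hours, 7 days, 14 days).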
For information on how classic and pro SQL warehouses are sized and how autoscaling works, see SQL warehouse sizing, scaling, and queuing behavior.