Warehouse events system table reference
Preview
This system table is in Public Preview. To access the table, the schema must be enabled in your system
catalog. For more information, see Enable system table schemas.
In this article, you learn how to use the warehouse events system table to monitor and manage the SQL warehouses in your workspaces. This table records a row for every time a warehouse starts, stops, runs, and scales up and down. You can use the sample queries in this article with alerts to keep you informed of changes to your warehouses.
The warehouse events system table is located at system.compute.warehouse_events
.
Logged warehouse event types
This system table logs the following types of events:
SCALED_UP
: A new cluster was added to the warehouse.SCALED_DOWN
: A cluster was removed from the warehouse.STOPPING
: The warehouse is in the process of stopping.RUNNING
: The warehouse is actively running.STARTING
: The warehouse is in the process of starting up.STOPPED
: The warehouse has completely stopped running.
Warehouse events schema
The warehouse_events
system table uses the following schema:
Column name |
Data type |
Description |
Example |
---|---|---|---|
|
string |
The ID of the Databricks account. |
|
|
string |
The ID of the workspace where the warehouse is deployed. |
|
|
string |
The ID of SQL warehouse the event is related to. |
|
|
string |
The type of warehouse event. Possible values are |
|
|
integer |
The number of clusters that are actively running. |
|
|
timestamp |
Timestamp of when the event took place in UTC. |
|
Sample queries
The following sample queries are templates. Plug in whatever values make sense for your organization. You can also add alerts to these queries to help you stay informed about changes to your warehouses. See Create an alert.
Use the following sample queries to gain insight into warehouse behavior:
Which warehouses are actively running and for how long?
This query identifies which warehouses are currently active along with their running time in hours.
USE CATALOG `system`;
SELECT
we.warehouse_id,
we.event_time,
TIMESTAMPDIFF(MINUTE, we.event_time, CURRENT_TIMESTAMP()) / 60.0 AS running_hours,
we.cluster_count
FROM
compute.warehouse_events we
WHERE
we.event_type = 'RUNNING'
AND NOT EXISTS (
SELECT 1
FROM compute.warehouse_events we2
WHERE we2.warehouse_id = we.warehouse_id
AND we2.event_time > we.event_time
)
Alert opportunity: As a workspace admin you might want to be alerted if a warehouse is running longer than expected. For example, you can use the query results to set an alert condition to trigger when the running hours exceed a certain threshold.
Identify warehouses that are upscaled longer than expected
This query identifies which warehouses are currently active along with their running time in hours.
use catalog `system`;
SELECT
we.warehouse_id,
we.event_time,
TIMESTAMPDIFF(MINUTE, we.event_time, CURRENT_TIMESTAMP()) / 60.0 AS upscaled_hours,
we.cluster_count
FROM
compute.warehouse_events we
WHERE
we.event_type = 'SCALED_UP'
AND we.cluster_count >= 2
AND NOT EXISTS (
SELECT 1
FROM compute.warehouse_events we2
WHERE we2.warehouse_id = we.warehouse_id
AND (
(we2.event_type = 'SCALED_DOWN') OR
(we2.event_type = 'SCALED_UP' AND we2.cluster_count < 2)
)
AND we2.event_time > we.event_time
)
Alert opportunity: Alerting on this condition can help you monitor resources and cost. You could set an alert for when the upscaled hours exceed a certain limit.
Warehouses that start for the first time
This query informs you about new warehouses that are starting for the first time.
use catalog `system`;
SELECT
we.warehouse_id,
we.event_time,
we.cluster_count
FROM
compute.warehouse_events we
WHERE
(we.event_type = 'STARTING' OR we.event_type = 'RUNNING')
AND NOT EXISTS (
SELECT 1
FROM compute.warehouse_events we2
WHERE we2.warehouse_id = we.warehouse_id
AND we2.event_time < we.event_time
)
Alert opportunity: Alerting on new warehouses can help your organization track resource allocation. For example, you could set an alert that’s triggered every time a new warehouse starts.
Investigate billing charges
If you want to understand specifically what a warehouse was doing to generate billing charges, this query can tell you the exact dates and times the warehouse scaled up or down, or started and stopped.
use catalog `system`;
SELECT
we.warehouse_id AS warehouse_id,
we.event_type AS event,
we.event_time AS event_time,
we.cluster_count AS cluster_count
FROM
compute.warehouse_events AS we
WHERE
we.event_type IN (
'STARTING', 'RUNNING', 'STOPPING', 'STOPPED',
'SCALING_UP', 'SCALED_UP', 'SCALING_DOWN', 'SCALED_DOWN'
)
AND MONTH(we.event_time) = 7
AND YEAR(we.event_time) = YEAR(CURRENT_DATE())
AND we.warehouse_id = '19c9d68652189278'
ORDER BY
event_time DESC
Which warehouses haven’t been used in the last 30 days?
This query helps you identify unused resources, providing an opportunity for cost optimization.
use catalog `system`;
SELECT
we.warehouse_id,
we.event_time,
we.event_type,
we.cluster_count
FROM
compute.warehouse_events AS we
WHERE
we.warehouse_id IN (
SELECT DISTINCT
warehouse_id
FROM
compute.warehouse_events
WHERE
MONTH(event_time) = 6
AND YEAR(event_time) = YEAR(CURRENT_DATE())
)
AND we.warehouse_id NOT IN (
SELECT DISTINCT
warehouse_id
FROM
compute.warehouse_events
WHERE
MONTH(event_time) = 7
AND YEAR(event_time) = YEAR(CURRENT_DATE())
)
ORDER BY
event_time DESC
Alert opportunity: Receiving an alert on unused resources could help your organization optimize costs. For example, you could set an alert that’s triggered when the query detects an unused warehouse.
Warehouses with the most uptime in a month
This query shows which warehouses have been used the most during a specific month. This query uses July as an example.
use catalog `system`;
SELECT
warehouse_id,
SUM(TIMESTAMPDIFF(MINUTE, start_time, end_time)) / 60.0 AS uptime_hours
FROM (
SELECT
starting.warehouse_id,
starting.event_time AS start_time,
(
SELECT
MIN(stopping.event_time)
FROM
compute.warehouse_events AS stopping
WHERE
stopping.warehouse_id = starting.warehouse_id
AND stopping.event_type = 'STOPPED'
AND stopping.event_time > starting.event_time
) AS end_time
FROM
compute.warehouse_events AS starting
WHERE
starting.event_type = 'STARTING'
AND MONTH(starting.event_time) = 7
AND YEAR(starting.event_time) = YEAR(CURRENT_DATE())
) AS warehouse_uptime
WHERE
end_time IS NOT NULL
GROUP BY
warehouse_id
ORDER BY
uptime_hours DESC
Alert opportunity: You might want to keep track of high-utilization warehouses. For example, you could set an alert that’s triggered when the uptime hours for a warehouse exceed a specific threshold.
Warehouses that spent the most time upscaled during a month
This query informs you about warehouses that have spent significant time in the upscaled state during a month. This query uses July as an example.
use catalog `system`;
SELECT
warehouse_id,
SUM(TIMESTAMPDIFF(MINUTE, upscaled_time, downscaled_time)) / 60.0 AS upscaled_hours
FROM (
SELECT
upscaled.warehouse_id,
upscaled.event_time AS upscaled_time,
(
SELECT
MIN(downscaled.event_time)
FROM
compute.warehouse_events AS downscaled
WHERE
downscaled.warehouse_id = upscaled.warehouse_id
AND (downscaled.event_type = 'SCALED_DOWN' OR downscaled.event_type = 'STOPPED')
AND downscaled.event_time > upscaled.event_time
) AS downscaled_time
FROM
compute.warehouse_events AS upscaled
WHERE
upscaled.event_type = 'SCALED_UP'
AND upscaled.cluster_count >= 2
AND MONTH(upscaled.event_time) = 7
AND YEAR(upscaled.event_time) = YEAR(CURRENT_DATE())
) AS warehouse_upscaled
WHERE
downscaled_time IS NOT NULL
GROUP BY
warehouse_id
ORDER BY
upscaled_hours DESC
Alert opportunity: You might want to keep track of high-utilization warehouses. For example, you could set an alert that’s triggered when the uptime hours for a warehouse exceed a specific threshold.