Jobs system table reference
Preview
This system table is in Public Preview. To access the table, the schema must be enabled in your system catalog. For more information, see Enable system table schemas.
Note
The `lakeflow` schema was previously known as `workflow`. The content of both schemas is identical. To make the `lakeflow` schema visible, you must enable it separately.
This article provides a reference for how to use the `lakeflow` system tables to monitor jobs in your account. These tables include records from all workspaces in your account deployed in the same cloud region. To see records from another region, you must view the tables from a workspace deployed in that region.
The `system.lakeflow` schema must be enabled by an account admin. You can enable it using the SystemSchemas API.
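A sketch of enabling the schema with the SystemSchemas API using curl, assuming a personal access token in `$DATABRICKS_TOKEN`; the workspace URL and metastore ID are placeholders for your own values:

```shell
# Enable the system.lakeflow schema for your metastore.
curl -X PUT \
  -H "Authorization: Bearer $DATABRICKS_TOKEN" \
  "https://<workspace-url>/api/2.0/unity-catalog/metastores/<metastore-id>/systemschemas/lakeflow"

# List system schemas and their enablement state to confirm.
curl -X GET \
  -H "Authorization: Bearer $DATABRICKS_TOKEN" \
  "https://<workspace-url>/api/2.0/unity-catalog/metastores/<metastore-id>/systemschemas"
```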
For examples of using these tables for job cost and health observability, see Monitor job costs with system tables.
Note
These tables are unavailable in the `asia-south1` region.
Available job tables
All jobs-related system tables live in the `system.lakeflow` schema. Currently, the schema hosts four tables:

- `jobs`: Tracks creation, deletion, and basic information of jobs.
- `job_tasks`: Tracks creation, deletion, and basic information of job tasks.
- `job_run_timeline`: Records the start, end, and resulting state of job runs.
- `job_task_run_timeline`: Records the start, end, and resulting state of job task runs.
Job table schema
The `jobs` table is a slowly changing dimension table. When a row changes, a new row is emitted, logically replacing the previous one.
Table path: This system table is located at `system.lakeflow.jobs`.

| Column name | Data type | Description |
| --- | --- | --- |
| `account_id` | string | The ID of the account this job belongs to. |
| `workspace_id` | string | The ID of the workspace this job belongs to. |
| `job_id` | string | The ID of the job. This key is only unique within a single workspace. |
| `name` | string | The user-supplied name of the job. |
| `description` | string | The user-supplied description of the job. Not populated for rows emitted before late August 2024. |
| `creator_id` | string | The ID of the principal who created the job. |
| `tags` | string | The user-supplied custom tags associated with this job. |
| `change_time` | timestamp | The time when the job was last modified. Timezone information is recorded at the end of the value, with `+00:00` representing UTC. |
| `delete_time` | timestamp | The time when the job was deleted by the user. Timezone information is recorded at the end of the value, with `+00:00` representing UTC. |
| `run_as` | string | The ID of the user or service principal whose permissions are used for the job run. |
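Because every change emits a new row, you can read a single job's change history straight from this table. A sketch, using column names as in the schema above and placeholder IDs:

```sql
-- Show how one job's definition evolved over time.
SELECT job_id, name, change_time, delete_time
FROM system.lakeflow.jobs
WHERE workspace_id = '<workspace-id>'
  AND job_id = '<job-id>'
ORDER BY change_time DESC
```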
Job task table schema
The job tasks table is a slowly changing dimension table. When a row changes, a new row is emitted, logically replacing the previous one.
Table path: This system table is located at `system.lakeflow.job_tasks`.

| Column name | Data type | Description |
| --- | --- | --- |
| `account_id` | string | The ID of the account this job belongs to. |
| `workspace_id` | string | The ID of the workspace this job belongs to. |
| `job_id` | string | The ID of the job. This key is only unique within a single workspace. |
| `task_key` | string | The reference key for a task in a job. This key is only unique within a single job. |
| `depends_on_keys` | array | The task keys of all upstream dependencies of this task. |
| `change_time` | timestamp | The time when the task was last modified. Timezone information is recorded at the end of the value, with `+00:00` representing UTC. |
| `delete_time` | timestamp | The time when a task was deleted by the user. Timezone information is recorded at the end of the value, with `+00:00` representing UTC. |
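As with jobs, the latest definition of each task can be pulled with a window function. A sketch, partitioning by the task's reference key:

```sql
-- Most recent row per task across all jobs in a workspace.
SELECT
  *,
  ROW_NUMBER() OVER (
    PARTITION BY workspace_id, job_id, task_key
    ORDER BY change_time DESC
  ) AS rn
FROM system.lakeflow.job_tasks
QUALIFY rn = 1
```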
Job run timeline table schema
The job run timeline table is immutable and complete at the time it is produced.
Table path: This system table is located at `system.lakeflow.job_run_timeline`.

| Column name | Data type | Description |
| --- | --- | --- |
| `account_id` | string | The ID of the account this job belongs to. |
| `workspace_id` | string | The ID of the workspace this job belongs to. |
| `job_id` | string | The ID of the job. This key is only unique within a single workspace. |
| `run_id` | string | The ID of the job run. |
| `period_start_time` | timestamp | The start time for the run or for the time period. Timezone information is recorded at the end of the value, with `+00:00` representing UTC. |
| `period_end_time` | timestamp | The end time for the run or for the time period. Timezone information is recorded at the end of the value, with `+00:00` representing UTC. |
| `trigger_type` | string | The type of trigger that can fire a run. For possible values, see Trigger type values. |
| `run_type` | string | The type of job run. For possible values, see Run type values. |
| `run_name` | string | The user-supplied run name associated with this job run. |
| `compute_ids` | array | Array containing the compute IDs for the parent job run. Use for identifying the compute used by the job run. |
| `result_state` | string | The outcome of the job run. For possible values, see Result state values. |
| `termination_code` | string | The termination code of the job run. For possible values, see Termination code values. Not populated for rows emitted before late August 2024. |
| `job_parameters` | map | The job-level parameters used in the job run. Not populated for rows emitted before late August 2024. |
Trigger type values
The possible values for the `trigger_type` column are:

- `CONTINUOUS`
- `CRON`
- `FILE_ARRIVAL`
- `ONETIME`
- `ONETIME_RETRY`
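To see which trigger types actually fire runs in your account, you can count recent runs per trigger type; a sketch:

```sql
-- Distribution of runs by trigger type over the last 7 days.
SELECT trigger_type, COUNT(DISTINCT run_id) AS runs
FROM system.lakeflow.job_run_timeline
WHERE period_start_time > CURRENT_TIMESTAMP() - INTERVAL 7 DAYS
GROUP BY trigger_type
ORDER BY runs DESC
```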
Run type values
The possible values for the `run_type` column are:

- `JOB_RUN`
- `SUBMIT_RUN`: One-time run created via POST /api/2.1/jobs/runs/submit.
- `WORKFLOW_RUN`: Job run initiated from the notebook workflow.
Result state values
The possible values for the `result_state` column are:

- `SUCCEEDED`
- `FAILED`
- `SKIPPED`
- `CANCELLED`
- `TIMED_OUT`
- `ERROR`
- `BLOCKED`
Termination code values
The possible values for the `termination_code` column are:

- `SUCCESS`
- `CANCELLED`
- `SKIPPED`
- `DRIVER_ERROR`
- `CLUSTER_ERROR`
- `REPOSITORY_CHECKOUT_FAILED`
- `INVALID_CLUSTER_REQUEST`
- `WORKSPACE_RUN_LIMIT_EXCEEDED`
- `FEATURE_DISABLED`
- `CLUSTER_REQUEST_LIMIT_EXCEEDED`
- `STORAGE_ACCESS_ERROR`
- `RUN_EXECUTION_ERROR`
- `UNAUTHORIZED_ERROR`
- `LIBRARY_INSTALLATION_ERROR`
- `MAX_CONCURRENT_RUNS_EXCEEDED`
- `MAX_SPARK_CONTEXTS_EXCEEDED`
- `RESOURCE_NOT_FOUND`
- `INVALID_RUN_CONFIGURATION`
- `CLOUD_FAILURE`
- `MAX_JOB_QUEUE_SIZE_EXCEEDED`
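These codes are useful for failure triage. For example, a sketch that surfaces the most common non-success termination codes over the last week:

```sql
-- Most frequent failure termination codes, last 7 days.
SELECT termination_code, COUNT(*) AS occurrences
FROM system.lakeflow.job_run_timeline
WHERE period_start_time > CURRENT_TIMESTAMP() - INTERVAL 7 DAYS
  AND termination_code IS NOT NULL
  AND termination_code != 'SUCCESS'
GROUP BY termination_code
ORDER BY occurrences DESC
```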
Job task run timeline table schema
The job task run timeline table is immutable and complete at the time it is produced.
Table path: This system table is located at `system.lakeflow.job_task_run_timeline`.

| Column name | Data type | Description |
| --- | --- | --- |
| `account_id` | string | The ID of the account this job belongs to. |
| `workspace_id` | string | The ID of the workspace this job belongs to. |
| `job_id` | string | The ID of the job. This key is only unique within a single workspace. |
| `run_id` | string | The ID of the task run. |
| `job_run_id` | string | The ID of the job run. Not populated for rows emitted before late August 2024. |
| `parent_run_id` | string | The ID of the parent run. Not populated for rows emitted before late August 2024. |
| `period_start_time` | timestamp | The start time for the task or for the time period. Timezone information is recorded at the end of the value, with `+00:00` representing UTC. |
| `period_end_time` | timestamp | The end time for the task or for the time period. Timezone information is recorded at the end of the value, with `+00:00` representing UTC. |
| `task_key` | string | The reference key for a task in a job. This key is only unique within a single job. |
| `compute_ids` | array | Array containing the IDs of compute used by the job task. |
| `result_state` | string | The outcome of the job task run. See the possible values below this table. |
| `termination_code` | string | The termination code of the task run. See the possible values below this table. Not populated for rows emitted before late August 2024. |
The possible values for the `result_state` column are:

- `SUCCEEDED`
- `FAILED`
- `SKIPPED`
- `CANCELLED`
- `TIMED_OUT`
- `ERROR`
- `BLOCKED`
The possible values for the `termination_code` column are:

- `SUCCESS`
- `CANCELLED`
- `SKIPPED`
- `DRIVER_ERROR`
- `CLUSTER_ERROR`
- `REPOSITORY_CHECKOUT_FAILED`
- `INVALID_CLUSTER_REQUEST`
- `WORKSPACE_RUN_LIMIT_EXCEEDED`
- `FEATURE_DISABLED`
- `CLUSTER_REQUEST_LIMIT_EXCEEDED`
- `STORAGE_ACCESS_ERROR`
- `RUN_EXECUTION_ERROR`
- `UNAUTHORIZED_ERROR`
- `LIBRARY_INSTALLATION_ERROR`
- `MAX_CONCURRENT_RUNS_EXCEEDED`
- `MAX_SPARK_CONTEXTS_EXCEEDED`
- `RESOURCE_NOT_FOUND`
- `INVALID_RUN_CONFIGURATION`
- `CLOUD_FAILURE`
- `MAX_JOB_QUEUE_SIZE_EXCEEDED`
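The task run timeline supports the same kinds of analysis at task granularity. For example, a sketch of per-task durations within a single job run, assuming the `job_run_id` and `task_key` columns described in the schema above; the job run ID is a placeholder:

```sql
-- Total duration per task for one job run.
SELECT
  task_key,
  CAST(SUM(period_end_time - period_start_time) AS LONG) AS duration_seconds
FROM system.lakeflow.job_task_run_timeline
WHERE job_run_id = '<job-run-id>'
GROUP BY task_key
ORDER BY duration_seconds DESC
```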
Example queries
This section includes sample queries you can use to get the most out of the lakeflow tables.
Get the most recent version of jobs
Because the `jobs` and `job_tasks` tables are slowly changing dimension tables, a new record is created every time a change is made. To get the most recent version of a job, you can order by the `change_time` column.
```sql
SELECT
  *,
  ROW_NUMBER() OVER (PARTITION BY workspace_id, job_id ORDER BY change_time DESC) AS rn
FROM
  system.lakeflow.jobs
QUALIFY rn = 1
```
Daily job count by workspace
This query gets the daily job count by workspace for the last 7 days:
```sql
SELECT
  workspace_id,
  COUNT(DISTINCT run_id) AS job_count,
  to_date(period_start_time) AS date
FROM system.lakeflow.job_run_timeline
WHERE
  period_start_time > CURRENT_TIMESTAMP() - INTERVAL 7 DAYS
GROUP BY ALL
```
Daily job status distribution by workspace
This query returns the daily job count by workspace for the last 7 days, broken down by the outcome of the job run. It excludes records where the run is still pending or running (that is, where `result_state` is null).
```sql
SELECT
  workspace_id,
  COUNT(DISTINCT run_id) AS job_count,
  result_state,
  to_date(period_start_time) AS date
FROM system.lakeflow.job_run_timeline
WHERE
  period_start_time > CURRENT_TIMESTAMP() - INTERVAL 7 DAYS
  AND result_state IS NOT NULL
GROUP BY ALL
```
Longest running jobs overview
This query returns the average duration of job runs, measured in seconds, grouped by job. The 90th and 95th percentile columns show the durations of each job's longest runs.
```sql
WITH job_run_duration AS (
  SELECT
    workspace_id,
    job_id,
    run_id,
    CAST(SUM(period_end_time - period_start_time) AS LONG) AS duration
  FROM
    system.lakeflow.job_run_timeline
  WHERE
    period_start_time > CURRENT_TIMESTAMP() - INTERVAL 7 DAYS
  GROUP BY ALL
),
most_recent_jobs AS (
  SELECT
    *,
    ROW_NUMBER() OVER (PARTITION BY workspace_id, job_id ORDER BY change_time DESC) AS rn
  FROM
    system.lakeflow.jobs
  QUALIFY rn = 1
)
SELECT
  t1.workspace_id,
  t1.job_id,
  first(t2.name, TRUE) AS name,
  COUNT(DISTINCT t1.run_id) AS runs,
  MEAN(t1.duration) AS mean_seconds,
  AVG(t1.duration) AS avg_seconds,
  PERCENTILE(t1.duration, 0.9) AS p90_seconds,
  PERCENTILE(t1.duration, 0.95) AS p95_seconds
FROM
  job_run_duration t1
  LEFT OUTER JOIN most_recent_jobs t2 USING (workspace_id, job_id)
GROUP BY ALL
ORDER BY mean_seconds DESC
LIMIT 100
```
Job run time for jobs executed via runSubmit (i.e., Airflow)
This query provides a historical runtime for a specific job based on the `run_name` parameter. For the query to work, you must set the `run_name`. You can also edit the length of time for analysis by updating the number of days in the `INTERVAL 60 DAYS` clause.
```sql
SELECT
  workspace_id,
  run_id,
  SUM(period_end_time - period_start_time) AS run_time
FROM system.lakeflow.job_run_timeline
WHERE
  run_type = "SUBMIT_RUN"
  AND run_name = {run_name}
  AND period_start_time > CURRENT_TIMESTAMP() - INTERVAL 60 DAYS
GROUP BY ALL
```
Job run analysis
This query provides a historical runtime for a specific job. For the query to work, you must set a `workspace_id` and `job_id`. You can also edit the length of time for analysis by updating the number of days in the `INTERVAL 60 DAYS` clause.
```sql
WITH job_run_duration AS (
  SELECT
    workspace_id,
    job_id,
    run_id,
    min(period_start_time) AS run_start,
    max(period_start_time) AS run_end,
    CAST(SUM(period_end_time - period_start_time) AS LONG) AS duration,
    FIRST(result_state, TRUE) AS result_state
  FROM
    system.lakeflow.job_run_timeline
  WHERE
    period_start_time > CURRENT_TIMESTAMP() - INTERVAL 60 DAYS
    AND workspace_id = {workspace_id}
    AND job_id = {job_id}
  GROUP BY ALL
  ORDER BY run_start DESC
),
most_recent_jobs AS (
  SELECT
    *,
    ROW_NUMBER() OVER (PARTITION BY workspace_id, job_id ORDER BY change_time DESC) AS rn
  FROM
    system.lakeflow.jobs
  QUALIFY rn = 1
)
SELECT
  t1.workspace_id,
  t1.job_id,
  t2.name,
  t1.run_id,
  t1.run_start,
  t1.run_end,
  t1.duration,
  t1.result_state
FROM job_run_duration t1
  LEFT OUTER JOIN most_recent_jobs t2 USING (workspace_id, job_id)
```
Jobs running on all-purpose compute
This query joins with the `compute.clusters` system table to return recent jobs that run on all-purpose compute instead of jobs compute.
```sql
WITH clusters AS (
  SELECT
    *,
    ROW_NUMBER() OVER (PARTITION BY workspace_id, cluster_id ORDER BY change_time DESC) AS rn
  FROM system.compute.clusters
  WHERE cluster_source = "UI" OR cluster_source = "API"
  QUALIFY rn = 1
),
job_tasks_exploded AS (
  SELECT
    workspace_id,
    job_id,
    EXPLODE(compute_ids) AS cluster_id
  FROM system.lakeflow.job_task_run_timeline
  WHERE period_start_time >= CURRENT_DATE() - INTERVAL 30 DAY
),
all_purpose_cluster_jobs AS (
  SELECT
    t1.*,
    t2.cluster_name,
    t2.owned_by,
    t2.dbr_version
  FROM job_tasks_exploded t1
    INNER JOIN clusters t2 USING (workspace_id, cluster_id)
)
SELECT * FROM all_purpose_cluster_jobs LIMIT 10;
```
Retried job runs
This query collects a list of retried job runs with the number of retries for each run.
```sql
WITH repaired_runs AS (
  SELECT
    workspace_id, job_id, run_id, COUNT(*) - 1 AS retries_count
  FROM system.lakeflow.job_run_timeline
  WHERE result_state IS NOT NULL
  GROUP BY ALL
  HAVING retries_count > 0
)
SELECT
  *
FROM repaired_runs
ORDER BY retries_count DESC
LIMIT 10;
```