event_log table-valued function
Applies to: Databricks SQL, Databricks Runtime 13.3 LTS and above
Returns the event log for materialized views, streaming tables, and DLT pipelines.
Learn more about the Delta Live Tables event log.
Note
The event_log table-valued function can be called only by the owner of a streaming table or materialized view, and a view created over the event_log table-valued function can be queried only by the owner of a streaming table or materialized view. The view cannot be shared with other users.
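For example, the owner can expose the log through a view. A minimal sketch, assuming a materialized view named my_mv; the view name my_mv_events is chosen for illustration:

> CREATE VIEW my_mv_events AS
    SELECT * FROM event_log(table(my_mv));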
Arguments
table_name: The name of a materialized view or streaming table. The name must not include a temporal specification. If the name is not qualified, the current catalog and schema are used to qualify the identifier.
pipeline_id: The string identifier of a Delta Live Tables pipeline.
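Both argument forms are shown below; my_st and <pipeline-ID> are placeholders for a streaming table name and a pipeline identifier:

-- Call with a table name
> SELECT * FROM event_log(table(my_st));

-- Call with a pipeline identifier
> SELECT * FROM event_log('<pipeline-ID>');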
Returns
id STRING NOT NULL: A unique identifier for the event log record.
sequence STRING NOT NULL: A JSON object containing metadata to identify and order events.
origin STRING NOT NULL: A JSON object containing metadata for the origin of the event, for example, cloud provider, region, user_id, or pipeline_id.
timestamp TIMESTAMP NOT NULL: The time the event was recorded, in UTC.
message STRING NOT NULL: A human-readable message describing the event.
level STRING NOT NULL: The level of logging, for example, INFO, WARN, ERROR, or METRICS.
maturity_level STRING NOT NULL: The stability of the event schema. The possible values are:
  STABLE: The schema is stable and will not change.
  NULL: The schema is stable and will not change. The value may be NULL if the record was created before the maturity_level field was added (release 2022.37).
  EVOLVING: The schema is not stable and may change.
  DEPRECATED: The schema is deprecated and the Delta Live Tables runtime may stop producing this event at any time.
error STRING: If an error occurred, details describing the error.
details STRING NOT NULL: A JSON object containing structured details of the event. This is the primary field used for analyzing events.
event_type STRING NOT NULL: The event type.
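Because details is a JSON string, individual fields can be extracted with the : JSON path operator, as the examples below demonstrate. A minimal sketch that counts events by type and schema stability, again assuming a streaming table named my_st:

> SELECT event_type, maturity_level, count(*) AS events
  FROM event_log(table(my_st))
  GROUP BY event_type, maturity_level;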
Examples
For more examples, refer to Querying the event log.
-- View the events on a materialized view
> SELECT timestamp, message, details
FROM event_log(table(my_mv))
WHERE level IN ('INFO', 'WARN', 'ERROR')
ORDER BY timestamp;
timestamp, message, details
---------------------------
2023-08-12 01:03:05.000, 'Flow "my_mv" is STARTING.', '{"flow_progress":{"status":"STARTING"}}'
-- Create a temp view with the latest update to the table/pipeline
> CREATE OR REPLACE TEMP VIEW latest_update AS
SELECT origin.update_id AS id FROM event_log('<pipeline-ID>')
WHERE event_type = 'create_update' ORDER BY timestamp DESC LIMIT 1;
-- Query lineage information
> SELECT
details:flow_definition.output_dataset as output_dataset,
details:flow_definition.input_datasets as input_dataset
FROM
event_log('<pipeline-ID>'),
latest_update
WHERE
event_type = 'flow_definition' AND origin.update_id = latest_update.id;
output_dataset, input_dataset
-----------------------------
customers, null
sales_orders_raw, null
sales_orders_cleaned, ["customers", "sales_orders_raw"]
sales_order_in_la, ["sales_orders_cleaned"]
-- Query data quality expectation history for a streaming table
-- (uses the latest_update temp view created above)
> WITH expectations_parsed AS (
SELECT
explode(
from_json(
details:flow_progress.data_quality.expectations,
"array<struct<name: string, dataset: string, passed_records: int, failed_records: int>>"
)
) row_expectations
FROM
event_log(table(my_st)),
latest_update
WHERE
event_type = 'flow_progress'
AND origin.update_id = latest_update.id
)
SELECT
row_expectations.dataset as dataset,
row_expectations.name as expectation,
SUM(row_expectations.passed_records) as passing_records,
SUM(row_expectations.failed_records) as failing_records
FROM expectations_parsed
GROUP BY
row_expectations.dataset,
row_expectations.name;
dataset, expectation, passing_records, failing_records
------------------------------------------------------
sales_orders_cleaned, valid_order_number, 4083, 0