Schema enforcement
Databricks validates data quality by enforcing schema on write.
Note
This article describes the default behavior for tables on Databricks, which are backed by Delta Lake. Schema enforcement does not apply to tables backed by external data.
Schema enforcement for insert operations
Databricks enforces the following rules when inserting data into a table:
All inserted columns must exist in the target table.
All column data types must match the column data types in the target table.
Note
Databricks attempts to safely cast column data types to match the target table.
Schema validation during MERGE
operations
Databricks enforces the following rules when inserting or updating data as part of a MERGE
operation:
If the data type in the source statement does not match the target column,
MERGE
tries to safely cast column data types to match the target table.The columns that are the target of an
UPDATE
orINSERT
action must exist in the target table.When using
INSERT *
orUPDATE SET *
syntax:Columns in the source dataset not present in the target table are ignored.
The source dataset must have all the columns present in the target table.
Modify a table schema
You can update the schema of a table using either explicit ALTER TABLE
statements or automatic schema evolution. See Update Delta Lake table schema.
Schema evolution has special semantics for MERGE
operations. See Automatic schema evolution for Delta Lake merge.