Constraints on Databricks
Databricks supports standard SQL constraint management clauses. These clauses ensure the quality and integrity of data added to a table is automatically verified.
When a constraint is violated, the transaction fails with an error. Two types of constraints are supported:
NOT NULL
: indicates that values in specific columns cannot be null.CHECK
: indicates that a specified boolean expression must be true for each input row.
Important
Adding a constraint automatically upgrades the table writer protocol version if the previous writer version was less than 3. See Table protocol versioning to understand table protocol versioning and what it means to upgrade the protocol version.
All constraints on Databricks require Delta Lake.
Delta Live Tables has a similar concept known as expectations. See Manage data quality with Delta Live Tables.
Set a NOT NULL
constraint in Databricks
Note
SET NOT NULL
is available in Databricks Runtime 7.4 and above.
You specify NOT NULL
constraints in the schema when you create a table. You drop or add NOT NULL
constraints using the ALTER TABLE ALTER COLUMN
command.
CREATE TABLE people10m (
id INT NOT NULL,
firstName STRING,
middleName STRING NOT NULL,
lastName STRING,
gender STRING,
birthDate TIMESTAMP,
ssn STRING,
salary INT
) USING DELTA;
ALTER TABLE people10m ALTER COLUMN middleName DROP NOT NULL;
ALTER TABLE people10m ALTER COLUMN ssn SET NOT NULL;
Before adding a NOT NULL
constraint to a table, Databricks verifies that all existing rows satisfy the constraint.
If you specify a NOT NULL
constraint on a column nested within a struct, the parent struct must also be not null. Columns nested within array or map types do not accept NOT NULL
constraints.
Set a CHECK
constraint in Databricks
Note
Available in Databricks Runtime 7.4 and above.
In Databricks Runtime 7.3 LTS you can write to tables with
CHECK
constraints defined but you cannot createCHECK
constraints.
You manage CHECK
constraints using the ALTER TABLE ADD CONSTRAINT
and ALTER TABLE DROP CONSTRAINT
commands. ALTER TABLE ADD CONSTRAINT
verifies that all existing rows satisfy the constraint before adding it to the table.
CREATE TABLE people10m (
id INT,
firstName STRING,
middleName STRING,
lastName STRING,
gender STRING,
birthDate TIMESTAMP,
ssn STRING,
salary INT
) USING DELTA;
ALTER TABLE people10m ADD CONSTRAINT dateWithinRange CHECK (birthDate > '1900-01-01');
ALTER TABLE people10m DROP CONSTRAINT dateWithinRange;
See ALTER TABLE ADD CONSTRAINT and ALTER TABLE DROP CONSTRAINT.
CHECK
constraints are exposed as table properties in the output of the DESCRIBE DETAIL
and SHOW TBLPROPERTIES
commands.
ALTER TABLE people10m ADD CONSTRAINT validIds CHECK (id > 1 and id < 99999999);
DESCRIBE DETAIL people10m;
SHOW TBLPROPERTIES people10m;