Delta table properties reference
Delta Lake reserves table properties whose names start with the delta. prefix. These properties can have specific meanings and affect table behavior when set.
Note
All operations that set or update table properties conflict with other concurrent write operations, causing them to fail. Databricks recommends you modify a table property only when there are no concurrent write operations on the table.
How do table properties and SparkSession properties interact?
Delta table properties are set per table. If a property is set on a table, that setting is followed by default.
Some table properties have associated SparkSession configurations, which always take precedence over table properties. Examples include the spark.databricks.delta.autoCompact.enabled and spark.databricks.delta.optimizeWrite.enabled configurations, which turn on auto compaction and optimized writes at the SparkSession level rather than the table level. Databricks recommends using table-scoped configurations for most workloads.
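The difference between the two scopes can be sketched as follows. This is a minimal illustration, not a recommended configuration: main.sales.orders is a hypothetical table name, and the table property name delta.autoOptimize.optimizeWrite is an assumption, since the property name is not spelled out in the text above.

```sql
-- Session scope: applies to every Delta write in this SparkSession and
-- takes precedence over the table property.
SET spark.databricks.delta.optimizeWrite.enabled = true;

-- Table scope: applies only to this table.
-- main.sales.orders is a hypothetical table name; the property name is assumed.
ALTER TABLE main.sales.orders
  SET TBLPROPERTIES ('delta.autoOptimize.optimizeWrite' = 'true');
```

Because the session configuration wins when both are set, leaving it unset lets each table's own property decide the behavior.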
For every Delta table property, you can set a default value for new tables using a SparkSession configuration, overriding the built-in default. This setting only affects new tables and does not override or replace properties set on existing tables. The SparkSession configuration uses a different prefix than the table property, as shown in the following table:
Delta Lake conf | SparkSession conf |
---|---|
delta.<conf> | spark.databricks.delta.properties.defaults.<conf> |
For example, to set the delta.appendOnly = true property for all new Delta Lake tables created in a session, set the following:
SET spark.databricks.delta.properties.defaults.appendOnly = true
To modify table properties of existing tables, use SET TBLPROPERTIES.
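As a minimal sketch of changing a property on an existing table, the statements below set delta.appendOnly and then list the table's properties to confirm the change. The table name main.sales.orders is hypothetical.

```sql
-- Set a property on an existing table (hypothetical table name).
-- Run this only when no concurrent writes are in progress on the table.
ALTER TABLE main.sales.orders
  SET TBLPROPERTIES ('delta.appendOnly' = 'true');

-- Inspect the properties currently set on the table.
SHOW TBLPROPERTIES main.sales.orders;
```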
Delta table properties
Available Delta table properties include the following:
Property |
---|
See Delta table properties reference. Data type: Default: |
See Auto compaction for Delta Lake on Databricks. Data type: Default: (none) |
See Optimized writes for Delta Lake on Databricks. Data type: Default: (none) |
See Manage column-level statistics in checkpoints. Data type: Default: |
See Manage column-level statistics in checkpoints. Data type: Default: (none) |
See Compatibility for tables with liquid clustering. Data type: Default: |
Whether column mapping is enabled for Delta table columns and the corresponding Parquet columns that use different names. See Rename and drop columns with Delta Lake column mapping. Note: Enabling Data type: Default: |
The number of columns for Delta Lake to collect statistics about for data skipping. A value of See Data skipping for Delta Lake. Data type: Default: |
A comma-separated list of column names on which Delta Lake collects statistics to enhance data skipping functionality. This property takes precedence over See Data skipping for Delta Lake. Data type: Default: (none) |
The shortest duration for Delta Lake to keep logically deleted data files before deleting them physically. This is to prevent failures in stale readers after compactions or partition overwrites. This value should be large enough to ensure that:
See Configure data retention for time travel queries. Data type: Default: |
Data type: Default: |
See What are deletion vectors?. Data type: |
The degree to which a transaction must be isolated from modifications made by concurrent transactions. Valid values are See Isolation levels and write conflicts on Databricks. Data type: Default: |
How long the history for a Delta table is kept. Each time a checkpoint is written, Delta Lake automatically cleans up log entries older than the retention interval. If you set this property to a large enough value, many log entries are retained. This should not impact performance as operations against the log are constant time. Operations on history are parallel but will become more expensive as the log size increases. See Configure data retention for time travel queries. Data type: Default: |
The minimum protocol reader version that a reader must support to read from this Delta table. Databricks recommends against manually configuring this property. See How does Databricks manage Delta Lake feature compatibility?. Data type: Default: |
The minimum protocol writer version that a writer must support to write to this Delta table. Databricks recommends against manually configuring this property. See How does Databricks manage Delta Lake feature compatibility?. Data type: Default: |
Data type: Default: |
When Data type: Default: |
The shortest duration within which new snapshots will retain transaction identifiers (for example, Data type: Default: (none) |
The target file size in bytes or higher units for file tuning. For example, See Configure Delta Lake to control data file size. Data type: Default: (none) |
See Configure Delta Lake to control data file size. Data type: Default: (none) |
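Table properties can also be supplied when a table is created. The sketch below sets the two retention-related properties described in the table above. The property names delta.logRetentionDuration and delta.deletedFileRetentionDuration are assumptions here, because the property names do not appear in the table as shown. The table name, columns, and interval values are hypothetical and are not recommendations.

```sql
-- Hypothetical table with explicit retention settings at creation time.
CREATE TABLE main.sales.orders_archive (
  order_id BIGINT,
  order_ts TIMESTAMP
)
USING DELTA
TBLPROPERTIES (
  -- How long the table history (transaction log entries) is kept.
  'delta.logRetentionDuration' = 'interval 60 days',
  -- Shortest time logically deleted data files are kept before physical deletion.
  'delta.deletedFileRetentionDuration' = 'interval 14 days'
);
```

Setting properties at creation time avoids a later SET TBLPROPERTIES operation, which would conflict with concurrent writes on the table.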