Variant support in Delta Lake

Preview

This feature is in Public Preview.

You can use the VARIANT data type to store semi-structured data in Delta Lake. For examples of working with VARIANT, see Query variant data.

You must use Databricks Runtime 15.3 or above to read and write tables with variant support enabled.

Enable variant on a Delta table

To enable variant, create a new table with a column of type VARIANT. For example:

CREATE TABLE table_name (variant_column VARIANT)
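
As a minimal sketch of writing to such a table, the following statement inserts one row by parsing a JSON string with PARSE_JSON. The JSON payload and its field names are invented for illustration; table_name and variant_column are the placeholders from the statement above.

```sql
-- Illustrative only: the JSON document and its field names are assumptions.
-- PARSE_JSON converts a JSON string into a VARIANT value.
INSERT INTO table_name
SELECT PARSE_JSON('{"device": "sensor-1", "reading": {"temp": 21.5}}');
```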

You can also enable support for VARIANT on an existing table using the following syntax:

ALTER TABLE table_name SET TBLPROPERTIES('delta.feature.variantType-preview' = 'supported')

Warning

When you enable variant, the table protocol is upgraded. After the upgrade, Delta Lake clients that do not support variant cannot read the table. See How does Databricks manage Delta Lake feature compatibility?
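
To see the effect of the upgrade, you can inspect the table before sharing it with other clients. The statements below are a sketch that reuses the placeholder table_name; the exact properties and columns returned may differ between runtime versions.

```sql
-- List the table properties. After variant is enabled, the feature property
-- is expected to appear here together with the reader and writer protocol versions.
SHOW TBLPROPERTIES table_name;

-- Show table-level details, which include protocol version information.
DESCRIBE DETAIL table_name;
```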

Limitations

The following limitations exist:

  • You cannot use variant columns to partition a table.

  • A variant column cannot be a clustering key for a table.

  • You cannot use variant columns with GROUP BY or ORDER BY clauses.

  • You cannot call DISTINCT on a variant column.

  • You cannot use SQL set operators (INTERSECT, UNION, EXCEPT) with variant columns.

  • You cannot use column generation to create a variant column.

  • Delta does not collect minValues or maxValues statistics for variant columns.
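
One way to work within the grouping and ordering limitations is to extract a field from the variant and cast it to a concrete type, so that the query groups and sorts on an ordinary typed expression rather than on the variant column itself. The sketch below assumes the variant values contain a top-level device field, which is hypothetical, and uses the : path and :: cast syntax. Treat it as an illustration, not a recommended pattern.

```sql
-- Assumption: each variant value has a top-level "device" field (hypothetical).
-- The cast to STRING yields a regular column that GROUP BY and ORDER BY accept.
SELECT
  variant_column:device::STRING AS device,
  COUNT(*) AS row_count
FROM table_name
GROUP BY variant_column:device::STRING
ORDER BY device;
```

Rows where the field is missing produce NULL for the extracted value and are grouped together.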