Column mapping on Databricks

Preview

This feature is in Public Preview.

Databricks supports column mapping for Delta Lake tables, which allows table columns and the corresponding Parquet columns to use different names. Column mapping enables Delta schema evolution operations such as RENAME COLUMN on a Delta table without rewriting the underlying Parquet files. It also lets you use characters in Delta table column names that Parquet does not allow, such as spaces, so you can ingest CSV or JSON data directly into Delta without first renaming columns to satisfy those character constraints.

Requirements

  • Databricks Runtime 10.2 or above.

  • Column mapping requires the Delta table version to be reader version 2 and writer version 5. For a Delta table that already has the required table version, you can enable column mapping by setting delta.columnMapping.mode to name. To upgrade the table version and enable column mapping in one step, use a single ALTER TABLE command:

    ALTER TABLE <table_name> SET TBLPROPERTIES (
      'delta.minReaderVersion' = '2',
      'delta.minWriterVersion' = '5',
      'delta.columnMapping.mode' = 'name'
    )
    

    Note

    After you set these properties on the table, you can read from and write to this Delta table only by using Databricks Runtime 10.2 and above.
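To check that the properties took effect, you can inspect the table properties after running the command. The following is a minimal sketch, assuming a hypothetical table named events; the table name is illustrative only:

```sql
-- Hypothetical table name: events.
-- List all properties currently set on the table.
SHOW TBLPROPERTIES events;

-- Or look up only the column mapping mode.
SHOW TBLPROPERTIES events ('delta.columnMapping.mode');
```

If column mapping is enabled, the second statement should report the value name.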

Supported characters in column names

When column mapping is enabled for a Delta table, you can include spaces as well as any of the following characters in the table’s column names: ,;{}()\n\t= (where \n and \t stand for the newline and tab characters).
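As an illustration, the sketch below adds and then queries columns whose names contain spaces and parentheses. It assumes a hypothetical table named events that already has column mapping enabled; the table and column names are invented for the example. Such names are quoted with backticks in SQL:

```sql
-- Hypothetical table: events, with column mapping already enabled.
-- Column names containing spaces or special characters are quoted with backticks.
ALTER TABLE events ADD COLUMNS (
  `user name` STRING,
  `amount (usd)` DOUBLE
);

-- The same backtick quoting applies when reading the columns.
SELECT `user name`, `amount (usd)` FROM events;
```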

Rename a column

Note

Available in Databricks Runtime 10.2 and above.

When column mapping is enabled for a Delta table, you can rename a column:

ALTER TABLE <table_name> RENAME COLUMN old_col_name TO new_col_name
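A concrete version of the statement above might look like the following sketch. The table name events and both column names are hypothetical, and the table is assumed to already have column mapping enabled:

```sql
-- Hypothetical names; events must already have column mapping enabled.
-- The new name contains a space, so it is quoted with backticks.
ALTER TABLE events RENAME COLUMN event_ts TO `event timestamp`;
```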

For more examples, see Update Delta Lake table schema.

Drop columns

Note

Available in Databricks Runtime 11.0 and above.

When column mapping is enabled for a Delta table, you can drop one or more columns:

ALTER TABLE <table_name> DROP COLUMN col_name
ALTER TABLE <table_name> DROP COLUMNS (col_name_1, col_name_2, ...)
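For instance, with a hypothetical table named events that has column mapping enabled (all names here are invented for illustration), the two forms could be used as follows:

```sql
-- Hypothetical names; events must already have column mapping enabled.
-- Drop a single column (backticks are needed because of the space and parentheses).
ALTER TABLE events DROP COLUMN `amount (usd)`;

-- Drop several columns in one statement.
ALTER TABLE events DROP COLUMNS (session_id, referrer);
```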

For more details, see Update Delta Lake table schema.