REORG TABLE

Applies to: Databricks SQL, Databricks Runtime 11.0 and above

Reorganize a Delta Lake table by rewriting files to purge soft-deleted data, such as the column data dropped by ALTER TABLE DROP COLUMN.

Syntax

REORG TABLE table_name { [ WHERE predicate ] APPLY ( PURGE ) |
                         APPLY ( UPGRADE UNIFORM ( ICEBERG_COMPAT_VERSION = version ) ) }

Note

  • APPLY (PURGE) only rewrites files that contain soft-deleted data.

  • APPLY (UPGRADE) may rewrite all files.

  • REORG TABLE is idempotent: if it is run twice on the same, unchanged table, the second run has no effect.

  • After running APPLY (PURGE), the soft-deleted data may still exist in the old files. You can run VACUUM to physically delete the old files.
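
The notes above can be sketched as one cleanup sequence. This is a minimal illustration, not a prescribed procedure: the table name events is borrowed from the examples in this article, the column device_id is hypothetical, and the sketch assumes the table already meets the requirements for ALTER TABLE DROP COLUMN (for example, that column mapping is enabled).

```sql
-- Hypothetical column; DROP COLUMN is a metadata-only change, so the
-- column's data remains in the existing data files (soft-deleted).
ALTER TABLE events DROP COLUMN device_id;

-- Rewrite only the files that still contain soft-deleted data.
REORG TABLE events APPLY (PURGE);

-- The pre-rewrite files may still hold the dropped data.
-- VACUUM physically deletes old files that are no longer referenced,
-- subject to the table's retention threshold.
VACUUM events;
```

Because VACUUM honors the retention threshold, files rewritten by REORG TABLE may remain on storage until they are old enough to be removed.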

Parameters

  • table_name

    Identifies an existing Delta table. The name must not include a temporal specification.

  • WHERE predicate

    For APPLY (PURGE), reorganizes the files that match the given partition predicate. Only filters involving partition key attributes are supported.

  • APPLY (PURGE)

    Specifies that the purpose of file rewriting is to purge soft-deleted data. See Purge metadata-only deletes to force data rewrite.

  • APPLY (UPGRADE UNIFORM ( ICEBERG_COMPAT_VERSION = version ))

    Applies to: Databricks Runtime 14.3 and above

    Specifies that the purpose of file rewriting is to upgrade the table to the given Iceberg compatibility version. version must be either 1 or 2.

Examples

> REORG TABLE events APPLY (PURGE);

> REORG TABLE events WHERE date >= '2022-01-01' APPLY (PURGE);

> REORG TABLE events
    WHERE date >= current_timestamp() - INTERVAL '1' DAY
    APPLY (PURGE);

> REORG TABLE events APPLY (UPGRADE UNIFORM(ICEBERG_COMPAT_VERSION=2));