Add AI-generated comments to a table

Preview

This feature is in Public Preview.

As a table owner or a user with permission to modify a table, you can use Catalog Explorer to view and add an AI-generated comment for any table or table column managed by Unity Catalog. Comments are generated by a large language model (LLM) that takes table metadata into account, such as the table schema and column names.

How do AI-generated comments work?

AI-generated comments (also known as AI-generated documentation) provide a quick way to help users discover data managed by Unity Catalog.

Important

AI-generated comments are intended to provide a general description of tables and columns based on the schema. The descriptions are tuned for data in a business and enterprise context, using example schemas from several open datasets across various industries. The model was evaluated with hundreds of simulated samples to verify it avoids generating harmful or inappropriate descriptions.

AI models are not always accurate and comments must be reviewed prior to saving. Databricks strongly recommends human review of AI-generated comments to check for inaccuracies. The model should not be relied on for data classification tasks such as detecting columns with PII.

Users with the USE SCHEMA and SELECT privileges on the table can view comments once they are added.

For information about the models that are used to generate comment suggestions, see Frequently asked questions about AI-generated table comments.

Add AI-generated comments

You must use Catalog Explorer to view suggested comments, edit them, and add them to tables and columns.

Permissions required: You must be the table owner or have the MODIFY privilege on the table to view the AI-suggested comment, edit it, and add it.
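The permission rules in this article can be summarized as two checks. The following Python sketch is illustrative only: the TableAccess class and both function names are hypothetical, not part of any Databricks API, and the sketch models only the rules stated in this article, not the full Unity Catalog privilege model.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class TableAccess:
    """Hypothetical snapshot of one user's access to one table."""
    is_owner: bool = False
    privileges: frozenset = field(default_factory=frozenset)


def can_add_ai_comment(access: TableAccess) -> bool:
    # Rule from this article: table owner, or MODIFY on the table.
    return access.is_owner or "MODIFY" in access.privileges


def can_view_saved_comment(access: TableAccess) -> bool:
    # Rule from this article: both USE SCHEMA and SELECT are needed.
    return {"USE SCHEMA", "SELECT"} <= access.privileges
```

For example, a user who holds only USE SCHEMA and SELECT passes the second check but not the first, so that user can read a saved comment but cannot accept or edit an AI suggestion.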

To add an AI-generated comment to a table:

  1. In your Databricks workspace, click Catalog.

  2. Search or browse for the table and select it.

  3. View the AI Suggested Comment field below the Tags field.

    The AI might take a moment to generate the comment.

  4. Click Accept to accept the comment as-is, or Edit to modify it before you save it.

To add an AI-generated comment to a column:

  1. In your Databricks workspace, click Catalog.

  2. Search or browse for the table and select it.

  3. On the Columns tab, click the AI generate button.

    A comment is generated for each column.

  4. To accept a column comment, click the check mark next to it. To discard the comment, close it without saving.

The table owner or a user with the MODIFY privilege on the table can update table and column comments at any time, using the Catalog Explorer UI or SQL commands (ALTER TABLE or COMMENT ON).
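If you update comments through SQL rather than the UI, the statements might look like the output of the Python sketch below. It only builds statement text; how you run it (for example, in a notebook or SQL editor) is up to you. The helper names and the example names main, sales, and orders are hypothetical. The sketch assumes backtick-quoted identifiers and backslash-escaped string literals, so verify the escaping rules against your SQL settings before relying on it.

```python
def quote_identifier(name: str) -> str:
    """Wrap one identifier part in backticks, doubling any embedded backtick."""
    return "`" + name.replace("`", "``") + "`"


def quote_string(text: str) -> str:
    """Render a string literal, backslash-escaping backslashes and single quotes."""
    escaped = text.replace("\\", "\\\\").replace("'", "\\'")
    return "'" + escaped + "'"


def qualified_name(catalog: str, schema: str, table: str) -> str:
    """Build a three-level name: catalog.schema.table."""
    return ".".join(quote_identifier(part) for part in (catalog, schema, table))


def table_comment_sql(catalog: str, schema: str, table: str, comment: str) -> str:
    """Build a COMMENT ON TABLE statement that sets the table comment."""
    name = qualified_name(catalog, schema, table)
    return f"COMMENT ON TABLE {name} IS {quote_string(comment)}"


def column_comment_sql(
    catalog: str, schema: str, table: str, column: str, comment: str
) -> str:
    """Build an ALTER TABLE ... ALTER COLUMN ... COMMENT statement."""
    name = qualified_name(catalog, schema, table)
    return (
        f"ALTER TABLE {name} ALTER COLUMN {quote_identifier(column)} "
        f"COMMENT {quote_string(comment)}"
    )


if __name__ == "__main__":
    print(table_comment_sql("main", "sales", "orders", "One row per customer order."))
    print(column_comment_sql("main", "sales", "orders", "order_id", "Unique order ID."))
```

Quoting every identifier part separately keeps names that contain spaces or reserved words valid, at the cost of slightly noisier statements.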

Frequently asked questions about AI-generated table comments

This section provides general information about AI-generated table comments (also known as AI-generated documentation) in the form of frequently asked questions.

What services does the AI-generated documentation feature use?

In workspaces enabled for HIPAA compliance, AI-generated comments may use external model partners to provide responses.

For all other workspaces on GCP, AI-generated comments use an internal large language model (LLM).

Whether the model is internal or external, data sent to these models is not used for model training. The models themselves are stateless: no prompts or completions are stored by model providers.

What regions are model-serving endpoints hosted in?

European Union (EU) data stays in the EU. For external partner models, EU workspaces use an external model hosted in the EU, and workspaces in all other regions use an external model hosted in the US. For internal Databricks models, EU workspaces use a model hosted in eu-west-1, and all other traffic is sent to the us-west-2 region during the Public Preview.
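The routing described above reduces to a two-by-two lookup. The Python sketch below is only a restatement of that paragraph. The function name and its parameters are hypothetical, and the return values "EU" and "US" are labels, because this article does not name specific regions for external partner models.

```python
def model_hosting_location(workspace_in_eu: bool, external_model: bool) -> str:
    """Return where the model serving a request is hosted, per the rules above."""
    if external_model:
        # External partner models: EU workspaces stay in the EU, others use the US.
        return "EU" if workspace_in_eu else "US"
    # Internal Databricks models during the Public Preview.
    return "eu-west-1" if workspace_in_eu else "us-west-2"
```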

How is data encrypted between Databricks and external model partners?

Traffic between Databricks and external model partners is encrypted in transit using industry-standard TLS 1.2 encryption.

Is everything encrypted at rest?

Any data stored within a Databricks workspace is encrypted with 256-bit AES (AES-256). Our external partners do not store any prompts or completions sent to them.

What data is sent to the models?

Databricks sends the following metadata to the models with each API request:

  • Table schema (catalog name, schema name, table name, current comment)

  • Column names (column name, type, primary key or not, current column comment)
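To make the scope of this metadata concrete, the Python sketch below models the fields listed above as plain data classes. The actual request format is not described in this article, so every class, field, and function name here is a hypothetical stand-in. The point is only that the listed fields are names, types, key flags, and existing comments.

```python
from dataclasses import asdict, dataclass, field
from typing import List, Optional


@dataclass
class ColumnMetadata:
    """Hypothetical shape for the per-column fields listed above."""
    name: str
    type: str
    is_primary_key: bool
    comment: Optional[str] = None  # current column comment, if any


@dataclass
class TableMetadata:
    """Hypothetical shape for the table-level fields listed above."""
    catalog_name: str
    schema_name: str
    table_name: str
    comment: Optional[str] = None  # current table comment, if any
    columns: List[ColumnMetadata] = field(default_factory=list)


def build_request_metadata(table: TableMetadata) -> dict:
    """Convert the metadata to a plain dict, nested column entries included."""
    return asdict(table)
```

Under these assumptions, a table with one primary-key column would produce a dict with five top-level keys, where `columns` holds one entry per column.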

Approved table or column comments are stored in the Databricks control plane database, along with the rest of the Unity Catalog metadata. The control plane database is encrypted with 256-bit AES (AES-256).