Add AI-generated comments to Unity Catalog objects

This article introduces AI-generated comments for Unity Catalog objects and table columns (also known as AI-generated documentation), explains how they work, shows how to add and edit them, and answers frequently asked questions.

Important

Saving comments triggers an ALTER SQL command, which can disrupt Databricks pipelines and jobs.

Supported objects

AI-generated comments are supported for the following Unity Catalog objects:

Catalogs
Schemas
Tables
Table columns
Functions
Models
Volumes

AI-generated comments do not support views or materialized views.

How do AI-generated comments work?

As an object owner or a user with permission to modify an object, you can use Catalog Explorer to view and add an AI-generated comment for objects and table columns managed by Unity Catalog. Comments are powered by a large language model (LLM) that takes into account object metadata, such as the table schema and column names.

AI-generated comments provide a quick way to help users discover data managed by Unity Catalog.

Important

AI-generated comments are intended to provide a general description of objects and table columns based on the schema. The descriptions are tuned for data in a business and enterprise context, using example schemas from several open datasets across various industries. The model was evaluated with hundreds of simulated samples to verify it avoids generating harmful or inappropriate descriptions.

AI models are not always accurate, and comments must be reviewed before you save them. Databricks strongly recommends human review of AI-generated comments to check for inaccuracies. Do not rely on the model for data classification tasks such as detecting columns that contain PII.

To view comments once they are added, you must have the SELECT privilege on the object, USE SCHEMA on the parent schema, and USE CATALOG on the parent catalog.
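As a rough illustration of the three privileges listed above, the following Python sketch builds the corresponding Unity Catalog GRANT statements for one table. The helper names (`quote_ident`, `view_comment_grants`) and the example catalog, schema, table, and principal names are hypothetical. The statements are only generated as strings here; you would run them yourself in a SQL editor or notebook.

```python
from typing import List


def quote_ident(name: str) -> str:
    """Backtick-quote one identifier part, doubling any embedded backticks."""
    return "`" + name.replace("`", "``") + "`"


def view_comment_grants(catalog: str, schema: str, table: str, principal: str) -> List[str]:
    """Return GRANT statements for the privileges needed to view a table's comments."""
    cat = quote_ident(catalog)
    sch = f"{cat}.{quote_ident(schema)}"
    tbl = f"{sch}.{quote_ident(table)}"
    who = quote_ident(principal)
    return [
        f"GRANT USE CATALOG ON CATALOG {cat} TO {who}",
        f"GRANT USE SCHEMA ON SCHEMA {sch} TO {who}",
        f"GRANT SELECT ON TABLE {tbl} TO {who}",
    ]


# Hypothetical names, for illustration only.
for statement in view_comment_grants("main", "sales", "orders", "analysts"):
    print(statement)
```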

For information about the models that are used to generate comment suggestions, see Frequently asked questions about AI-generated comments.

Before you begin

Before you can use AI-generated comments, your workspace must be enabled for AI assistive features. This is enabled by default. If it’s not enabled, a workspace admin must do the following:

In Settings, go to the Advanced tab and scroll down to the Other section.
Turn on the Partner-powered AI assistive features option.

Add AI-generated comments

You must use Catalog Explorer to view suggested comments, edit them, and add them to objects and table columns.

Permissions required: You must be the object owner or have the MODIFY privilege on the object to view the AI-suggested comment, edit it, and add it.

Add an AI-suggested comment to an object

In your Databricks workspace, click Catalog.
Search or browse for the object and select it.
In the About this <object> panel, click AI generate.

The AI might take a moment to generate the comment.
Click Accept to accept the comment as-is, or Edit to modify it before you save it.

Add an AI-suggested comment to a table column

In your Databricks workspace, click Catalog.
Search or browse for the table and select it.
Above the table column headings, click AI generate.

A comment is generated for each column.
Click the check mark next to a column comment to accept it, or close the suggestion to discard it without saving.

Update an AI-generated comment

The object owner or a user with the MODIFY privilege on the object can update comments at any time using Catalog Explorer. The inline chat assistant helps you edit comments and provides options to Shorten text or Translate text to a different language.

You can also use ALTER or COMMENT ON SQL commands.
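The following Python sketch shows what such statements might look like for a table comment and a column comment. The helper names and the example object names are hypothetical, and the string escaping assumes the default backslash-escape behavior for SQL string literals. The code only builds the statement text. Running the statements still requires ownership or the MODIFY privilege on the object.

```python
def quote_ident(name: str) -> str:
    """Backtick-quote one identifier part, doubling any embedded backticks."""
    return "`" + name.replace("`", "``") + "`"


def sql_string(text: str) -> str:
    """Single-quote a comment, backslash-escaping backslashes and quotes (assumed default escaping)."""
    escaped = text.replace("\\", "\\\\").replace("'", "\\'")
    return "'" + escaped + "'"


def qualified(*parts: str) -> str:
    """Join identifier parts into a quoted, dot-separated name."""
    return ".".join(quote_ident(part) for part in parts)


def comment_on_table(catalog: str, schema: str, table: str, comment: str) -> str:
    """Build a COMMENT ON TABLE statement."""
    return f"COMMENT ON TABLE {qualified(catalog, schema, table)} IS {sql_string(comment)}"


def comment_on_column(catalog: str, schema: str, table: str, column: str, comment: str) -> str:
    """Build an ALTER TABLE ... ALTER COLUMN ... COMMENT statement."""
    return (
        f"ALTER TABLE {qualified(catalog, schema, table)} "
        f"ALTER COLUMN {quote_ident(column)} COMMENT {sql_string(comment)}"
    )


# Hypothetical names, for illustration only.
print(comment_on_table("main", "sales", "orders", "One row per customer order."))
print(comment_on_column("main", "sales", "orders", "order_ts", "Time the order was placed (UTC)."))
```

As noted at the start of this article, these are ALTER-style metadata changes, so consider whether pipelines or jobs are using the object before you run them.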

Frequently asked questions about AI-generated comments

This section provides general information about AI-generated comments (also known as AI-generated documentation) in the form of frequently asked questions.

What services does the AI-generated documentation feature use?

In workspaces enabled for HIPAA compliance, AI-generated comments might use external model partners to provide responses.

For all other workspaces on GCP, AI-generated comments use an internal large language model (LLM) for tables and columns. They might use external model partners for other Unity Catalog objects and the inline assistant.

Whether the model is internal or external, data sent to these models is not used for model training. The models themselves are stateless: no prompts or completions are stored by model providers.

What regions are model-serving endpoints hosted in?

Data from European Union (EU) workspaces stays in the EU. For external partner models, EU workspaces use a model hosted in the EU, and workspaces in all other regions use a model hosted in the US. For internal Databricks models, EU workspaces use a model hosted in eu-west-1, and all other traffic is sent to the us-west-2 region during the Public Preview.
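The routing rules in the preceding paragraph can be summarized as a small lookup. This Python sketch is only a restatement of those rules for readability. The function name and its two boolean inputs are hypothetical, and it does not represent how Databricks implements routing.

```python
def model_hosting_location(workspace_in_eu: bool, external_model: bool) -> str:
    """Return where the model is hosted, per the rules described in this FAQ answer."""
    if external_model:
        # External partner models: EU workspaces stay in the EU, all others use the US.
        return "EU" if workspace_in_eu else "US"
    # Internal Databricks models: named regions (us-west-2 applies during the Public Preview).
    return "eu-west-1" if workspace_in_eu else "us-west-2"


for in_eu in (True, False):
    for external in (True, False):
        print(in_eu, external, model_hosting_location(in_eu, external))
```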

How is data encrypted between Databricks and external model partners?

Traffic between Databricks and external model partners is encrypted in transit using industry-standard TLS 1.2 encryption.

Is everything encrypted at rest?

Any data stored within a Databricks workspace is encrypted using AES-256. External model partners do not store any prompts or completions sent to them.

What data is sent to the models?

Databricks sends the following metadata to the models with each API request:

Catalog (catalog name, current comment, catalog type)
Schema (catalog name, schema name, current comment)
Table (catalog name, schema name, table name, current comment)
Function (catalog name, schema name, function name, current comment, parameters, definition)
Model (catalog name, schema name, model name, current comment, aliases)
Volume (catalog name, schema name, volume name, current comment)
Column (column name, type, whether it is a primary key, current column comment)
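To make the list above easier to scan, the following Python sketch models it as a mapping from object type to the fields that are sent, plus a helper that keeps only those fields. The key names (such as `catalog_name` and `is_primary_key`) and the function name are hypothetical. The actual request format is not described in this article.

```python
from typing import Any, Dict, Tuple

# Fields sent per object type, taken from the list in this article.
# The key names are illustrative assumptions, not a documented schema.
SENT_FIELDS: Dict[str, Tuple[str, ...]] = {
    "catalog": ("catalog_name", "comment", "catalog_type"),
    "schema": ("catalog_name", "schema_name", "comment"),
    "table": ("catalog_name", "schema_name", "table_name", "comment"),
    "function": ("catalog_name", "schema_name", "function_name", "comment",
                 "parameters", "definition"),
    "model": ("catalog_name", "schema_name", "model_name", "comment", "aliases"),
    "volume": ("catalog_name", "schema_name", "volume_name", "comment"),
    "column": ("column_name", "type", "is_primary_key", "comment"),
}


def metadata_for_request(object_type: str, metadata: Dict[str, Any]) -> Dict[str, Any]:
    """Keep only the fields listed for this object type and drop everything else."""
    allowed = SENT_FIELDS[object_type]
    return {key: metadata[key] for key in allowed if key in metadata}


# Hypothetical table metadata; "owner" is not in the list, so it is dropped.
example = {
    "catalog_name": "main",
    "schema_name": "sales",
    "table_name": "orders",
    "comment": "",
    "owner": "alice",
}
print(metadata_for_request("table", example))
```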

Approved comments are stored in the Databricks control plane database, along with the rest of the Unity Catalog metadata. The control plane database is encrypted using AES-256.

What legal terms govern the use of AI-generated comments?

Usage is governed by the existing Databricks terms and conditions the customer has agreed to when using Databricks.