What is Databricks Assistant?

Preview

This feature is in Public Preview.

Databricks Assistant works as an AI-based companion pair-programmer and support agent to make you more efficient as you create notebooks, queries, and files. It can help you rapidly answer questions by generating, optimizing, completing, explaining, and fixing code and queries.

This page provides general information about the Assistant in the form of frequently asked questions. For questions about privacy and security, see Privacy and security.

Enable or disable Databricks Assistant

Databricks Assistant is enabled by default. You can manage enablement for all workspaces in an account or individual workspaces.

Enablement of the Databricks Assistant for your account is captured as an account event in your audit logs. See Account events.

Manage the account setting

To enable or disable Databricks Assistant for all workspaces in an account, follow these instructions:

  1. As an account admin, log in to the account console.

  2. Click Settings.

  3. Click the Advanced tab.

  4. From the Other > Partner-powered AI assistive features section, select Enabled or Disabled, and then click Save.

Manage the workspace setting

If the account setting permits workspace setting overrides, workspace admins can enable or disable the Assistant for specific workspaces. To do this, use a workspace setting to override the default set in the account console, as follows:

  1. Go to the workspace settings page.

  2. Under User, click the Developer tab.

  3. Under Experimental features > New Assistant, move the toggle to Off.
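The relationship between the account setting and the workspace setting can be sketched as a small function. This is only an illustrative model of the rule described above, not a Databricks API: the function name and all three parameters are hypothetical, and the sketch assumes a workspace choice takes effect only when the account permits overrides.

```python
from typing import Optional


def assistant_enabled(
    account_enabled: bool,
    allow_workspace_override: bool,
    workspace_setting: Optional[bool] = None,
) -> bool:
    """Return the effective Assistant state for one workspace.

    All names here are hypothetical; they only model the rule in the text.
    """
    # A workspace-level choice counts only when the account permits overrides
    # and the workspace admin has actually set one.
    if allow_workspace_override and workspace_setting is not None:
        return workspace_setting
    # Otherwise the account-level setting applies to the workspace.
    return account_enabled
```

For example, under these assumptions `assistant_enabled(True, True, False)` evaluates to `False`, because the workspace admin turned the Assistant off and the account allows that override.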

Get coding help from Databricks Assistant

To access Databricks Assistant, click the Assistant icon in the left sidebar of the notebook, the file editor, or the SQL Editor.

The Assistant pane can open on the left or right side of the screen.

Some capabilities of Databricks Assistant are the following:

  • Generate: Use natural language to generate a SQL query.

  • Explain: Highlight a query or a block of code and have Databricks Assistant walk through the logic in clear, concise English.

  • Fix: Explain and fix syntax and runtime errors with a single click.

  • Transform and optimize: Convert Pandas code to PySpark for faster execution.

Any code generated by the Databricks Assistant is intended to run in a Databricks compute environment. The Assistant is optimized to create code in Databricks-supported programming languages, frameworks, and dialects, and is not intended to be a general-purpose programming assistant. It often uses information from Databricks resources, such as the Databricks Documentation website or Knowledge Base, to better answer user queries. It performs best when the question can be answered with knowledge from Databricks documentation, Unity Catalog, and user code in the workspace.

Users should always review any code generated by the Assistant before running it because it can sometimes make mistakes.

Create data visualizations using the Databricks Assistant

You can use the Databricks Assistant when drafting dashboards. As you create visualizations on an existing dashboard dataset, prompt the Assistant with questions to receive responses in the form of generated charts. To use the Assistant in a dashboard, first create one or more datasets, then add a visualization widget to the canvas. The visualization widget includes a prompt to describe your new chart. Type a description of the chart you want to see, and the Assistant generates it. You can approve or reject the chart, or modify the description to generate something new.

For details and examples of using the Assistant with dashboards, see Create visualizations with Databricks Assistant.

Services used by Databricks Assistant

Databricks Assistant might use third-party services to provide responses, including Azure OpenAI operated by Microsoft.

These services are subject to their respective data management policies. Data sent to these services is not used for any model training. For details, see Azure data management policy.

For Azure OpenAI, Databricks has opted out of Abuse Monitoring so no prompts or responses are stored with Azure OpenAI.

Tips for improving the accuracy of results

  • Use the prompt “Find Tables” for better responses. Before you ask questions about data in a table, ask the Assistant to find related tables by subject matter or other characteristics. Example: Find tables related to NFL games.

  • Specify the structure of the response you want. The structure and detail that Databricks Assistant provides vary, even for the same prompt. Databricks Assistant knows about your table and column schema and metadata, so you can use natural language to ask your question. Example: List active and retired NFL quarterbacks' passing completion rate, for those who had over 500 attempts in a season. The Assistant answers using data from columns such as s.player_id and s.attempts.

  • Provide examples of your row-level data values. Databricks Assistant doesn’t have access to row-level data, so provide examples of the data to get more accurate answers. Example: List the average height for each position in inches. This prompt returns an error because the data set stores height in feet and inches, as in 6-2.

  • Test code snippets by running them in the Assistant pane. Use the Assistant pane as a scratchpad that saves iterations of your queries and assistant answers. You can run code and edit it in the pane until you are ready to add it to a notebook.

  • Use cell actions in a notebook. Cell actions include shortcuts to common tasks, such as documenting (commenting), fixing, and explaining code. For example, the `/doc` cell action prompts the Assistant to comment the code.

For fully illustrated examples, see 5 tips for Databricks Assistant.

Databricks Assistant considers the history of the conversation so you can refine your questions as you go.
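The row-level data tip above mentions heights stored in a feet-and-inches format such as 6-2. Once that format is known, the conversion the averaging question needs is simple. The following is a minimal sketch in plain Python of that conversion, not output from the Assistant; the function name is hypothetical and the sample values are made up to match the format in the text.

```python
def height_to_inches(height: str) -> int:
    """Convert a 'feet-inches' string such as '6-2' to total inches."""
    # Split "6-2" into its feet and inches parts.
    feet_text, inches_text = height.strip().split("-")
    feet, inches = int(feet_text), int(inches_text)
    # Reject values that do not look like a valid feet-inches pair.
    if feet < 0 or not 0 <= inches < 12:
        raise ValueError(f"unexpected height value: {height!r}")
    return feet * 12 + inches


# Made-up sample values in the same format as the example in the text.
sample_heights = ["6-2", "5-11", "6-0"]
average_inches = sum(height_to_inches(h) for h in sample_heights) / len(sample_heights)
```

Including one such sample value in the prompt gives the Assistant the information it needs to generate an equivalent conversion in SQL or PySpark.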

Give feedback

The best way to send feedback is to use the Provide Feedback links in the notebook and SQL editor. You can also send an email to assistant-feedback@databricks.com or to your account team.

Share product improvement suggestions and user experience issues rather than feedback about prompt accuracy. If you receive an unhelpful suggestion from the Assistant, click the “Not useful” thumbs-down button.

Privacy and security

Q: What data is being sent to the models?

Databricks Assistant sends code and metadata to the models on each API request. This helps return more relevant results for your data. Examples include:

  • Code/queries in the current notebook cell or SQL Editor tab

  • Table and column names and descriptions

  • Previous questions

  • Favorite tables

Q: Does the metadata sent to the models respect the user’s Unity Catalog permissions?

Yes, all of the data sent to the model respects the user’s Unity Catalog permissions, so the Assistant does not send metadata relating to tables that the user does not have permission to see.

Q: If I execute a query with results, and then ask a question, do the results of my query get sent to the model?

No, only the code contents in cells, metadata about tables, and the user-entered text are shared with the model. For the “fix error” feature, Databricks also shares the stack trace from the error output.
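The answers so far describe what is and is not shared: code, permitted table metadata, previous questions, favorite tables, and (for “fix error”) a stack trace, but never query results. The sketch below models that description as a data structure. It is purely illustrative and does not reflect how Databricks implements the Assistant; every class, field, and function name is hypothetical, and the table names are made up.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List, Set


@dataclass
class TableMetadata:
    """Hypothetical table metadata; it holds no row-level data."""
    name: str
    description: str
    column_descriptions: Dict[str, str] = field(default_factory=dict)


@dataclass
class AssistantContext:
    """Hypothetical request context. There is deliberately no query-results field."""
    code: str
    question: str
    tables: List[TableMetadata]
    previous_questions: List[str]
    favorite_tables: List[str]
    stack_trace: str = ""  # populated only for the "fix error" feature


def build_context(
    code: str,
    question: str,
    tables: Iterable[TableMetadata],
    visible_table_names: Set[str],
    previous_questions: Iterable[str] = (),
    favorite_tables: Iterable[str] = (),
    stack_trace: str = "",
) -> AssistantContext:
    """Assemble the context, dropping anything the user is not permitted to see."""
    # Keep only metadata for tables the user has permission to see.
    visible = [t for t in tables if t.name in visible_table_names]
    # Apply the same permission check to favorite tables.
    favorites = [n for n in favorite_tables if n in visible_table_names]
    return AssistantContext(
        code=code,
        question=question,
        tables=visible,
        previous_questions=list(previous_questions),
        favorite_tables=favorites,
        stack_trace=stack_trace,
    )


# Made-up example: the user can see only one of the two tables.
all_tables = [
    TableMetadata("sports.nfl_games", "One row per game"),
    TableMetadata("hr.salaries", "Restricted payroll data"),
]
context = build_context(
    code="SELECT * FROM sports.nfl_games",
    question="Explain this query",
    tables=all_tables,
    visible_table_names={"sports.nfl_games"},
    favorite_tables=["hr.salaries", "sports.nfl_games"],
)
```

In this model, the restricted table is dropped from both the table metadata and the favorites before anything is assembled, which mirrors the permission behavior described in the answers above.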

Q: Will Databricks Assistant execute dangerous code?

No. Databricks Assistant does not automatically run code on your behalf. AI models can make mistakes, misunderstand intent, and hallucinate or give incorrect answers. Review and test AI-generated code before you run it.

Q: Has Databricks done any assessment to evaluate the accuracy and appropriateness of the Assistant responses?

Yes. Databricks has mitigations to prevent the Assistant from generating harmful responses such as hate speech, insecure code, prompt jailbreaks, and third-party copyright content. Databricks has done extensive testing of all our AI assistive features with thousands of simulated user inputs to assess the robustness of mitigations. These assessments focused on the expected use cases for the Assistant such as code generation in the Python, Databricks SQL, R, and Scala languages.