Create and manage dashboard datasets

This article explains how to create and manage dashboard datasets using the dataset editor in an AI/BI dashboard.

Define datasets

To define or access existing datasets, click the Data tab near the upper-left corner of your dashboard. You can define up to 100 datasets per dashboard.

You can define datasets as:

  • A new query against one or more tables or views.

  • An existing Unity Catalog table or view.

Note

All datasets are defined by a query. When you choose an existing Unity Catalog table or view, the query for that dataset is a SELECT * statement on that table or view. You can modify the query to refine the dataset.

After defining a dataset, use the Kebab menu kebab menu to the right of the dataset name to access the following options:

  • Rename: Give your dataset a descriptive name so you and your team can quickly identify the queries you want to edit or review.

  • Suggest name: Automatically generate a suggested name based on the query. You can edit this name after it’s generated.

  • Clone: Create an in-place copy of your query. You can edit the query after it is cloned.

  • Delete: Delete a dataset. This option is unavailable if the dataset is being used on the canvas.

  • Download: You can download the dataset as a CSV, TSV, or Excel file.

    Menu shows the dataset options

View the results table

When you create a dataset, the query runs automatically, and the results appear as a table in the pane below the editor. Each column includes an icon indicating its data type.

To sort column values:

  • Hover over each column to show the Sort icon sort icon.

  • Click the icon to sort the values in ascending or descending order.

View the query result schema

To view the schema of your defined dataset, click Schema to the right of Result Table. The Schema tab lists all fields from the defined dataset. Each field is labeled with an icon denoting the column’s datatype. If the underlying tables or views include comments, they appear in the Schema tab.

Write multi-statement queries

In some cases, you might want to construct a dataset using multiple SQL statements. To run multiple statements, end each statement with a semicolon (;). When you run those commands to create the dataset, the output shows the results of the last statement in the editor. This dataset is used for any related visualizations on the canvas.

Statements are local to the dataset in which they are issued. For example, if you create a dataset with a USE statement to set the catalog or schema, that setting applies only to that dataset.

Examples

The following examples demonstrate common uses for multi-statement queries.

Set the catalog and schema

You can write a USE statement to declare the catalog and schema for the table in your query. The following query contains three statements. The first sets the catalog name. The second sets the schema. The SELECT statement references only the table name because the catalog and schema have been set. See USE CATALOG.

USE CATALOG samples;
USE SCHEMA tpch;
SELECT * FROM customer;

Set ANSI mode

You can set a query to run with ANSI_MODE set to TRUE or FALSE. For Databricks SQL, the system default value for ANSI_MODE is TRUE. See ANSI_MODE.

The following query sets ANSI_MODE to FALSE so that invalid data type inputs return NULL instead of throwing an error.

SET ANSI_MODE = FALSE;
SELECT cast('a' AS INTEGER);

Parameterize complex queries

You can use multiple statements to parameterize the view name for a common table expression or other complex query.

The following query creates two temporary views. The SELECT statement uses the IDENTIFIER clause to interpret the input string as a table name. See IDENTIFIER clause.

CREATE TEMPORARY VIEW v1 AS SELECT 1;
CREATE TEMPORARY VIEW v2 AS SELECT 2;
SELECT * FROM IDENTIFIER(:view_name)

Set variables

The following query declares a variable and value in the first statement. The second statement changes the value of the variable. The third statement shows the end value for the variable is 5. See SET VARIABLE for details and complete syntax for using temporary variables.

DECLARE VARIABLE myvar1 INT DEFAULT 7;
SET VAR myvar1 = 5;
VALUES (myvar1);

Limit data access with SQL

Dashboard viewers can access all data in a dataset, even if it’s not shown in visualizations. To prevent sensitive data from being exposed in the browser, restrict the columns in the SQL query that defines the dataset. For example, instead of selecting all columns from a table, include only the specific fields required for visualizations.