Configure notebook result storage location

Your organization’s privacy policies may require you to store all interactive notebook results in the GCS bucket for system data in your Google Cloud account, rather than in the Databricks-managed control plane, where some notebook command results are stored by default.

Notebook command output is stored differently depending on how you run the notebook.

By default, when you run a notebook interactively by clicking Run in the notebook:

  • If the results are small, they are stored in the Databricks control plane, along with the notebook’s command contents and metadata.

  • Larger results are stored in the workspace’s GCS bucket for system data in your Google Cloud account. Databricks creates this bucket automatically and uses it for workspace system data and your workspace’s DBFS root. Notebook results are stored in workspace system data storage, which users cannot access.

  • Plot images and other binary objects are always stored separately in the FileStore area of the DBFS root.

When you run a notebook as a job, by scheduling it or by clicking Run Now on the Jobs page, all results are stored in the workspace’s GCS bucket for system data in your Google Cloud account.

You can configure your workspace to store all interactive notebook results in your Google Cloud account, regardless of result size.
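The storage rules above can be summarized as a small decision function. This is only an illustrative sketch: result_location and the labels it prints are hypothetical names, not part of any Databricks API, and the small/large distinction stands in for a size threshold that this article does not specify. Plot images and other binary objects are not modeled, since they always go to the FileStore area of the DBFS root.

```shell
# Hypothetical model of the storage rules described above; not a Databricks API.
# Usage: result_location <interactive|job> <small|large> <true|false>
#   $3 is the workspace setting that stores all interactive results in your account.
result_location() {
  local run_mode="$1" result_size="$2" store_in_account="$3"

  if [ "$run_mode" = "job" ]; then
    # Job results always go to the GCS bucket for system data.
    echo "gcs-system-data"
  elif [ "$store_in_account" = "true" ] || [ "$result_size" = "large" ]; then
    # Interactive results: large ones, or all of them once the setting is on.
    echo "gcs-system-data"
  else
    # Small interactive results with the default configuration.
    echo "control-plane"
  fi
}
```

For example, result_location interactive small false prints control-plane, while the same call with true as the last argument prints gcs-system-data.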

Configure the storage location for interactive notebook results

You can configure your workspace to store all interactive notebook results in your Google Cloud account, rather than the control plane. You can enable this feature using the admin settings page or REST API. This configuration has no effect on notebooks run as jobs, whose results are already stored in your Google Cloud account by default.

Keep the following points in mind:

  • Changes to this configuration are effective only for new results. Existing notebook results are not moved.

  • Some metadata about the results, such as chart column names, continues to be stored in the control plane.

  • You may incur increased storage costs with your cloud provider.

  • Increased network and IO latency may occur when reading and writing results.

Store all notebook results in your account using the admin settings page

As a workspace administrator:

  1. Go to the admin settings page.

  2. Click the Security tab.

  3. Click the Store interactive notebook results in customer account toggle.

Store all notebook results in your account using the REST API

To configure your workspace to store all notebook results in your Google Cloud account using the REST API:

  • You must be a workspace administrator.

  • You need a personal access token. The instructions that follow assume that you have configured a .netrc file with your personal access token so that you can use the -n option in curl commands. For details, see the Databricks documentation on personal access token authentication.
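As a rough illustration of the .netrc setup assumed here, the sketch below appends an entry for one workspace to a netrc file. The helper name write_netrc_entry and its arguments are hypothetical. It assumes the usual convention for token authentication with curl -n, where the login is the literal word token and the password is the personal access token.

```shell
# Hypothetical helper: append a Databricks entry to a netrc file.
# Usage: write_netrc_entry <netrc-file> <databricks-instance> <personal-access-token>
write_netrc_entry() {
  local netrc_file="$1" instance="$2" token="$3"

  # Assumed convention: login is the literal string "token",
  # password is the personal access token itself.
  printf 'machine %s\nlogin token\npassword %s\n' "$instance" "$token" >> "$netrc_file"

  # The file holds a credential, so restrict it to the current user.
  chmod 600 "$netrc_file"
}
```

You might call it as write_netrc_entry "$HOME/.netrc" "<databricks-instance>" "<personal-access-token>", replacing both placeholders with your own values.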

To get the current setting, call the GET /workspace-conf endpoint and set the keys query parameter to storeInteractiveNotebookResultsInCustomerAccount:

curl -n --request GET \
  'https://<databricks-instance>/api/2.0/workspace-conf?keys=storeInteractiveNotebookResultsInCustomerAccount'
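If you want to use the current value in a script, a small filter can pull it out of the response. The sketch below assumes the endpoint returns a flat JSON object that maps the key to a string value such as "true" or "false" (or an unquoted null when unset); that response shape is an assumption, not something this article states. The function names extract_setting and get_setting are hypothetical.

```shell
# Hypothetical filter: read a workspace-conf JSON response on stdin and print
# the value of storeInteractiveNotebookResultsInCustomerAccount.
# Assumes a flat JSON object; use a real JSON parser for anything more complex.
extract_setting() {
  sed -n 's/.*"storeInteractiveNotebookResultsInCustomerAccount"[[:space:]]*:[[:space:]]*"\{0,1\}\([a-z]*\)"\{0,1\}.*/\1/p'
}

# Hypothetical wrapper: fetch the setting for the given workspace instance.
# Usage: get_setting <databricks-instance>
get_setting() {
  curl -n --silent --request GET \
    "https://$1/api/2.0/workspace-conf?keys=storeInteractiveNotebookResultsInCustomerAccount" \
    | extract_setting
}
```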

To enable your workspace to store interactive notebook results in your Google Cloud account, call the PATCH /workspace-conf endpoint and set storeInteractiveNotebookResultsInCustomerAccount to true in the request body:

curl -n --request PATCH \
  'https://<databricks-instance>/api/2.0/workspace-conf' \
  --header 'Content-Type: text/plain' \
  --data-raw '{
    "storeInteractiveNotebookResultsInCustomerAccount": "true"
  }'

To disable the feature, set the same flag to false:

curl -n --request PATCH \
  'https://<databricks-instance>/api/2.0/workspace-conf' \
  --header 'Content-Type: text/plain' \
  --data-raw '{
    "storeInteractiveNotebookResultsInCustomerAccount": "false"
  }'
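Because the enable and disable requests differ only in the flag value, they can be wrapped in one helper. The following is a minimal sketch, not an official script: build_payload and set_results_storage are hypothetical names, and the request simply mirrors the curl commands above, including the .netrc assumption behind the -n option.

```shell
# Hypothetical helper: print the request body for the given flag value.
# Accepts only "true" or "false"; anything else is rejected.
build_payload() {
  case "$1" in
    true|false)
      printf '{"storeInteractiveNotebookResultsInCustomerAccount": "%s"}\n' "$1"
      ;;
    *)
      echo "expected true or false, got: $1" >&2
      return 1
      ;;
  esac
}

# Hypothetical wrapper around the PATCH /workspace-conf call shown above.
# Usage: set_results_storage <databricks-instance> <true|false>
set_results_storage() {
  local instance="$1" payload
  payload="$(build_payload "$2")" || return 1
  curl -n --request PATCH \
    "https://${instance}/api/2.0/workspace-conf" \
    --header 'Content-Type: text/plain' \
    --data-raw "$payload"
}
```

Validating the value before sending the request keeps a typo from reaching the workspace configuration. The flag is sent as a quoted string, matching the request bodies above.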