Cluster libraries

Cluster libraries can be used by all notebooks and jobs running on a cluster. This article describes how to use the Install library UI in the Databricks workspace.

You can install libraries on a cluster using the Install library UI or an init script, as described in the following sections.

Install a library on a cluster

To install a library on a cluster:

  1. Click Compute in the sidebar.

  2. Click a cluster name.

  3. Click the Libraries tab.

  4. Click Install New.

  5. The Install library dialog displays.

  6. Select one of the Library Source options, complete the instructions that appear, and then click Install.

Important

Libraries can be installed from DBFS when using Databricks Runtime 14.3 LTS and below. However, any workspace user can modify library files stored in DBFS. To improve the security of libraries in a Databricks workspace, storing library files in the DBFS root is deprecated and disabled by default in Databricks Runtime 15.0 and above. See Storing libraries in DBFS root is deprecated and disabled by default.

Instead, Databricks recommends uploading libraries to workspace files or Unity Catalog volumes, or using library package repositories. If your workload does not support these patterns, you can also use libraries stored in cloud object storage.

Not all cluster access modes support all library configurations. See Cluster-scoped libraries.

Library source: Instructions

Workspace: Select a workspace file or upload a Whl, zipped wheelhouse, JAR, ZIP, tar, or requirements.txt file. See Install libraries from workspace files.

Volumes: Select a Whl, JAR, or requirements.txt file from a volume. See Install libraries from a volume.

File Path/GCS: Select the library type and provide the full URI to the library object (for example: /Workspace/path/to/library.whl, /Volumes/path/to/library.whl, or gs://bucket-name/path/to/library.whl). See Install libraries from object storage.

PyPI: Enter a PyPI package name. See PyPI package.

Maven: Specify a Maven coordinate. See Maven or Spark package.

CRAN: Enter the name of a package. See CRAN package.

DBFS (Not recommended): Load a JAR or Whl file to the DBFS root. This is not recommended, as files stored in DBFS can be modified by any workspace user.
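The File Path/GCS source accepts three kinds of URI, distinguished by their prefix. As a rough illustration, the sketch below classifies a path the way the examples in the table suggest. The helper name library_source_kind is hypothetical and is not part of any Databricks tool.

```shell
#!/bin/bash
# Sketch only: classify a library path by its prefix.
# library_source_kind is a hypothetical helper used for illustration.
library_source_kind() {
  case "$1" in
    /Workspace/*) echo "workspace" ;;  # workspace file
    /Volumes/*)   echo "volume" ;;     # Unity Catalog volume
    gs://*)       echo "gcs" ;;        # cloud object storage
    *)            echo "unknown" ;;    # anything else is not covered here
  esac
}

library_source_kind "/Volumes/path/to/library.whl"   # prints: volume
```

A check like this can catch a mistyped prefix before you paste the URI into the Install library dialog.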

When you install a library on a cluster, a notebook already attached to that cluster will not immediately see the new library. You must first detach and then reattach the notebook to the cluster.
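For scripted setups, the same installation can also be requested through the Databricks Libraries REST API rather than the UI. The following is a minimal sketch, assuming a PyPI library. The helper name build_install_payload, the cluster ID, and the DATABRICKS_HOST and DATABRICKS_TOKEN environment variables are placeholders chosen for illustration and do not come from this article.

```shell
#!/bin/bash
# Sketch only: build the JSON body for POST /api/2.0/libraries/install.
# build_install_payload is a hypothetical helper, not part of any Databricks tool.
build_install_payload() {
  local cluster_id="$1"   # ID of the target cluster
  local package="$2"      # PyPI package spec, for example "astropy"
  printf '{"cluster_id": "%s", "libraries": [{"pypi": {"package": "%s"}}]}' \
    "$cluster_id" "$package"
}

# Send the request only when both variables are set (assumed variable names).
if [ -n "${DATABRICKS_HOST:-}" ] && [ -n "${DATABRICKS_TOKEN:-}" ]; then
  # The cluster ID below is a made-up placeholder.
  curl -sS -X POST "${DATABRICKS_HOST}/api/2.0/libraries/install" \
    -H "Authorization: Bearer ${DATABRICKS_TOKEN}" \
    -H "Content-Type: application/json" \
    -d "$(build_install_payload "1234-567890-abcde123" "astropy")"
fi
```

As with a UI installation, a notebook that is already attached would still need to be detached and reattached before it sees the library.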

Install a library with an init script

If your library requires custom configuration, you may not be able to install it using the workspace or cluster library interface. Instead, you can install the library using an init script.

Here is an example of an init script that uses pip to install Python libraries on a Databricks Runtime cluster at cluster initialization.

#!/bin/bash
# Install astropy into the cluster's Python environment at startup.
/databricks/python/bin/pip install astropy
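If you need more than one package, the same pattern extends naturally. The variant below is a sketch under stated assumptions: it reuses the pip path from the example above, the numpy version pin is a hypothetical addition, and install_packages is an illustrative helper name. The existence check is there so the script does nothing when the Databricks pip binary is absent; you may prefer to fail instead.

```shell
#!/bin/bash
# Sketch only: install several Python libraries at cluster initialization.
# astropy comes from the example above; the numpy pin is a hypothetical addition.

PIP=/databricks/python/bin/pip
PACKAGES="astropy numpy==1.26.4"

install_packages() {
  # $1: path to pip; remaining arguments: package specs
  local pip_bin="$1"
  shift
  "$pip_bin" install "$@"
}

if [ -x "$PIP" ]; then
  # Word splitting on $PACKAGES is intentional: one argument per package spec.
  install_packages "$PIP" $PACKAGES || exit 1
else
  # Outside a Databricks cluster the binary is missing, so only warn.
  echo "pip not found at $PIP; skipping library installation" >&2
fi
```

Pinning versions in an init script keeps the environment the same each time the cluster starts.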

Uninstall a library from a cluster

Note

When you uninstall a library from a cluster, the library is removed only when you restart the cluster. Until you restart the cluster, the status of the uninstalled library appears as Uninstall pending restart.

To uninstall a library using the cluster UI:

  1. Click Compute in the sidebar.

  2. Click a cluster name.

  3. Click the Libraries tab.

  4. Select the checkbox next to the library you want to uninstall, click Uninstall, and then click Confirm. The Status changes to Uninstall pending restart.

  5. Click Restart and Confirm to uninstall the library. The library is removed from the cluster’s Libraries tab.

View the libraries installed on a cluster

  1. Click Compute in the sidebar.

  2. Click the cluster name.

  3. Click the Libraries tab. For each library, the tab displays the name and version, type, install status, and, if uploaded, the source file.

Update a cluster-installed library

To update a cluster-installed library, uninstall the old version of the library and install a new version.

Note

You do not need to uninstall a requirements.txt file or restart the cluster to update it. If you have modified the contents of a requirements.txt file, reinstall it to update the installed contents.
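For reference, a requirements.txt file is a plain list of pip requirement specifiers, one per line. The sketch below writes a small example file. The output directory, the REQ_DIR and REQ_FILE variable names, and the version pins are hypothetical and only illustrate the format.

```shell
#!/bin/bash
# Sketch only: write a requirements.txt with pinned versions.
# The output directory and the version pins are hypothetical.
REQ_DIR="${REQ_DIR:-$(mktemp -d)}"
REQ_FILE="${REQ_DIR}/requirements.txt"

# One requirement specifier per line, in standard pip syntax.
cat > "$REQ_FILE" <<'EOF'
astropy==6.0.0
numpy==1.26.4
EOF

echo "Wrote requirements file to ${REQ_FILE}"
```

After you change a pin in the file and upload it to the same workspace or volume location, reinstalling it from the Libraries tab should pick up the change.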