Restart the Python process on Databricks

You can programmatically restart the Python process on Databricks to ensure that locally installed or upgraded libraries function correctly in the Python kernel for your current SparkSession.

When you restart the Python process, you lose Python state information, such as variables and imports defined earlier in the notebook. Databricks recommends installing all session-scoped libraries at the beginning of a notebook and then running dbutils.library.restartPython() to clean up the Python process before proceeding.
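
The recommended ordering can be sketched as a sequence of notebook cells. This is a minimal illustration, not an official template: restart_python is a hypothetical wrapper introduced here so that the snippet also runs outside Databricks, and <library-name> is a placeholder. The %pip line appears as a comment because notebook magics are not plain Python.

```python
# Cell 1 (notebook magic, shown as a comment because it is not plain Python):
# %pip install <library-name> --upgrade


# Cell 2: restart the Python process so that later cells import the new version.
def restart_python(dbutils_handle):
    """Hypothetical wrapper: call restartPython() on the given dbutils object."""
    dbutils_handle.library.restartPython()


# In a Databricks notebook, `dbutils` is predefined, so the call would be:
# restart_python(dbutils)

# Cell 3 onward: define variables and imports here. Anything defined before
# the restart is no longer available afterward.
```

In a notebook you can skip the wrapper and call dbutils.library.restartPython() directly in its own cell. Keeping the installation and the restart at the top means no later work is lost when the Python state is cleared.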

You can use this process in interactive notebooks or for Python tasks scheduled with workflows.

What is dbutils.library.restartPython?

The helper function dbutils.library.restartPython() is the recommended way to restart the Python process in a Databricks notebook.

Note

Most functions in the dbutils.library submodule are deprecated. Databricks strongly recommends using %pip to manage all notebook-scoped library installations. See Notebook-scoped Python libraries.

When should you restart your Python process?

Restart your Python process any time you perform a local installation that includes any of the following:

  • Specifying a version of a package included in Databricks Runtime.

  • Installing a custom version of a package included in Databricks Runtime.

  • Explicitly updating a library to the newest version using %pip install <library-name> --upgrade.

  • Configuring a custom environment from a local requirements.txt file.

  • Installing a library that requires changing the versions of dependent libraries that are included in Databricks Runtime.
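
In each of these cases, a module that was imported before the installation can remain in memory at its old version. The following sketch shows one heuristic way to detect that situation by comparing the version reported by the loaded module with the version installed on disk. It is an assumption-laden illustration, not a Databricks API: needs_restart and its parameters are hypothetical names, and the check relies on the module exposing a __version__ attribute, which not every package does.

```python
import sys
from importlib import metadata


def needs_restart(dist_name, module_name=None, lookup=metadata.version):
    """Heuristic: True if an already-imported module reports a different
    version than the one currently installed for the distribution."""
    module = sys.modules.get(module_name or dist_name)
    if module is None:
        # Not imported yet, so the next import loads the installed version.
        return False
    loaded = getattr(module, "__version__", None)
    if loaded is None:
        # The module does not expose a version, so nothing can be compared.
        return False
    try:
        installed = lookup(dist_name)
    except metadata.PackageNotFoundError:
        # No installed distribution by that name, so nothing can be compared.
        return False
    return loaded != installed
```

For example, needs_restart("numpy") compares the loaded numpy module with the installed numpy distribution. When the distribution name differs from the import name, pass both, as in needs_restart("scikit-learn", module_name="sklearn"). A False result does not prove that a restart is unnecessary, so restarting after any of the installations listed above remains the safer choice.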