You can programmatically restart the Python process on Databricks to ensure that locally installed or upgraded libraries function correctly in the Python kernel for your current SparkSession.
When you restart the Python process, you lose Python state information. Databricks recommends installing all session-scoped libraries at the beginning of a notebook and running dbutils.library.restartPython() to clean up the Python process before proceeding.
You can use this process in interactive notebooks or for Python tasks scheduled with workflows.
The helper function dbutils.library.restartPython() is the recommended way to restart the Python process in a Databricks notebook.
Most functions in the dbutils.library submodule are deprecated. Databricks strongly recommends using %pip to manage all notebook-scoped library installations. See Notebook-scoped Python libraries.
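As a sketch of the recommended pattern, a notebook can run its session-scoped installs first and then restart the Python process before any other work (the library name below is illustrative, not from the original text):

```python
# Cell 1: install session-scoped libraries at the start of the notebook
%pip install simplejson==3.19.2

# Cell 2: restart the Python process so the kernel picks up the installed
# version; any Python variables defined before this call are lost
dbutils.library.restartPython()
```

Because restartPython() discards Python state, keep all installs in the cells that run before it, and all computation in the cells that run after it.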
Restart your Python process any time you perform a local installation that includes any of the following:
Installing a specific version of a package that is included in Databricks Runtime.
Installing a custom version of a package included in Databricks Runtime.
Explicitly updating a library to the newest version using %pip install <library-name> --upgrade.
Configuring a custom environment from a local requirements.txt file.
Installing a library that requires changing the versions of dependent libraries that are included in Databricks Runtime.
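The scenarios above share one problem: a copy of the library that was imported before the install still lives in the running Python process. As an illustration (pandas here is only an example of a Runtime-included package), upgrading in place requires a restart before the new version is visible:

```python
# Upgrade a library that ships preinstalled in Databricks Runtime
%pip install pandas --upgrade

# Any `import pandas` executed earlier in this process still refers to the
# preinstalled version; restart so the upgraded copy is the one loaded
dbutils.library.restartPython()
```

After the restart, re-run your imports; checking the module's __version__ attribute is a quick way to confirm the upgrade took effect.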