You can programmatically restart the Python process on Databricks to ensure that locally installed or upgraded libraries function correctly in the Python kernel for your current SparkSession.
When you restart the Python process, you lose Python state information. Databricks recommends installing all session-scoped libraries at the beginning of a notebook and running dbutils.library.restartPython() to clean up the Python process before proceeding.
You can use this process in interactive notebooks or for Python tasks scheduled with workflows.
The helper function dbutils.library.restartPython() is the recommended way to restart the Python process in a Databricks notebook.
Most functions in the dbutils.library submodule are deprecated. Databricks strongly recommends using %pip to manage all notebook-scoped library installations. See Notebook-scoped Python libraries.
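As a sketch of the recommended pattern, a notebook can run its session-scoped installs first and then restart the Python process before any other work (the library name below is illustrative, not from the original text):

```python
# Cell 1: install session-scoped libraries at the start of the notebook
%pip install simplejson==3.19.2

# Cell 2: restart the Python process so the kernel picks up the installed
# version; any Python variables defined before this call are lost
dbutils.library.restartPython()
```

Because restartPython() discards Python state, keep all installs in the cells that run before it, and all computation in the cells that run after it.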
Restart your Python process any time you perform a local installation that includes any of the following:
Installing a specific version of a package that is included in Databricks Runtime.
Installing a custom version of a package included in Databricks Runtime.
Explicitly updating a library to the newest version using %pip install <library-name> --upgrade.
Configuring a custom environment from a local requirements.txt file.
Installing a library that requires changing the versions of dependent libraries that are included in Databricks Runtime.
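The scenarios above share one problem: a copy of the library that was imported before the install still lives in the running Python process. As an illustration (pandas here is only an example of a Runtime-included package), upgrading in place requires a restart before the new version is visible:

```python
# Upgrade a library that ships preinstalled in Databricks Runtime
%pip install pandas --upgrade

# Any `import pandas` executed earlier in this process still refers to the
# preinstalled version; restart so the upgraded copy is the one loaded
dbutils.library.restartPython()
```

After the restart, re-run your imports; checking the module's __version__ attribute is a quick way to confirm the upgrade took effect.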