Use classic Jupyter Notebook with Databricks Connect for Python

Note

This article covers Databricks Connect for Databricks Runtime 13.0 and above.

This article covers how to use Databricks Connect for Python with classic Jupyter Notebook. Databricks Connect enables you to connect popular notebook servers, IDEs, and other custom applications to Databricks clusters. For more information, see the article "What is Databricks Connect?"

Note

Before you begin to use Databricks Connect, you must set up the Databricks Connect client.

To use Databricks Connect with classic Jupyter Notebook and Python, follow these instructions.

  1. To install classic Jupyter Notebook, with your Python virtual environment activated, run the following command from your terminal or Command Prompt:

    pip3 install notebook
    
  2. To start classic Jupyter Notebook in your web browser, run the following command from your activated Python virtual environment:

    jupyter notebook
    

    If classic Jupyter Notebook does not appear in your web browser, copy the URL that starts with localhost or 127.0.0.1 from the command's output in your terminal or Command Prompt, and enter it in your web browser’s address bar.

  3. Create a new notebook: in classic Jupyter Notebook, on the Files tab, click New > Python 3 (ipykernel).

  4. In the notebook’s first cell, enter either the example code or your own code. If you use your own code, at minimum you must initialize DatabricksSession as shown in the example code.

  5. To run the notebook, click Cell > Run All. All Python code runs locally, while all PySpark code involving DataFrame operations runs on the cluster in the remote Databricks workspace, and the results are sent back to the local caller.

  6. To debug the notebook, add the following line of code at the beginning of your notebook:

    from IPython.core.debugger import set_trace

    Then call set_trace() at the point in the notebook where you want execution to pause and the debugger to start. All Python code is debugged locally, while all PySpark code continues to run on the cluster in the remote Databricks workspace. The core Spark engine code cannot be debugged directly from the client.

  7. To shut down classic Jupyter Notebook, click File > Close and Halt. If the classic Jupyter Notebook process is still running in your terminal or Command Prompt, stop this process by pressing Ctrl + c and then entering y to confirm.
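
The steps above can be sketched as a single notebook cell. This is a minimal illustration, not the article's own example code: it assumes the databricks-connect package is installed, the client is already configured, and the workspace exposes the samples.nyctaxi.trips sample table. The helper names get_spark, fetch_rows, format_rows, and run_example are hypothetical and exist only for this sketch.

```python
def get_spark():
    """Return a Databricks Connect session.

    Assumes the databricks-connect package is installed and the client is
    already configured. The import is deferred so that a missing package
    only fails when a session is actually requested.
    """
    from databricks.connect import DatabricksSession

    return DatabricksSession.builder.getOrCreate()


def fetch_rows(spark, table_name, limit=5):
    """Read a few rows from a table.

    The DataFrame operations run on the remote cluster; collect() sends the
    resulting rows back to the local Python process.
    """
    return spark.read.table(table_name).limit(limit).collect()


def format_rows(rows):
    """Plain Python that runs, and can be debugged, locally."""
    return [", ".join(str(value) for value in row) for row in rows]


def run_example(table_name="samples.nyctaxi.trips", debug=False):
    """Hypothetical driver; call it from a notebook cell."""
    spark = get_spark()
    rows = fetch_rows(spark, table_name)
    if debug:
        # Pause in the local debugger before the local formatting step.
        from IPython.core.debugger import set_trace

        set_trace()
    for line in format_rows(rows):
        print(line)
```

To try it, call run_example() in a later cell, or run_example(debug=True) to stop in the local debugger before the rows are formatted. Separating the remote read (fetch_rows) from the local formatting (format_rows) makes it easier to see which part of the notebook can be stepped through on the client.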