Use JupyterLab with Databricks Connect for Python

Note

This article covers Databricks Connect for Databricks Runtime 13.0 and above.

This article covers how to use Databricks Connect for Python with JupyterLab. Databricks Connect enables you to connect popular notebook servers, IDEs, and other custom applications to Databricks clusters. See What is Databricks Connect?.

Note

Before you begin to use Databricks Connect, you must set up the Databricks Connect client.
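
As a quick sanity check before you continue, the following sketch reports whether the Databricks Connect client package can be imported from the active Python environment. It uses only the Python standard library. The helper name databricks_connect_available is hypothetical, and a True result means only that the package is installed, not that a connection to a cluster is configured.

```python
import importlib.util


def databricks_connect_available() -> bool:
    """Return True if the databricks.connect module can be located."""
    try:
        # find_spec imports the parent package first, so it raises
        # ModuleNotFoundError when the databricks package itself is missing.
        return importlib.util.find_spec("databricks.connect") is not None
    except (ImportError, ValueError):
        return False


print("Databricks Connect importable:", databricks_connect_available())
```

If this prints False, the client is probably not installed in the virtual environment that is currently activated.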

To use Databricks Connect with JupyterLab and Python, follow these instructions.

  1. To install JupyterLab, with your Python virtual environment activated, run the following command from your terminal or Command Prompt:

    pip3 install jupyterlab
    
  2. To start JupyterLab in your web browser, run the following command from your activated Python virtual environment:

    jupyter lab
    

    If JupyterLab does not appear in your web browser, copy the URL that starts with localhost or 127.0.0.1 from the output in your terminal or Command Prompt, and enter it in your web browser’s address bar.

  3. Create a new notebook: in JupyterLab, click File > New > Notebook on the main menu, select Python 3 (ipykernel) and click Select.

  4. In the notebook’s first cell, enter either example code or your own code. If you use your own code, at minimum you must initialize DatabricksSession, for example by calling DatabricksSession.builder.getOrCreate().

  5. To run the notebook, click Run > Run All Cells. Python code runs locally, while code involving DataFrame operations runs on the cluster in the remote Databricks workspace, and the results are sent back to the local caller.

  6. To debug the notebook, click the bug (Enable Debugger) icon next to Python 3 (ipykernel) in the notebook’s toolbar. Set one or more breakpoints, and then click Run > Run All Cells. Python code is debugged locally, while Spark code continues to run on the cluster in the remote Databricks workspace. The core Spark engine code cannot be debugged directly from the client.

  7. To shut down JupyterLab, click File > Shut Down. If the JupyterLab process is still running in your terminal or Command Prompt, stop this process by pressing Ctrl + c and then entering y to confirm.
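
The notebook steps above can be sketched as a single cell. This is a minimal illustration, not the official example code. It assumes that the Databricks Connect client is already configured and that the workspace exposes a sample table named samples.nyctaxi.trips with a fare_amount column. The function names fetch_trips and average_fare are hypothetical.

```python
def fetch_trips(limit=5):
    """Run a DataFrame query on the remote cluster and return the rows locally."""
    # Imported lazily so the functions can be defined before a connection exists.
    from databricks.connect import DatabricksSession

    # Initialize the session; it uses the client configuration set up earlier.
    spark = DatabricksSession.builder.getOrCreate()

    # Assumed sample table; replace it with a table available in your workspace.
    df = spark.read.table("samples.nyctaxi.trips")

    # The query executes on the cluster; collect() returns the rows to the caller.
    return df.limit(limit).collect()


def average_fare(rows):
    """Plain Python that runs locally, so breakpoints set here pause locally."""
    fares = [row["fare_amount"] for row in rows]
    return sum(fares) / len(fares) if fares else 0.0


# In a notebook cell, with a configured client, you might then run:
# print(average_fare(fetch_trips()))
```

A breakpoint set inside average_fare pauses in the local debugger, while the query in fetch_trips still executes on the cluster.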

For more specific debug instructions, see Debugger.