Connect to Labelbox

Labelbox is a training data platform used to create training data from images, video, audio, text, and tiled imagery. Using Labelbox, AI teams can customize a workflow to operate, manage and improve data labeling, data cataloging, and model debugging in a single, unified platform. Labelbox is designed to help AI teams build and operate production-grade machine learning systems.

You can connect Databricks clusters that run Databricks Runtime for Machine Learning to Labelbox.

Connect to Labelbox using Partner Connect

This section describes how to connect a cluster in your Databricks workspace to Labelbox using Partner Connect.

Differences between standard connections and Labelbox

To connect to Labelbox using Partner Connect, you follow the steps in Connect to a machine learning partner using Partner Connect. The Labelbox connection differs from standard machine learning connections in the following way:

  • In addition to a cluster, a service principal, and a personal access token, Partner Connect creates a notebook named labelbox_databricks_example.ipynb in the Workspace/Shared/labelbox_demo folder in your Databricks workspace, if it doesn’t already exist.

Steps to connect

To connect to Labelbox using Partner Connect, do the following:

  1. Connect to a machine learning partner using Partner Connect.

  2. Create a Labelbox API key for your Labelbox account, if you do not have one. Copy the API key and save it in a secure location, as the key will eventually be hidden from view, and you will need this key later.

  3. Set up the ML cluster and Labelbox starter notebook.
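Because the API key from step 2 is needed again later, it helps to keep it out of notebook source. The following is a minimal sketch of one way to do that, assuming the key has been placed in an environment variable. The variable name LABELBOX_API_KEY and both helper functions are hypothetical names chosen for this example; they are not defined by Labelbox or Databricks.

```python
import os

# Hypothetical variable name chosen for this sketch; it is not defined by
# Labelbox or Databricks.
API_KEY_ENV_VAR = "LABELBOX_API_KEY"


def load_labelbox_api_key(env=None):
    """Return the API key from the environment, or raise if it is missing."""
    env = os.environ if env is None else env
    key = env.get(API_KEY_ENV_VAR, "").strip()
    if not key:
        raise RuntimeError(
            f"Set {API_KEY_ENV_VAR} to your Labelbox API key before connecting."
        )
    return key


def mask(key, visible=4):
    """Hide all but the last few characters so the key is safer to print."""
    if len(key) <= visible:
        return "*" * len(key)
    return "*" * (len(key) - visible) + key[-visible:]
```

In a notebook, the returned key would typically be passed to the Labelbox client instead of being pasted into a cell, and mask can be used when you only need to confirm which key is loaded.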

Connect to Labelbox manually

The steps in this section describe how to connect a Databricks cluster to Labelbox manually.

Note

To connect faster, use Partner Connect.

Requirements

You must have an available cluster running Databricks Runtime for Machine Learning. To check this for an existing cluster, look for ML in the Runtime column when you display the cluster in your workspace. If you do not have an available Databricks Runtime ML cluster, create a cluster and for Databricks Runtime Version, choose a version from the ML list.

Steps to connect

To connect to Labelbox manually, do the following:

  1. Go to the Labelbox page to Sign Up for a new Labelbox account or to Log In to your existing Labelbox account.

  2. Create a Labelbox API key for your Labelbox account, if you do not have one. Copy the API key and save it in a secure location, as the key will eventually be hidden from view, and you will need this key later.

  3. Check for a Labelbox starter notebook in your workspace:

    1. In your Databricks workspace, ensure that you are in the Data Science & Engineering or Databricks Machine Learning environment. Use the sidebar persona-switcher if necessary.

    2. In the sidebar, click Workspace > Shared.

    3. If a folder named labelbox_demo does not already exist, create it:

      1. Click the down arrow next to Shared.

      2. Click Create > Folder.

      3. Enter labelbox_demo.

      4. Click Create Folder.

    4. Click the labelbox_demo folder. If a starter notebook named labelbox_databricks_example.ipynb does not exist in the folder, import it:

      1. Click the down arrow next to labelbox_demo.

      2. Click Import.

      3. Click URL.

      4. Enter https://github.com/Labelbox/labelbox-python/blob/develop/examples/integrations/databricks/labelbox_databricks_example.ipynb and click Import.

  4. Continue with Set up the ML cluster and Labelbox starter notebook.
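The folder and notebook checks in step 3 can also be scripted. The sketch below only builds the request bodies and sends nothing. It assumes the Databricks Workspace API 2.0 endpoints workspace/mkdirs and workspace/import, and that Workspace > Shared in the UI corresponds to the /Shared path; verify both against the API reference for your workspace. The function name build_import_requests is hypothetical, and whether the imported notebook keeps the .ipynb suffix in its name is also an assumption.

```python
import base64

# Folder and notebook names come from the steps above. The "/Shared" prefix
# and the endpoint paths are assumptions to verify for your workspace.
FOLDER = "/Shared/labelbox_demo"
NOTEBOOK_PATH = FOLDER + "/labelbox_databricks_example.ipynb"


def build_import_requests(notebook_bytes):
    """Return (endpoint, JSON body) pairs; nothing is sent over the network."""
    mkdirs = ("/api/2.0/workspace/mkdirs", {"path": FOLDER})
    import_notebook = (
        "/api/2.0/workspace/import",
        {
            "path": NOTEBOOK_PATH,
            "format": "JUPYTER",
            # The import endpoint expects the file content base64-encoded.
            "content": base64.b64encode(notebook_bytes).decode("ascii"),
            # Mirror the manual steps: do not replace an existing notebook.
            "overwrite": False,
        },
    )
    return [mkdirs, import_notebook]
```

Each pair would be sent as an authenticated POST to your workspace URL. Leaving overwrite set to False mirrors the manual steps, which import the notebook only if it is not already in the folder.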

Set up the ML cluster and Labelbox starter notebook

  1. In your Databricks workspace, ensure that you are still in the Data Science & Engineering or Databricks Machine Learning environment. Use the sidebar persona-switcher if necessary.

  2. Check that the required Labelbox libraries are installed in your ML cluster:

    1. In the sidebar, click Compute.

    2. Click your ML cluster. Use the Filter box to find it, if necessary.

      Note

      If you used Partner Connect to connect to Labelbox, the ML cluster’s name should be LABELBOX_CLUSTER.

    3. Click the Libraries tab.

    4. If the labelbox package is not listed, install it:

      1. Click Install New.

      2. Click PyPI.

      3. For Package, enter labelbox.

      4. Click Install.

    5. If the labelspark package is not listed, install it:

      1. Click Install New.

      2. Click PyPI.

      3. For Package, enter labelspark.

      4. Click Install.

  3. Attach your ML cluster to the starter notebook:

    1. In the sidebar, click Workspace > Shared > labelbox_demo > labelbox_databricks_example.ipynb.

    2. Attach your ML cluster to the notebook.

  4. Browse through the notebook to learn how to automate Labelbox.
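After the cluster is attached, you can confirm from a notebook cell that the two libraries from step 2 are importable. This is a minimal sketch using only the Python standard library; the function name missing_packages is a hypothetical helper introduced for this example.

```python
import importlib.util

# The two PyPI packages named in the steps above.
REQUIRED_PACKAGES = ("labelbox", "labelspark")


def missing_packages(names=REQUIRED_PACKAGES):
    """Return the names that cannot be found on the current Python path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_packages()
if missing:
    print("Install from PyPI on the cluster:", ", ".join(missing))
else:
    print("labelbox and labelspark are available.")
```

If either package is reported missing, install it on the cluster's Libraries tab as described in step 2, then run the check again.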

Additional resources