Alteryx

This article describes how to use Alteryx with Databricks.

Requirements

Alteryx 10.6 or above. In-database processing requires 64-bit Alteryx with 64-bit database drivers.

Step 1: Get Databricks connection information

  1. Get a personal access token.
  2. Get the server hostname, port, and HTTP path.
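
To keep the four values from this step together while working through the remaining steps, they can be held in a small record. The following is a minimal sketch, not part of Alteryx or Databricks: the class name `DatabricksConnectionInfo`, its field names, and all sample values are hypothetical placeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DatabricksConnectionInfo:
    """Hypothetical container for the values gathered in Step 1."""

    server_hostname: str  # host only, without a scheme
    port: int
    http_path: str
    access_token: str  # personal access token

    def __post_init__(self):
        # Basic sanity checks; these rules are assumptions for illustration.
        if "://" in self.server_hostname:
            raise ValueError("server_hostname should not include a scheme such as https://")
        if not 1 <= self.port <= 65535:
            raise ValueError("port must be between 1 and 65535")
        if not self.http_path:
            raise ValueError("http_path must not be empty")
        if not self.access_token:
            raise ValueError("access_token must not be empty")


# Placeholder values only; substitute the values from your own workspace.
info = DatabricksConnectionInfo(
    server_hostname="example.cloud.databricks.com",
    port=443,
    http_path="sql/protocolv1/o/0/0000-000000-example",
    access_token="dapi-EXAMPLE-TOKEN",
)
```

Storing the hostname without a scheme makes it easy to reuse the same value for both the ODBC driver and the Databricks URL field later on.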

Step 2: Configure the Simba Spark ODBC driver
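
As a rough illustration of how the values from Step 1 relate to the driver settings, the sketch below assembles them into an ODBC connection string. It assumes the key names commonly documented for the Simba Spark ODBC driver (`Host`, `Port`, `HTTPPath`, `SSL`, `ThriftTransport`, `AuthMech`, `UID`, `PWD`) and a driver name of `Simba Spark ODBC Driver`. Verify both against the documentation for your installed driver version. The function name and sample values are hypothetical, and the steps in this article use a named data source (DSN) rather than a connection string.

```python
def build_spark_odbc_connection_string(
    host, port, http_path, token, driver="Simba Spark ODBC Driver"
):
    """Join Step 1 values into a semicolon-separated ODBC connection string.

    Key names are assumptions based on the Simba Spark ODBC driver's
    documented settings; check them against your driver version.
    """
    settings = {
        "Driver": driver,
        "Host": host,
        "Port": str(port),
        "HTTPPath": http_path,
        "SSL": "1",              # encrypt the connection
        "ThriftTransport": "2",  # HTTP transport
        "AuthMech": "3",         # user name and password authentication
        "UID": "token",          # literal user name used with access tokens
        "PWD": token,            # the personal access token
    }
    return ";".join(f"{key}={value}" for key, value in settings.items())


# Placeholder values only.
connection_string = build_spark_odbc_connection_string(
    host="example.cloud.databricks.com",
    port=443,
    http_path="sql/protocolv1/o/0/0000-000000-example",
    token="dapi-EXAMPLE-TOKEN",
)
```

The same settings are what a data source such as Databricks (User) would hold when configured through the ODBC Data Source Administrator.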

Step 3: Configure connection in Alteryx to a Databricks cluster

  1. In Alteryx Designer, go to the In-Database tool tab.
  2. Drag a Connect In-DB tool onto the canvas.
  3. In the Configuration Panel, click the drop-down menu under Connection Name.
  4. Select Manage Connections…
  5. Set Connection Type to User.
  6. Under Connections, click New and enter the following:
    • Connection Name: Databricks (or other preferred name)
    • Password Encryption: Encrypt for User (or other if preferred)
  7. Click the Read tab and enter the following:
    1. Driver: Spark ODBC
    2. Click the drop-down menu under Connection String.
    3. Select New Database Connection….
      1. Click the Spark Data Source Name drop-down and select Databricks (User).
      2. Click OK.
  8. Click the Write tab and enter the following:
    1. Driver: Databricks Bulk Loader (Avro) or Databricks Bulk Loader (CSV)
    2. Click the drop-down menu under Connection String.
    3. Select New Databricks Connection….
    4. Under ODBC Data Source, select Databricks (User), then enter the following:
      • In the Username field, enter token.
      • In the Password field, enter your personal access token from Step 1.
      • In Databricks URL, enter https:// followed by the server hostname from Step 1.
  9. Click OK.
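
The three values entered on the Write tab can be derived from the Step 1 information as sketched below. This is an illustrative helper only: the function name `bulk_loader_fields` and the sample values are hypothetical, and the dictionary keys simply mirror the field labels used in the steps above.

```python
def bulk_loader_fields(server_hostname, access_token):
    """Return the Write tab values for a Databricks connection.

    The user name is the literal string "token"; the password is the
    personal access token; the URL is https:// plus the server hostname.
    """
    host = server_hostname.strip()
    # Tolerate a hostname that was pasted with a scheme or trailing slash.
    for prefix in ("https://", "http://"):
        if host.lower().startswith(prefix):
            host = host[len(prefix):]
            break
    host = host.rstrip("/")
    return {
        "Username": "token",
        "Password": access_token,
        "Databricks URL": "https://" + host,
    }


# Placeholder values only.
fields = bulk_loader_fields("example.cloud.databricks.com", "dapi-EXAMPLE-TOKEN")
```

Normalizing the hostname first avoids a doubled scheme such as https://https://... if the value was copied from a browser address bar.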