JDBC and ODBC drivers and configuration parameters

You can connect business intelligence (BI) tools to Databricks clusters to query data in tables. This article describes how to get JDBC and ODBC drivers and configuration parameters to connect to Databricks clusters.

For tool-specific connection instructions, see the Databricks integrations guide.

Permission requirements

To access compute resources using JDBC or ODBC, you need permissions on the Databricks cluster you are connecting to, as described below.

Cluster requirements

To access a cluster, you must have Can Attach To permission.

If you connect to a terminated cluster and have Can Restart permission, the cluster is started.

Prepare to connect BI tools

This section describes the steps you typically follow to prepare to connect BI tools to Databricks:

Step 1: Download and install a JDBC or ODBC driver

For some BI tools, you use a JDBC or ODBC driver to make a connection to Databricks compute resources.

  1. Go to the Databricks JDBC or ODBC driver download page and download the driver. For ODBC, pick the driver that matches your operating system.
  2. Install the driver. The JDBC driver is a JAR file and does not require installation; add it to your application's classpath as shown below. The ODBC driver ships as an installation package for your chosen platform.
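
A minimal sketch of using the JDBC JAR, assuming the downloaded file is named DatabricksJDBC42.jar (the actual file name varies by driver version):

# Put the driver JAR on the classpath when launching a Java application
java -cp /path/to/DatabricksJDBC42.jar:. MyApp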

Step 2: Collect JDBC or ODBC connection information

To configure a JDBC or ODBC driver, you must collect connection information from Databricks. Here are some of the parameters a JDBC or ODBC driver might require:

  • Authentication: See Username and password authentication.
  • Host, port, HTTP path, JDBC URL: See Get server hostname, port, HTTP path, and JDBC URL.

The following parameters are usually specified in the JDBC URL for JDBC and in the DSN configuration for ODBC:

  • Spark server type: Spark Thrift Server
  • Schema/Database: default
  • Authentication mechanism (AuthMech): See Username and password authentication.
  • Thrift transport: http
  • SSL: 1
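
For ODBC, these parameters can also be supplied in a DSN-less connection string instead of a DSN. A sketch, assuming the driver is registered under the name Simba Spark ODBC Driver (match the name to your odbcinst.ini); the numeric values mirror the /etc/odbc.ini example later in this article, where ThriftTransport=2 selects HTTP transport:

Driver={Simba Spark ODBC Driver};Host=<server-hostname>;Port=443;HTTPPath=<http-path>;SparkServerType=3;Schema=default;ThriftTransport=2;SSL=1;AuthMech=3;UID=token;PWD=<personal-access-token>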

Username and password authentication

This section describes how to collect the credentials for authenticating BI tools to Databricks compute resources.

When you configure authentication from your BI tool to Databricks resources, the fields you set are labeled Username and Password. Set Username to the literal string token and Password to a personal access token you create using the instructions in Generate a personal access token.
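
For example, in a BI tool's connection dialog the two fields map directly, as in the following sketch (the token value is a made-up example; Databricks personal access tokens typically begin with dapi):

Username: token
Password: dapi1234567890abcdef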

Get server hostname, port, HTTP path, and JDBC URL

Get connection details for a cluster

  1. Click Compute in the sidebar.
  2. Click a cluster.
  3. Click the Advanced Options toggle.
  4. Click the JDBC/ODBC tab.
  5. Copy the parameters required by your BI tool.
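
The values on the JDBC/ODBC tab look similar to the following hypothetical examples; your workspace's values will differ:

Server Hostname: dbc-a1b2c3d4-e5f6.cloud.databricks.com
Port: 443
HTTP Path: sql/protocolv1/o/0/1234-567890-abcdefgh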

Configure JDBC URL

In the following URL, replace <server-hostname> with the value you retrieved in Get server hostname, port, HTTP path, and JDBC URL, and replace <personal-access-token> with the token you created in Username and password authentication. For example:

jdbc:spark://<server-hostname>:443/default;transportMode=http;ssl=1;httpPath=sql/protocolv1/o/0/xxxx-xxxxxx-xxxxxxxx;AuthMech=3;UID=token;PWD=<personal-access-token>
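
The following Java sketch shows how an application might use this URL with java.sql.DriverManager; the query and placeholder values are illustrative only, and recent driver versions register themselves automatically through the JDBC service loader:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class DatabricksJdbcExample {
    public static void main(String[] args) throws Exception {
        // Assumed placeholder URL; substitute your server hostname, HTTP path, and token.
        String url = "jdbc:spark://<server-hostname>:443/default;transportMode=http;ssl=1;"
                + "httpPath=sql/protocolv1/o/0/xxxx-xxxxxx-xxxxxxxx;AuthMech=3;"
                + "UID=token;PWD=<personal-access-token>";
        try (Connection conn = DriverManager.getConnection(url);
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery("SELECT 1")) { // simple connectivity check
            while (rs.next()) {
                System.out.println(rs.getInt(1));
            }
        }
    }
}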

Configure connection for native query syntax

JDBC and ODBC drivers accept SQL queries in the ANSI SQL-92 dialect and translate them to Spark SQL. If your application generates Spark SQL directly, or uses SQL syntax specific to Databricks that is not part of the ANSI SQL-92 standard, Databricks recommends that you add ;UseNativeQuery=1 to the connection configuration. With that setting, the drivers pass SQL queries verbatim to Databricks.
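
For example, appended to the JDBC URL shown earlier:

jdbc:spark://<server-hostname>:443/default;transportMode=http;ssl=1;httpPath=sql/protocolv1/o/0/xxxx-xxxxxx-xxxxxxxx;AuthMech=3;UID=token;PWD=<personal-access-token>;UseNativeQuery=1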

Configure ODBC Data Source Name for the Simba ODBC driver

The Data Source Name (DSN) configuration contains the parameters for communicating with a specific database. BI tools like Tableau usually provide a user interface for entering these parameters. If you have to install and manage the Simba ODBC driver yourself, you might need to create the configuration files and also allow your Driver Manager (ODBC Data Source Administrator on Windows and unixODBC/iODBC on Unix) to access them. Create two files: /etc/odbc.ini and /etc/odbcinst.ini.

/etc/odbc.ini

  1. Set the content of /etc/odbc.ini to:

    [Databricks-Spark]
    Driver=Simba
    Server=<server-hostname>
    HOST=<server-hostname>
    PORT=<port>
    SparkServerType=3
    Schema=default
    ThriftTransport=2
    SSL=1
    AuthMech=3
    UID=token
    PWD=<personal-access-token>
    HTTPPath=<http-path>
    
  2. Set <personal-access-token> to the token you retrieved in Username and password authentication.

  3. Set the server, port, and HTTP parameters to the ones you retrieved in Get server hostname, port, HTTP path, and JDBC URL.
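
After both files are in place, you can sanity-check the DSN from a shell with unixODBC's isql utility (iODBC ships a similar iodbctest tool):

# -v prints verbose driver errors; credentials come from the UID/PWD entries in the DSN
isql -v Databricks-Spark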

/etc/odbcinst.ini

Set the content of /etc/odbcinst.ini to:

[ODBC Drivers]
Simba = Installed
[Simba Spark ODBC Driver 64-bit]
Driver = <driver-path>

Set <driver-path> according to the operating system you chose when you downloaded the driver in Step 1:

  • MacOS: /Library/simba/spark/lib/libsparkodbc_sbu.dylib
  • Linux (64-bit): /opt/simba/spark/lib/64/libsparkodbc_sb64.so
  • Linux (32-bit): /opt/simba/spark/lib/32/libsparkodbc_sb32.so

Configure paths of ODBC configuration files

Specify the paths of the two files in environment variables so that they can be used by the Driver Manager:

export ODBCINI=/etc/odbc.ini
export ODBCSYSINI=/etc/odbcinst.ini
export SIMBASPARKINI=<simba-ini-path>/simba.sparkodbc.ini # (Contains the configuration for debugging the Simba driver)

where <simba-ini-path> is:

  • MacOS: /Library/simba/spark/lib
  • Linux (64-bit): /opt/simba/sparkodbc/lib/64
  • Linux (32-bit): /opt/simba/sparkodbc/lib/32
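
For example, on Linux (64-bit) the three exports resolve to the following, using the paths above:

export ODBCINI=/etc/odbc.ini
export ODBCSYSINI=/etc/odbcinst.ini
export SIMBASPARKINI=/opt/simba/sparkodbc/lib/64/simba.sparkodbc.ini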