Install Databricks Connect for Scala
Note
This article covers Databricks Connect for Databricks Runtime 13.3 LTS and above.
This article describes how to install Databricks Connect for Scala. See What is Databricks Connect?. For the Python version of this article, see Install Databricks Connect for Python.
Requirements
Your target Databricks workspace and cluster must meet the requirements for Compute configuration for Databricks Connect.
The Java Development Kit (JDK) installed on your development machine. Databricks recommends that the JDK version you use matches the JDK version on your Databricks cluster. To find the JDK version on your cluster, refer to the “System environment” section of the Databricks Runtime release notes for your cluster. For instance, Zulu 8.70.0.23-CA-linux64 corresponds to JDK 8. See Databricks Runtime release notes versions and compatibility.
Scala installed on your development machine. Databricks recommends that the version of your Scala installation matches the Scala version on your Databricks cluster. To find the Scala version of the Databricks Runtime version of your cluster, refer to the “System environment” section of the Databricks Runtime release notes for that version. See Databricks Runtime release notes versions and compatibility.
If you are using user-defined functions (UDFs), the local Scala and Java versions must match the Scala and Java versions of the Databricks Runtime version of the cluster. To find the Scala and Java versions of the Databricks Runtime version of your cluster, refer to the System environment section of the Databricks Runtime release notes for that version. See Databricks Runtime release notes versions and compatibility.
A Scala build tool on your development machine, such as sbt.
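One way to compare your local setup against the cluster's "System environment" values is to print the local JDK and Scala versions from within the project itself. The following is a minimal sketch using only the Scala standard library; the object name LocalVersions and the helper javaMajor are illustrative names, not part of Databricks Connect.

```scala
// Sketch: report the local JDK and Scala versions so that they can be
// compared with the cluster's Databricks Runtime release notes.
object LocalVersions {
  // Extract the major Java version from a java.version string.
  // Legacy strings look like "1.8.0_392" (major 8); newer ones like "17.0.2".
  def javaMajor(version: String): Int = {
    val parts = version.split("[._-]")
    if (parts(0) == "1" && parts.length > 1) parts(1).toInt else parts(0).toInt
  }

  def main(args: Array[String]): Unit = {
    val javaVersion = System.getProperty("java.version")
    println(s"Java version:  $javaVersion (major ${javaMajor(javaVersion)})")
    // Reports the Scala library version on the project's classpath.
    println(s"Scala version: ${scala.util.Properties.versionNumberString}")
  }
}
```

Running this from the same build as your Databricks Connect code shows the versions the project actually uses, which can differ from the versions installed system-wide.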
Add a reference to the Databricks Connect client
To set up the Databricks Connect client, first add a reference to the client. In your Scala project’s build file, such as build.sbt for sbt, pom.xml for Maven, or build.gradle for Gradle, add the following reference to the Databricks Connect client. Replace 14.0.0 with the version of the Databricks Connect library that matches the Databricks Runtime version on your cluster. You can find the Databricks Connect library version numbers in the Maven central repository.
sbt (build.sbt):
libraryDependencies += "com.databricks" % "databricks-connect" % "14.0.0"
Maven (pom.xml):
<dependency>
  <groupId>com.databricks</groupId>
  <artifactId>databricks-connect</artifactId>
  <version>14.0.0</version>
</dependency>
Gradle (build.gradle):
implementation 'com.databricks:databricks-connect:14.0.0'
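For sbt, the dependency usually sits alongside a pinned Scala version, which ties back to the version-matching requirement above. The following build.sbt is a sketch only: the project name and version are hypothetical, and the scalaVersion value is an assumption that you should replace with the Scala version listed in the release notes for your cluster's Databricks Runtime.

```scala
// build.sbt (sketch)
name := "databricks-connect-example" // hypothetical project name
version := "0.1.0"                   // hypothetical project version

// Assumption: replace with the Scala version from the "System environment"
// section of your cluster's Databricks Runtime release notes.
scalaVersion := "2.12.15"

// Replace 14.0.0 with the Databricks Connect version that matches the
// Databricks Runtime version on your cluster.
libraryDependencies += "com.databricks" % "databricks-connect" % "14.0.0"
```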
Configure connection properties
Next, configure properties to establish a connection between Databricks Connect and your remote Databricks cluster. These properties include settings to authenticate Databricks Connect with your cluster. See Compute configuration for Databricks Connect.
Databricks Connect for Scala, for Databricks Runtime 13.3 LTS and above, includes the Databricks SDK for Java. This SDK implements the Databricks client unified authentication standard, a consolidated and consistent architectural and programmatic approach to authentication. This approach makes setting up and automating authentication with Databricks more centralized and predictable. It enables you to configure Databricks authentication once and then use that configuration across multiple Databricks tools and SDKs without further authentication configuration changes.
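As an illustration of how the bundled SDK configuration can feed a session, the following sketch builds a DatabricksSession from a Databricks configuration profile. It assumes that the Databricks Connect dependency is on the classpath, that a profile with the workspace host, cluster ID, and authentication settings already exists on your machine, and that the cluster is running. The placeholder <profile-name> and the object name ConnectSketch are illustrative; this is one possible setup, not the only supported one.

```scala
import com.databricks.connect.DatabricksSession
import com.databricks.sdk.core.DatabricksConfig

object ConnectSketch {
  def main(args: Array[String]): Unit = {
    // Placeholder: replace <profile-name> with the name of an existing
    // Databricks configuration profile on your development machine.
    val config = new DatabricksConfig().setProfile("<profile-name>")

    // Build a session whose connection properties come from the SDK config.
    val spark = DatabricksSession.builder().sdkConfig(config).getOrCreate()

    // Small round trip to the cluster to confirm that the connection works.
    spark.range(5).show()
  }
}
```

Because the connection details live in the profile rather than in code, the same program can target a different workspace or cluster by switching profiles.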
Note
OAuth user-to-machine (U2M) authentication is supported on Databricks SDK for Java 0.18.0 and above. You might need to update your code project’s installed version of the Databricks SDK for Java to 0.18.0 or above to use OAuth U2M authentication. See Get started with the Databricks SDK for Java.
For OAuth U2M authentication, you must use the Databricks CLI to authenticate before you run your Scala code. See the Tutorial.
OAuth machine-to-machine (M2M) authentication is supported on Databricks SDK for Java 0.17.0 and above. You might need to update your code project’s installed version of the Databricks SDK for Java to 0.17.0 or above to use OAuth M2M authentication. See Get started with the Databricks SDK for Java.
Google Cloud credentials authentication and Google Cloud ID authentication are supported on Databricks SDK for Java 0.14.0 and above. You might need to update your code project’s installed version of the Databricks SDK for Java to 0.14.0 or above to use Google Cloud credentials authentication or ID authentication. See Get started with the Databricks SDK for Java.
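The minimum SDK versions in this note can be captured in a small lookup, which may be handy as a sanity check in a build or startup script. This is a sketch in plain Scala: the authentication-type keys (such as oauth-u2m) and the object name SdkAuthSupport are hypothetical labels chosen for this example and are not identifiers from the Databricks SDK for Java.

```scala
object SdkAuthSupport {
  // Minimum Databricks SDK for Java versions, as listed in the note above.
  // The keys are hypothetical labels used only in this sketch.
  val minimumSdkVersion: Map[String, String] = Map(
    "oauth-u2m" -> "0.18.0",
    "oauth-m2m" -> "0.17.0",
    "google-credentials" -> "0.14.0",
    "google-id" -> "0.14.0"
  )

  // Parse a "major.minor.patch" string into comparable integers.
  private def parse(version: String): Seq[Int] =
    version.split('.').toSeq.map(_.toInt)

  // True when `installed` is at least the minimum version for `authType`.
  // Unknown authentication types return false.
  def supports(authType: String, installed: String): Boolean =
    minimumSdkVersion.get(authType).exists { min =>
      val pairs = parse(installed).zipAll(parse(min), 0, 0)
      // The first differing component decides; equal versions are supported.
      pairs.find { case (a, b) => a != b }.forall { case (a, b) => a > b }
    }
}
```

For example, SdkAuthSupport.supports("oauth-u2m", "0.17.9") is false, while the same call with "0.18.0" is true. The sketch assumes purely numeric version components and does not handle suffixes such as -SNAPSHOT.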