DataGrip is an integrated development environment (IDE) for database developers that provides a query console, schema navigation, explain plans, smart code completion, real-time analysis and quick fixes, refactorings, version control integration, and other features.
This article describes how to use your local development machine to install, configure, and use DataGrip to work with databases in Databricks.
Before you install DataGrip, your local development machine must meet the following requirements:
Download and install DataGrip.
Linux: Download the
.zipfile, extract its contents, and then follow the instructions in the
macOS: Download and run the
Windows: Download and run the
For more information, see Install DataGrip on the DataGrip website.
Set up DataGrip with information about the Databricks JDBC Driver that you downloaded earlier.
Click File > Data Sources.
In the Data Sources and Drivers dialog box, click the Drivers tab.
Click the + (Driver) button to add a driver.
For Name, enter
On the General tab, in the Driver Files list, click the + (Add) button.
Click Custom JARs.
Browse to and select the
SparkJDBC42.jarfile that you extracted earlier, and then click Open.
For Class, select com.simba.spark.jdbc.Driver.
Use DataGrip to connect to the cluster that you want to use to access the databases in your Databricks workspace.
In DataGrip, click File > Data Sources.
On the Data Sources tab, click the + (Add) button.
Select the Databricks driver that you added in the preceding step.
On the General tab, for URL, enter the value of the JDBC URL field for your Databricks resource as follows:
Find the JDBC URL field value for your cluster on the JDBC/ODBC tab within the Advanced Options area for your cluster. The JDBC URL should look similar to this one:
<personal-access-token>with your personal access token for the related Databricks workspace.
If you do not want to store your personal access token on your local development machine, omit
UID=token;PWD=<personal-access-token>from the JDBC URL and, in the Save list, select Never. You will be prompted for your User (the word
token) and Password (your personal access token) each time you try to connect.
For Name, enter Databricks cluster.
For more information, see Data sources and drivers dialog on the DataGrip website.
Click Test Connection.
You should start your resource before testing your connection. Otherwise the test might take several minutes to complete while the resource starts.
If the connection succeeds, on the Schemas tab, check the boxes for the schemas that you want to be able to access, for example default.
Repeat the instructions in this step for each resource that you want DataGrip to access.
Use DataGrip to access tables in your Databricks workspace.
In DataGrip, in the Database window, expand your resource node, expand the schema you want to browse, and then expand tables.
Double-click a table. The first set of rows from the table are displayed.
Repeat the instructions in this step to access additional tables.
To access tables in other schemas, in the Database window’s toolbar, click the Data Source Properties icon. In the Data Sources and Drivers dialog box, on the Schemas tab, check the box for each additional schema you want to access, and then click OK.
Use DataGrip to load the sample
diamonds table from the Databricks datasets into the
default database in your workspace and then query the table. For more information, see Create a table in Get started with Databricks as a data scientist. If you do not want to load a sample table, skip ahead to Next steps.
In DataGrip, in the Database window, with the default schema expanded, click File > New > SQL File.
Enter a name for the file, for example
In the file tab, enter these SQL statements, which deletes a table named
diamondsif it exists, and then creates a table named
diamondsbased on the contents of the CSV file within the specified Databricks File System (DBFS) mount point:
DROP TABLE IF EXISTS diamonds; CREATE TABLE diamonds USING CSV OPTIONS (path "/databricks-datasets/Rdatasets/data-001/csv/ggplot2/diamonds.csv", header "true");
On the file tab’s toolbar, click the Execute icon.
Select DROP TABLE IF EXISTS diamonds; CREATE TABLE diamon. .. from the drop-down list.
To change what happens when you click the Execute icon, select Customize in the drop-down list.
In the Database window, double-click the
diamondstable to see its data. If the
diamondstable is not displayed, click the Refresh button in the window’s toolbar.
To delete the
In DataGrip, in the Database window’s toolbar, click the Jump to Query Console button.
Select console (Default).
In the console tab, enter this SQL statement:
DROP TABLE diamonds;
On the console tab’s toolbar, click the Execute icon. The
diamondstable disappears from the list of tables. If the
diamondstable does not disappear, click the Refresh button in the Database window’s toolbar.