Developer tools

Databricks provides an ecosystem of tools to help you develop applications and solutions that integrate with Databricks and programmatically manage Databricks resources and data.

This article gives an overview of these tools and recommends the best tool for common developer scenarios.

What tools does Databricks provide for developers?

The following table lists the developer tools that Databricks provides.

Tool

Description

Authentication and authorization

Configure authentication and authorization for your tools, scripts, and apps to work with Databricks.

Databricks Connect

Connect to Databricks using popular integrated development environments (IDEs) such as PyCharm, IntelliJ IDEA, Eclipse, RStudio, and JupyterLab.

If you use Visual Studio Code, Databricks recommends the Databricks extension for Visual Studio Code. The extension is built on top of Databricks Connect and adds features that simplify configuration.

Databricks extension for Visual Studio Code

Connect to your remote Databricks workspaces from the Visual Studio Code integrated development environment (IDE).

PyCharm Databricks plugin

Configure a connection to a remote Databricks workspace and run files on Databricks clusters from PyCharm. This plugin is developed and provided by JetBrains in partnership with Databricks.

Databricks SDKs

Automate Databricks from code libraries written for popular languages such as Python, Java, Go, and R. Instead of sending REST API calls directly using curl or Postman, you can use an SDK to interact with Databricks in the programming language of your choice.

SQL drivers and tools

Connect to Databricks to run SQL commands and scripts, interact programmatically with Databricks, and integrate Databricks SQL functionality into applications written in popular languages such as Python, Go, JavaScript, and TypeScript.

Databricks CLI

Access Databricks functionality using the Databricks command-line interface (CLI). The CLI wraps the Databricks REST API, so instead of sending REST API calls directly using curl or Postman, you can use the Databricks CLI to interact with Databricks.

Databricks Asset Bundles

Implement industry-standard development, testing, and deployment (CI/CD) best practices for your Databricks data and AI projects using Databricks Asset Bundles (DABs).

Databricks Terraform provider and Terraform CDKTF for Databricks

Provision Databricks infrastructure and resources using Terraform.

CI/CD tools

Integrate with popular CI/CD systems and frameworks such as GitHub Actions, Jenkins, and Apache Airflow.

Tip

You can also connect many popular third-party tools to clusters and SQL warehouses to access data in Databricks. See Technology partners.
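Several of the tools in the table above can read connection settings from a local configuration profile, which is why authentication is listed first. The following is a minimal sketch of that idea, assuming an INI-style profile file with `host` and `token` fields. It uses only the Python standard library. The profile name `dev`, the workspace URL, and the token value are placeholders, and `load_profile` is a hypothetical helper, not part of any Databricks library.

```python
import configparser

# Illustrative profile file contents. The profile name, host, and token
# are placeholders, not real values.
SAMPLE_CFG = """
[dev]
host = https://example.cloud.databricks.com
token = dapi-placeholder-token
"""


def load_profile(cfg_text: str, profile: str) -> dict:
    """Return the host and token for one named profile (hypothetical helper)."""
    parser = configparser.ConfigParser()
    parser.read_string(cfg_text)
    if not parser.has_section(profile):
        raise KeyError(f"profile not found: {profile}")
    section = parser[profile]
    # Strip a trailing slash so the host can be joined with API paths later.
    return {"host": section["host"].rstrip("/"), "token": section["token"]}


profile = load_profile(SAMPLE_CFG, "dev")
print(profile["host"])
```

In practice, prefer letting the tool you chose resolve the profile itself rather than parsing the file by hand. The sketch only shows what kind of information a profile carries.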

Which developer tool should I use?

The following table outlines Databricks tool recommendations for common developer scenarios.

Scenarios

Recommendation

  • Interactive development and debugging from a local IDE

Databricks extension for Visual Studio Code

PyCharm Databricks plugin

For other IDEs, use the Databricks CLI with Databricks Connect

  • Direct interaction with Databricks from the command line

  • Shell scripting

  • Experimentation

  • Invoke the REST API directly

  • Manage local authentication profiles

  • Sync code from the IDE to the Databricks workspace

Databricks CLI

  • Manage workflows and deploy projects to Databricks

  • Apply CI/CD best practices

  • Co-version, co-author, and co-deploy your resources and assets as one unit

  • Supports the most common resources

Databricks Asset Bundles (a feature of the CLI)

  • Infrastructure as code, CI/CD

  • Create and administer workspaces, catalogs, and metastores, and enforce permissions

  • Guarantee environment portability and disaster recovery

  • Many supported resources

Databricks Terraform provider

  • Application development

  • Integrate with existing deployment systems

  • Create custom Databricks workflows and new web services

Databricks Python SDK

Databricks Java SDK

Databricks Go SDK

Databricks R SDK

  • Advanced scenarios only

  • Almost all Databricks resources are available

Databricks REST API
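To illustrate why calling the REST API directly is reserved for advanced scenarios, the following sketch shows the plumbing that the CLI and SDKs otherwise handle for you. It builds, but does not send, a request to the clusters list endpoint (`/api/2.0/clusters/list`), assuming personal access token authentication through a Bearer header. The workspace URL and token are placeholders, and `build_list_clusters_request` is a hypothetical helper name.

```python
import urllib.request


def build_list_clusters_request(host: str, token: str) -> urllib.request.Request:
    """Build (but do not send) a GET request for the clusters list endpoint."""
    url = f"{host.rstrip('/')}/api/2.0/clusters/list"
    return urllib.request.Request(
        url,
        headers={
            "Authorization": f"Bearer {token}",  # assumes token-based auth
            "Accept": "application/json",
        },
        method="GET",
    )


# Placeholder workspace URL and token; replace with real values to use it.
request = build_list_clusters_request(
    "https://example.cloud.databricks.com", "dapi-placeholder-token"
)
print(request.get_method(), request.full_url)

# Sending is left commented out because it needs a real workspace and token:
# import json
# with urllib.request.urlopen(request) as response:
#     clusters = json.load(response).get("clusters", [])
```

With direct calls like this, authentication, URL construction, and response handling (and anything beyond them, such as retries) are yours to write and maintain. That is the trade-off behind recommending the CLI, SDKs, or Asset Bundles for most scenarios.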