Databricks Data Science & Engineering guide

Databricks Data Science & Engineering is the classic Databricks environment for collaboration among data scientists, data engineers, and data analysts. It also forms the backbone of the Databricks Machine Learning environment.


If you are a data analyst who works primarily with SQL queries and BI tools, you may prefer the Databricks SQL persona-based environment.

The Databricks Data Science & Engineering guide provides how-to guidance to help you get the most out of the Databricks collaborative analytics platform. For getting-started tutorials and introductory information, see Get started with Databricks and Introduction to Databricks.

  • Navigate the workspace

    Learn how to navigate a Databricks workspace and access the assets available in the workspace.

  • Databricks runtimes

    Learn about the types of Databricks runtimes and runtime contents.

  • Clusters

    Learn about Databricks clusters and how to create and manage them.

  • Notebooks

    Learn how to manage and use notebooks in Databricks.

  • Jobs

    Learn how to view, create, and run jobs in Databricks.

  • Task orchestration and pipelines

    Learn how to orchestrate tasks and build data processing pipelines in Databricks.

  • Libraries

    Learn how to use and manage libraries in Databricks.

  • Repos for Git integration

    Learn how to manage Databricks notebooks and workspace subfolders as co-versioned repos using Git.

  • Databricks File System (DBFS)

    Learn about the Databricks File System (DBFS), a distributed file system mounted into a Databricks workspace and available on Databricks clusters.

  • Delta Lake

    Learn about the Delta Lake storage layer and optimizations available with Delta Lake on Databricks.

  • Applications: Genomics

    Learn how to work with genomic data using Databricks and Glow.