Databricks Data Science & Engineering guide

Databricks Data Science & Engineering is the classic Databricks environment for collaboration among data scientists, data engineers, and data analysts. It also forms the backbone of the Databricks Machine Learning environment.

Note

If you are a data analyst who works primarily with SQL queries and BI tools, you may prefer the Databricks SQL persona-based environment.

The Databricks Data Science & Engineering guide provides how-to guidance to help you get the most out of the Databricks collaborative analytics platform. For getting-started tutorials and introductory information, see Get started: Free trial & setup and What is Databricks?.

  • Delta Live Tables

    Learn how to build data pipelines for ingestion and transformation with Databricks Delta Live Tables.

  • Structured Streaming

    Learn how to use Apache Spark Structured Streaming to express computation on streaming data in Databricks.

  • Apache Spark

    Learn how Apache Spark works on Databricks and the Databricks Lakehouse Platform.

  • Runtimes

    Learn about the types of Databricks runtimes and runtime contents.

  • Clusters

    Learn about Databricks clusters and how to create and manage them.

  • Notebooks

    Learn what a Databricks notebook is, and how to use and manage notebooks to process, analyze, and visualize your data.

  • Workflows

    Learn how to orchestrate data processing, machine learning, and data analysis workflows on the Databricks Lakehouse Platform.

  • Storage

    Learn how Databricks uses cloud object storage and block storage volumes for persistent and ephemeral data storage.

  • Libraries

    Learn how to make third-party or custom code available in Databricks using libraries, and about the different modes for installing libraries on Databricks.

  • Repos

    Learn how to use Git to version control your notebooks and other files for development in Databricks.

  • DBFS

    Learn about the Databricks File System (DBFS), a distributed file system mounted into a Databricks workspace and available on Databricks clusters.

  • Files

    Learn about options for working with files on Databricks.

  • Migration

    Learn how to migrate data applications such as ETL jobs, enterprise data warehouses, ML, data science, and analytics to Databricks.

  • Optimization & performance

    Learn about optimizations and performance recommendations on Databricks.