Databricks datasets

Databricks includes a variety of datasets mounted to the Databricks File System (DBFS). These datasets are used in examples throughout the documentation.

Note

In this release of Databricks on Google Cloud, these files are hosted on Amazon S3.

To browse these files from a notebook in Data Science & Engineering or Databricks Machine Learning, use Databricks Utilities (dbutils), which is available in Python, R, and Scala. The following Python example lists the top-level contents of the /databricks-datasets folder.

# List the top-level contents of the /databricks-datasets folder in DBFS.
display(dbutils.fs.ls("/databricks-datasets"))
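
The listing returned by dbutils.fs.ls can also be processed in code instead of being displayed. The following is a minimal sketch, assuming that each entry exposes name and size attributes and that folder names end with a slash. The summarize_listing helper, the FileInfo stand-in, and the sample entries are hypothetical and are included only so that the snippet runs outside a notebook.

```python
from collections import namedtuple

# Hypothetical stand-in for the entries returned by dbutils.fs.ls.
# Assumption: real entries expose at least path, name, and size attributes.
FileInfo = namedtuple("FileInfo", ["path", "name", "size"])


def summarize_listing(entries):
    """Split a directory listing into folder names and (file name, size) pairs."""
    # Assumption: folder entries have a name that ends with "/".
    folders = sorted(e.name.rstrip("/") for e in entries if e.name.endswith("/"))
    files = sorted((e.name, e.size) for e in entries if not e.name.endswith("/"))
    return folders, files


# Made-up sample entries; these are not the actual contents of /databricks-datasets.
sample = [
    FileInfo("dbfs:/databricks-datasets/other-folder/", "other-folder/", 0),
    FileInfo("dbfs:/databricks-datasets/example-file.md", "example-file.md", 120),
    FileInfo("dbfs:/databricks-datasets/example-folder/", "example-folder/", 0),
]

folders, files = summarize_listing(sample)
print("Folders:", folders)
print("Files:", files)
```

In a Databricks notebook, you could pass the real listing instead of the sample: summarize_listing(dbutils.fs.ls("/databricks-datasets")).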