What are workspace files?
Support for workspace files is in Public Preview. Files in Repos is generally available.
A workspace file is any file in the Databricks workspace that is not a Databricks notebook. Workspace files can be any file type. Common examples include:
.py files used in custom modules.
.md files.
.csv or other small data files.
Databricks provides functionality similar to local development for many workspace file types, including a built-in file editor. Not all use cases for all file types are supported. For example, while you can include images in an imported directory or repository, you cannot embed images in notebooks.
You can create, edit, and manage access to workspace files using familiar patterns from notebook interactions. You can use relative paths for library imports from workspace files, similar to local development.
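As a rough illustration of importing a custom module from a file that sits next to your code, the following sketch builds a throwaway directory, writes a small .py file into it, and imports it. The module name my_helpers_demo, the function double, and the temporary directory are hypothetical stand-ins for real workspace files. The explicit sys.path step is an assumption made so the sketch runs anywhere; on Databricks the relevant directory may already be on the import path.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical stand-in for a directory of workspace files.
project_dir = Path(tempfile.mkdtemp())

# A small custom module, as you might keep alongside a notebook.
(project_dir / "my_helpers_demo.py").write_text(
    "def double(x):\n"
    "    return 2 * x\n"
)

# Assumption: add the directory to the import path explicitly so the
# example is self-contained outside Databricks.
sys.path.insert(0, str(project_dir))
importlib.invalidate_caches()

# Import the module by name, exactly as in local development.
my_helpers_demo = importlib.import_module("my_helpers_demo")
result = my_helpers_demo.double(21)
print(result)
```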
Workspace files are enabled everywhere by default for Databricks Runtime 11.2 and above. Files in Repos is enabled by default in Databricks Runtime 11.0 and above, and can be manually disabled or enabled. See Configure support for Files in Repos.
In Databricks Runtime 8.4 and above, you can sync, import, and read non-notebook files within a Databricks repo. You can also view and edit files in the Databricks UI.
In Databricks Runtime 11.2 and above, you can programmatically write or delete workspace files within a Databricks repo.
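A minimal sketch of writing and then deleting a file programmatically, assuming ordinary Python file APIs (open and os.remove) are used for workspace files. The helper write_and_delete, the file name, and the temporary directory are hypothetical; in a notebook you would point the directory at a location inside your repo instead.

```python
import os
import tempfile


def write_and_delete(directory, name="example_output.txt"):
    """Write a small file, confirm it exists, then delete it.

    Returns a tuple: (existed after write, exists after delete).
    """
    path = os.path.join(directory, name)

    # Create the file with standard Python file I/O.
    with open(path, "w") as handle:
        handle.write("written from a notebook\n")
    existed_after_write = os.path.exists(path)

    # Remove the file again.
    os.remove(path)
    return existed_after_write, os.path.exists(path)


# Hypothetical target directory; a temp dir keeps the sketch runnable anywhere.
demo_dir = tempfile.mkdtemp()
created, still_there = write_and_delete(demo_dir)
print(created, still_there)
```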
Enabling Files in Repos changes the current working directory for driver operations to the directory containing the notebook that is executing code. Notebooks outside of a repo behave differently when interacting with workspace files: their current working directory defaults to the driver block storage volume. See How to work with files on Databricks.
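Because the current working directory differs between these two cases, it can help to check where a relative path actually resolves before reading or writing. The sketch below uses only the Python standard library; the relative path data/sample.csv is a hypothetical example and does not need to exist.

```python
import os
from pathlib import Path

# The directory that relative paths are resolved against.
cwd = Path(os.getcwd())
print(f"Current working directory: {cwd}")

# Hypothetical relative path to a small data file.
relative_path = Path("data") / "sample.csv"

# Show the absolute location this relative path refers to.
resolved = (cwd / relative_path).resolve()
print(f"{relative_path} resolves to {resolved}")
```

If the resolved location is not the one you expect, building paths from an explicit base directory avoids depending on the working directory.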
Configure support for Files in Repos
To work with non-notebook files in Databricks Repos, you must be running Databricks Runtime 8.4 or above. You must be running Databricks Runtime 11.2 or above to programmatically create or delete workspace files.
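The version requirements above can be expressed as a small check. This is a sketch under stated assumptions: the helper names are hypothetical, and reading the runtime version from a DATABRICKS_RUNTIME_VERSION environment variable is an assumption about the cluster environment rather than something this page specifies. Only the two thresholds (8.4 and 11.2) come from the text.

```python
import os
import re

# Thresholds taken from the requirements stated above.
MIN_REPO_FILES = (8, 4)         # work with non-notebook files in Repos
MIN_PROGRAMMATIC_WRITE = (11, 2)  # programmatically create or delete files


def parse_runtime(version):
    """Extract (major, minor) from a version string such as '11.2'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"Unrecognized runtime version: {version!r}")
    return int(match.group(1)), int(match.group(2))


def can_use_repo_files(version):
    return parse_runtime(version) >= MIN_REPO_FILES


def can_write_workspace_files(version):
    return parse_runtime(version) >= MIN_PROGRAMMATIC_WRITE


# Assumption: the cluster exposes its runtime version in this variable.
current = os.environ.get("DATABRICKS_RUNTIME_VERSION")
if current:
    print("Repo files supported:", can_use_repo_files(current))
    print("Programmatic writes supported:", can_write_workspace_files(current))
```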
If support for Files in Repos is not enabled, you still see non-notebook files in a Databricks repo, but you cannot work with them.
An admin can configure this feature as follows:
Go to the Admin Console.
Click the Workspace settings tab.
In the Repos section, select an option from the Files in Repos dropdown.
To ensure all configurations have been applied, you must refresh your browser and restart your compute cluster.
When you enable Files in Repos for the first time, you might need to open the Git dialog and perform a pull operation to sync non-notebook files in the repo. If there are any merge conflicts, a dialog appears giving you the option to either discard your conflicting changes or push your changes to a new branch.