Get started with scikit-learn in Databricks

Note

The managed MLflow integration with Databricks on Google Cloud requires Databricks Runtime for Machine Learning 8.1 or above.

This 10-minute tutorial introduces machine learning in Databricks. It trains models with algorithms from the popular machine learning package scikit-learn, tracks the model development process with MLflow, and automates hyperparameter tuning with Hyperopt.

Requirements

Databricks Runtime ML

Example notebooks

If you are using Databricks Runtime ML, Databricks recommends MLflow autologging, as illustrated in the following notebook.

Get started with scikit-learn and MLflow autologging notebook


The following notebook, also intended for Databricks Runtime ML, uses manual MLflow logging to track model development.

Get started with scikit-learn notebook
