Get started with machine learning in Databricks

Note

The managed MLflow integration with Databricks on Google Cloud requires Databricks Runtime for Machine Learning 8.1 or above.

This notebook provides a quick overview of machine learning model training on Databricks. To train models, you can use libraries like scikit-learn that are preinstalled in Databricks Runtime ML. In addition, you can use MLflow to track the trained models, and Hyperopt with SparkTrials to scale hyperparameter tuning.

In this tutorial, you train a simple classification model, using MLflow to track model development and Hyperopt to improve the model’s performance. For more details on productionizing machine learning on Databricks, including model lifecycle management and model inference, see the ML end-to-end example.

For additional example notebooks to get started quickly on Databricks, see 10-minute tutorials: Get started with machine learning on Databricks.

Requirements

Databricks Runtime 7.5 ML or above.

Note

If you do not have access to Databricks Runtime 7.5 ML or above, try Get started with scikit-learn in Databricks or End-to-end example of building machine learning models on Databricks.

Example notebook

Machine learning quickstart notebook
