Train Spark ML models on Databricks Connect with `pyspark.ml.connect`


This feature is in Public Preview.

This article provides an example that demonstrates how to use the `pyspark.ml.connect` module to run distributed training of Spark ML models and run model inference on Databricks Connect.

What is `pyspark.ml.connect`?

Spark 3.5 introduces `pyspark.ml.connect`, which is designed to support Spark Connect mode and Databricks Connect. Learn more about Databricks Connect.

The `pyspark.ml.connect` module consists of common learning algorithms and utilities, including classification, feature transformers, ML pipelines, and cross validation. It provides interfaces similar to the legacy `pyspark.ml` module, but currently contains only a subset of the algorithms in `pyspark.ml`. The supported algorithms are listed below:

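  • Classification algorithm: `pyspark.ml.connect.classification.LogisticRegression`

  • Feature transformers: `pyspark.ml.connect.feature.MaxAbsScaler` and `pyspark.ml.connect.feature.StandardScaler`

  • Evaluators: `pyspark.ml.connect.evaluation.RegressionEvaluator`, `pyspark.ml.connect.evaluation.BinaryClassificationEvaluator`, and `MulticlassClassificationEvaluator`

  • Pipeline: `pyspark.ml.connect.pipeline.Pipeline`

  • Model tuning: `pyspark.ml.connect.tuning.CrossValidator`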
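
As a rough illustration of how these pieces fit together, the following sketch chains a feature transformer and the classifier into a pipeline, fits it, and runs inference. It is not taken from the example notebook: the toy rows, the column names (`label`, `features`, `scaled_features`), the hyperparameter values, and the helper names `build_pipeline` and `train_and_predict` are all assumptions chosen for illustration. It also assumes an environment with PySpark 3.5 or later and the module's Python dependencies installed.

```python
# Toy rows of (label, features); hypothetical data, repeated to give the
# trainer more than a handful of examples.
TRAIN_ROWS = [
    (1.0, [0.0, 5.0]),
    (0.0, [1.0, 2.0]),
    (1.0, [2.0, 1.0]),
    (0.0, [3.0, 3.0]),
] * 10


def build_pipeline():
    """Return an unfitted StandardScaler -> LogisticRegression pipeline."""
    # Imported lazily so this file can be loaded without PySpark installed.
    from pyspark.ml.connect.classification import LogisticRegression
    from pyspark.ml.connect.feature import StandardScaler
    from pyspark.ml.connect.pipeline import Pipeline

    # Column names and hyperparameters below are illustrative assumptions.
    scaler = StandardScaler(inputCol="features", outputCol="scaled_features")
    classifier = LogisticRegression(
        featuresCol="scaled_features", maxIter=20, learningRate=0.01
    )
    return Pipeline(stages=[scaler, classifier])


def train_and_predict(spark):
    """Fit the pipeline on the toy rows and return a DataFrame of predictions."""
    dataset = spark.createDataFrame(TRAIN_ROWS, schema=["label", "features"])
    model = build_pipeline().fit(dataset)
    return model.transform(dataset)
```

With Databricks Connect configured, one way to try this is to create a session with `DatabricksSession.builder.getOrCreate()` from `databricks.connect`, pass it to `train_and_predict`, and call `.show()` on the result.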


Example notebook

The following notebook demonstrates how to use Distributed ML on Databricks Connect:

Distributed ML on Databricks Connect


For reference information about the APIs in `pyspark.ml.connect`, Databricks recommends the Apache Spark API reference.