MLflow for gen AI agent and ML model lifecycle

This article describes how MLflow on Databricks is used to develop high-quality generative AI agents and machine learning models.

Note

If you’re just getting started with Databricks, consider trying MLflow on Databricks Community Edition.

What is MLflow?

MLflow is an open source platform for developing models and generative AI applications. It has the following primary components:

Tracking: Allows you to track experiments to record and compare parameters and results.
Models: Allows you to manage and deploy models from various ML libraries to various model serving and inference platforms.
Model Registry: Allows you to manage the model deployment process from staging to production, with model versioning and annotation capabilities.
AI agent evaluation and tracing: Allows you to develop high-quality AI agents by helping you compare, evaluate, and troubleshoot agents.

MLflow supports Java, Python, R, and REST APIs.

Databricks-managed MLflow

Databricks provides a fully managed and hosted version of MLflow, building on the open source experience to make it more robust and scalable for enterprise use.

The following diagram shows how Databricks integrates with MLflow to train and deploy machine learning models.

MLflow integrates with Databricks to manage the ML lifecycle.

Databricks-managed MLflow is built on Unity Catalog and the Cloud Data Lake to unify all your data and AI assets in the ML lifecycle:

Feature store: Databricks automated feature lookups simplify integration and reduce mistakes.
Train models: Use Mosaic AI to train models or fine-tune foundation models.
Tracking: MLflow tracks training by logging parameters, metrics, and artifacts to evaluate and compare model performance.
Model Registry: MLflow Model Registry, integrated with Unity Catalog, centralizes AI models and artifacts.
Model Serving: Mosaic AI Model Serving deploys models to a REST API endpoint.
Monitoring: Mosaic AI Model Serving automatically captures requests and responses to monitor and debug models. MLflow augments this data with trace data for each request.

Model training

MLflow Models are at the core of AI and ML development on Databricks. MLflow Models are a standardized format for packaging machine learning models and generative AI agents. The standardized format ensures that models and agents can be used by downstream tools and workflows on Databricks.

See MLflow documentation for general information on Models.

Databricks provides features to help you train different kinds of ML models.

Train AI models using Mosaic AI.

Experiment tracking

Databricks uses MLflow experiments as organizational units to track your work while developing models.

Experiment tracking lets you log and manage parameters, metrics, artifacts, and code versions during machine learning training and agent development. Organizing logs into experiments and runs allows you to compare models, analyze performance, and iterate more easily.
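As a rough illustration of logging parameters and metrics into an experiment and run, the following minimal sketch uses the open source MLflow Tracking calls `mlflow.set_experiment`, `mlflow.start_run`, `mlflow.log_params`, and `mlflow.log_metrics`. The helper names `rmse` and `log_run`, the experiment path, and the parameter values are hypothetical and chosen only for this example; it assumes the `mlflow` package is installed and a tracking server (or the Databricks workspace) is reachable.

```python
def rmse(y_true, y_pred):
    """Root-mean-squared error for two equal-length, non-empty sequences."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and the same length")
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return (sum(squared_errors) / len(squared_errors)) ** 0.5


def log_run(experiment_name, params, metrics):
    """Log one run (hypothetical helper) and return its run ID.

    Assumes MLflow is installed; the import is local so that the metric
    helper above can be used without it.
    """
    import mlflow

    # Runs are grouped under the named experiment.
    mlflow.set_experiment(experiment_name)
    with mlflow.start_run() as run:
        mlflow.log_params(params)    # for example, hyperparameters
        mlflow.log_metrics(metrics)  # for example, evaluation results
        return run.info.run_id
```

A call such as `log_run("/Users/someone@example.com/demo", {"max_depth": 3}, {"rmse": rmse([0, 0], [3, 3])})` would record one run; logging several runs with different parameters under the same experiment is what makes side-by-side comparison possible.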

Experiment tracking using Databricks.
See MLflow documentation for general information on runs and experiment tracking.

Model Registry with Unity Catalog

MLflow Model Registry is a centralized model repository, UI, and set of APIs for managing the model deployment process.

Databricks integrates Model Registry with Unity Catalog to provide centralized governance for models. Unity Catalog integration allows you to access models across workspaces, track model lineage, and discover models for reuse.
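To make the Unity Catalog integration concrete, the sketch below registers a logged model under a three-level Unity Catalog name (`catalog.schema.model`). It relies on `mlflow.set_registry_uri("databricks-uc")` and `mlflow.register_model`; the helper names `uc_model_name` and `register_in_unity_catalog`, and the catalog, schema, and model names in the usage note, are hypothetical. It assumes `mlflow` is installed and the caller has the required Unity Catalog privileges.

```python
def uc_model_name(catalog, schema, model):
    """Build a three-level Unity Catalog model name (hypothetical helper)."""
    parts = (catalog, schema, model)
    for part in parts:
        # Each level must be non-empty and must not itself contain a dot.
        if not part or "." in part:
            raise ValueError(f"invalid name component: {part!r}")
    return ".".join(parts)


def register_in_unity_catalog(model_uri, catalog, schema, model):
    """Register a logged model in Unity Catalog and return the new model version.

    Assumes MLflow is installed and the workspace is Unity Catalog enabled.
    """
    import mlflow

    # Point the MLflow client at the Unity Catalog model registry.
    mlflow.set_registry_uri("databricks-uc")
    return mlflow.register_model(model_uri, uc_model_name(catalog, schema, model))
```

For example, `register_in_unity_catalog("runs:/<run_id>/model", "main", "ml", "churn")` would create a new version of `main.ml.churn`, where `<run_id>` stands for the ID of the run that logged the model.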

Manage models using Databricks Unity Catalog.
See MLflow documentation for general information on Model Registry.

Model Serving

Databricks Model Serving is tightly integrated with MLflow Model Registry and provides a unified, scalable interface for deploying, governing, and querying AI models. Each model you serve is available as a REST API that you can integrate into web or client applications.

Although Model Serving and MLflow Model Registry are distinct components, Model Serving relies heavily on Model Registry to handle model versioning, dependency management, validation, and governance.
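Because each served model is exposed as a REST API, a client application can query it with an ordinary HTTP request. The following sketch only builds the request with the Python standard library and does not send it. It assumes the `/serving-endpoints/<endpoint-name>/invocations` path and the `dataframe_records` payload format; the helper name `build_invocation_request`, the workspace URL, endpoint name, token, and input fields are hypothetical placeholders.

```python
import json
import urllib.request


def build_invocation_request(workspace_url, endpoint_name, token, records):
    """Build (but do not send) a POST request to a serving endpoint.

    `records` is a list of dicts, one per input row.
    """
    url = f"{workspace_url.rstrip('/')}/serving-endpoints/{endpoint_name}/invocations"
    body = json.dumps({"dataframe_records": records}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={
            "Authorization": f"Bearer {token}",  # access token for the workspace
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would send the request; the response body is JSON whose exact shape depends on the served model.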

Model Serving using Databricks.

Open source vs. Databricks-managed MLflow features

For general MLflow concepts, APIs, and features shared between open source and Databricks-managed versions, refer to MLflow documentation. For features exclusive to Databricks-managed MLflow, see Databricks documentation.

The following table highlights the key differences between open source MLflow and Databricks-managed MLflow and provides documentation links to help you learn more:

Feature	Availability on open source MLflow	Availability on Databricks-managed MLflow
Security	User must provide their own security governance layer	Databricks enterprise-grade security
Disaster recovery	Unavailable	Databricks disaster recovery
Experiment tracking	MLflow Tracking API	MLflow Tracking API integrated with Databricks advanced experiment tracking
Model Registry	MLflow Model Registry	MLflow Model Registry integrated with Databricks Unity Catalog
Unity Catalog integration	Open source integration with Unity Catalog	Databricks Unity Catalog
Model deployment	User-configured integrations with external serving solutions (SageMaker, Kubernetes, any container service, etc.)	Databricks Model Serving and external serving solutions
AI agents	MLflow LLM development	MLflow LLM development
Encryption	Unavailable	Encryption using customer-managed keys