This article describes how to deploy MLflow models for offline (batch and streaming) inference. Databricks recommends that you use MLflow to deploy machine learning models for batch or streaming inference. For general information about working with MLflow models, see Log, load, register, and deploy MLflow models.
MLflow helps you generate code for batch or streaming inference.
In the MLflow Run page for your model, you can copy the generated code snippet for inference on pandas or Apache Spark DataFrames.
You can also customize the generated code. See the following notebooks for examples:
The model inference example uses a model trained with scikit-learn and previously logged to MLflow to show how to load a model and use it to make predictions on data in different formats. The notebook illustrates how to apply the model as a scikit-learn model to a pandas DataFrame, and how to apply the model as a PySpark UDF to a Spark DataFrame.
The MLflow Model Registry example shows how to build, manage, and deploy a model with Model Registry. On that page, you can search for .predict to identify examples of offline (batch) predictions.
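As a rough illustration of the first pattern (loading a previously logged scikit-learn model and applying it to a pandas DataFrame), the sketch below uses `mlflow.sklearn.load_model`. The helper names `run_model_uri` and `score_pandas`, and the artifact path `"model"`, are assumptions made for this example and are not taken from the notebooks.

```python
def run_model_uri(run_id: str, artifact_path: str = "model") -> str:
    """Build a runs:/ URI for a model logged under `artifact_path` in an MLflow run.

    The default artifact path "model" is an assumption; use the path
    that was passed when the model was logged.
    """
    return f"runs:/{run_id}/{artifact_path}"


def score_pandas(model_uri: str, pdf):
    """Load a logged scikit-learn model and score a pandas DataFrame."""
    # Imported lazily so the URI helper above works without MLflow installed.
    import mlflow.sklearn

    # Returns the native scikit-learn estimator that was logged.
    model = mlflow.sklearn.load_model(model_uri)
    return model.predict(pdf)
```

A call such as `score_pandas(run_model_uri("<run-id>"), features_pdf)` would return the estimator's predictions, where the run ID and `features_pdf` are placeholders for your own run and data.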
To run batch or streaming predictions as a job, create a notebook or JAR that includes the code used to perform the predictions. Then, execute the notebook or JAR as a Databricks job. Jobs can be run either immediately or on a schedule.
For information about and examples of deep learning model inference on Databricks, see the following articles:
For scalable model inference with MLlib and XGBoost4J models, use the native transform methods to perform inference directly on Spark DataFrames. The MLlib example notebooks include inference steps.
When you use the MLflow APIs to run inference on Spark DataFrames, you can load the model as a Spark UDF and apply it at scale using distributed computing.
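A minimal sketch of this approach, assuming a model that returns one numeric prediction per row, might look like the following. It relies on `mlflow.pyfunc.spark_udf`; the helper names `select_feature_columns` and `add_predictions`, the `"double"` result type, and the `"prediction"` column name are illustrative choices rather than anything prescribed by MLflow.

```python
def select_feature_columns(all_columns, exclude=()):
    """Return the columns to feed to the model, preserving their order."""
    excluded = set(exclude)
    return [c for c in all_columns if c not in excluded]


def add_predictions(spark, df, model_uri, feature_cols, output_col="prediction"):
    """Append a prediction column to a Spark DataFrame using an MLflow model."""
    # Imported lazily so the helper above can be used without Spark or MLflow.
    import mlflow.pyfunc
    from pyspark.sql.functions import col, struct

    # Wrap the logged model as a Spark UDF.
    # "double" assumes the model returns one numeric value per row.
    predict_udf = mlflow.pyfunc.spark_udf(
        spark, model_uri=model_uri, result_type="double"
    )

    # Pass the feature columns to the UDF as a single struct column.
    features = struct(*[col(c) for c in feature_cols])
    return df.withColumn(output_col, predict_udf(features))
```

For example, `add_predictions(spark, df, model_uri, select_feature_columns(df.columns, exclude=["id", "label"]))` would score every column except the hypothetical `id` and `label` columns. Because the UDF is applied as an ordinary column expression, the same function should also work on a streaming DataFrame.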
You can customize your model to add pre-processing or post-processing and to optimize computational performance for large models. A good option for customizing models is the MLflow pyfunc API, which allows you to wrap a model with custom logic.
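One way to picture this is a thin wrapper that subclasses `mlflow.pyfunc.PythonModel` and adds a post-processing step around an existing model. The sketch below is only an illustration: `threshold_labels`, `build_wrapped_model`, and the thresholding logic are hypothetical, and it assumes the inner model's `predict` method returns numeric scores.

```python
def threshold_labels(scores, threshold=0.5):
    """Post-processing step: convert numeric scores into 0/1 labels."""
    return [int(s >= threshold) for s in scores]


def build_wrapped_model(inner_model, threshold=0.5):
    """Return a pyfunc model that adds post-processing to `inner_model`.

    Assumes `inner_model.predict(model_input)` returns numeric scores.
    """
    # Imported lazily so threshold_labels can be used without MLflow installed.
    import mlflow.pyfunc

    class ThresholdedModel(mlflow.pyfunc.PythonModel):
        def predict(self, context, model_input, params=None):
            # Run the wrapped model, then apply the custom post-processing.
            scores = inner_model.predict(model_input)
            return threshold_labels(scores, threshold)

    return ThresholdedModel()
```

The object returned by `build_wrapped_model` could then be passed as the `python_model` argument of `mlflow.pyfunc.log_model`, so that the custom logic travels with the model when it is later loaded for batch or streaming inference.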
For smaller datasets, you can also use the native model inference routines provided by the library that was used to train the model.