Generative AI models maintenance policy

This article describes the model maintenance policy for the Foundation Model APIs pay-per-token offering.

To continue supporting state-of-the-art models, Databricks might update supported models or retire older models in the Foundation Model APIs pay-per-token offering.

Model retirement policy

The following retirement policy applies only to supported chat and completion models in the Foundation Model APIs pay-per-token offering.

When a model is retired, it is no longer available for use and is removed from the indicated feature offerings. Databricks takes the following steps to notify customers about a model that is set for retirement:

  • A warning message is displayed on the model card on the Serving page of your Databricks workspace, indicating that the model is planned for retirement.

  • The applicable documentation contains a notice that indicates the model is planned for retirement and the date on which it will no longer be supported.

After customers are notified of an upcoming retirement, Databricks retires the model three months later. During this three-month period, customers can either:

  • Migrate to a provisioned throughput endpoint to continue using the model past its retirement date, or

  • Migrate existing workflows to use a recommended replacement model.

On the retirement date, the model is removed from the product, and applicable documentation is updated to recommend using a replacement model.
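
Migrating an existing workflow typically amounts to pointing it at a different serving endpoint name, so it helps to keep that name in a single configuration value. The following is a minimal sketch using the OpenAI-compatible Python client against a Databricks serving endpoint; the workspace URL, token, and endpoint name are placeholders, not values from this article.

```python
from openai import OpenAI

# Placeholders: substitute your workspace URL, access token, and endpoint name.
DATABRICKS_BASE_URL = "https://<your-workspace>.cloud.databricks.com/serving-endpoints"
DATABRICKS_TOKEN = "<your-personal-access-token>"

# Keeping the endpoint name in one place makes a retirement migration a
# configuration change, for example swapping a pay-per-token endpoint for a
# provisioned throughput endpoint or a recommended replacement model.
SERVING_ENDPOINT = "databricks-meta-llama-3-3-70b-instruct"  # hypothetical endpoint name

client = OpenAI(base_url=DATABRICKS_BASE_URL, api_key=DATABRICKS_TOKEN)

response = client.chat.completions.create(
    model=SERVING_ENDPOINT,
    messages=[{"role": "user", "content": "Summarize the model retirement policy."}],
    max_tokens=256,
)
print(response.choices[0].message.content)
```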

Model updates

Databricks might ship incremental updates to pay-per-token models to deliver optimizations. When a model is updated, the endpoint URL remains the same, but the model ID in the response object changes to reflect the date of the update. For example, if an update is shipped to meta-llama/Meta-Llama-3.3-70B on 3/4/2024, the model name in the response object becomes meta-llama/Meta-Llama-3.3-70B-030424. Databricks maintains a version history of these updates that you can refer to.
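
Because the endpoint URL stays stable and only the response carries the dated model ID, one way to track which model version served a request is to log the response's model field. A minimal sketch, again using the OpenAI-compatible client with placeholder workspace, token, and endpoint values:

```python
from openai import OpenAI

# Placeholders: workspace URL, token, and endpoint name are assumptions.
client = OpenAI(
    base_url="https://<your-workspace>.cloud.databricks.com/serving-endpoints",
    api_key="<your-personal-access-token>",
)

response = client.chat.completions.create(
    model="databricks-meta-llama-3-3-70b-instruct",  # endpoint name is unchanged across updates
    messages=[{"role": "user", "content": "Hello"}],
)

# The response's model field carries the dated model ID, for example
# meta-llama/Meta-Llama-3.3-70B-030424 after an update shipped on 3/4/2024.
print(response.model)
```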