Mosaic AI Gateway

Preview

This feature is in Public Preview.

This article describes Mosaic AI Gateway, the Databricks solution for governing and monitoring access to supported generative AI models and their associated model serving endpoints.

What is Mosaic AI Gateway?

Mosaic AI Gateway is a centralized service for using and managing generative AI models within an organization. It brings governance, monitoring, and production readiness to model serving endpoints, and lets you run, secure, and govern AI traffic across your organization.

All data that AI Gateway collects is logged to Delta tables in Unity Catalog.

To visualize your AI Gateway data, download the example AI Gateway dashboard from GitHub. This dashboard uses the data from the usage tracking and payload logging inference tables.

After you download the JSON file, import the dashboard into your workspace. For instructions on importing dashboards, see Import a dashboard file.

AI Gateway supports the following features:

  • Permissions and rate limiting to control who can query an endpoint and how much they can use it.

  • Payload logging to monitor and audit data being sent to model APIs using inference tables.

  • Usage tracking to monitor operational usage on endpoints and associated costs using system tables.

  • Traffic routing to minimize production outages during and after deployment.
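To make the rate limiting feature above more concrete, the following is a minimal conceptual sketch of a per-key fixed-window limiter. It is not how AI Gateway is implemented. The class name, the parameters, and the choice of a fixed window are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class FixedWindowRateLimiter:
    """Allow at most `calls` requests per key in each `window_seconds` window.

    Hypothetical illustration only; not the AI Gateway implementation.
    """

    calls: int
    window_seconds: float = 60.0
    # Maps key -> (window index, number of requests admitted in that window).
    _state: dict = field(default_factory=dict)

    def allow(self, key: str, now: float) -> bool:
        """Return True if a request from `key` at time `now` is admitted."""
        window = int(now // self.window_seconds)
        seen_window, count = self._state.get(key, (window, 0))
        if seen_window != window:
            count = 0  # a new window started, so the counter resets
        if count >= self.calls:
            self._state[key] = (window, count)
            return False
        self._state[key] = (window, count + 1)
        return True


limiter = FixedWindowRateLimiter(calls=2, window_seconds=60.0)
print(limiter.allow("alice", now=0.0))   # True
print(limiter.allow("alice", now=1.0))   # True
print(limiter.allow("alice", now=2.0))   # False, limit reached in this window
print(limiter.allow("bob", now=2.0))     # True, limits are tracked per key
print(limiter.allow("alice", now=61.0))  # True, a new window has started
```

Keying the counter by user gives each caller an independent quota. Using one shared key for every request would instead cap the endpoint as a whole.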

Mosaic AI Gateway charges are based on which features you enable. During the preview, the paid features are payload logging and usage tracking. Query permissions, rate limiting, and traffic routing are free of charge. Any new features are subject to charge.

Use AI Gateway

You can configure AI Gateway features on your model serving endpoints using the Serving UI. See Configure AI Gateway on model serving endpoints.
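This article only documents configuration through the Serving UI. As a rough sketch of what an equivalent settings document might look like, the example below assembles the features listed earlier into a plain dictionary. The function name, every field name, and the catalog and schema values are assumptions for illustration, so check the endpoint configuration reference before relying on any of them.

```python
import json


def build_ai_gateway_config(catalog: str, schema: str, calls_per_minute: int) -> dict:
    """Assemble a hypothetical AI Gateway settings document.

    All field names below are assumptions, not a confirmed API schema.
    """
    if calls_per_minute <= 0:
        raise ValueError("calls_per_minute must be positive")
    return {
        # Usage tracking: operational usage and cost data in system tables.
        "usage_tracking_config": {"enabled": True},
        # Payload logging: requests and responses stored in inference tables
        # under the given Unity Catalog catalog and schema.
        "inference_table_config": {
            "enabled": True,
            "catalog_name": catalog,
            "schema_name": schema,
        },
        # Rate limiting: a per-user cap on calls per minute.
        "rate_limits": [
            {"calls": calls_per_minute, "key": "user", "renewal_period": "minute"}
        ],
    }


# "main" and "ai_gateway_logs" are placeholder names.
config = build_ai_gateway_config("main", "ai_gateway_logs", calls_per_minute=100)
print(json.dumps(config, indent=2))
```

Keeping the settings in one reviewable document makes it easier to see which paid features (payload logging and usage tracking) are turned on for an endpoint.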

Limitations

AI Gateway is only supported for model serving endpoints that serve external models.