Preview

This feature is in Public Preview and is supported in us-east1 and us-central1.

Foundation model REST API reference

This article provides general API information for Databricks Foundation Model APIs and the models they support. The Foundation Model APIs are designed to be similar to OpenAI’s REST API to make migrating existing projects easier.

Endpoints

Provisioned throughput endpoints can be created using the API or the Serving UI. These endpoints also support multiple models per endpoint for A/B testing, as long as both served models expose the same API format (for example, both are embedding models).

Usage

Responses include a usage sub-message which reports the number of tokens in the request and response. The format of this sub-message is the same across all task types.

| Field | Type | Description |
| --- | --- | --- |
| completion_tokens | Integer | Number of generated tokens. Not included in embedding responses. |
| prompt_tokens | Integer | Number of tokens from the input prompt(s). |
| total_tokens | Integer | Number of total tokens. |
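
As an illustration, the usage sub-message could be read as in the following minimal sketch. It assumes the response has already been decoded from JSON into a Python dict; the `Usage` dataclass, the `parse_usage` helper, and the sample values are hypothetical and not part of the API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Usage:
    """Hypothetical container mirroring the usage fields listed above."""
    prompt_tokens: int
    total_tokens: int
    # Embedding responses omit completion_tokens, so it defaults to None.
    completion_tokens: Optional[int] = None


def parse_usage(response: dict) -> Usage:
    """Read the usage sub-message from an already-decoded JSON response."""
    usage = response["usage"]
    return Usage(
        prompt_tokens=usage["prompt_tokens"],
        total_tokens=usage["total_tokens"],
        completion_tokens=usage.get("completion_tokens"),
    )


# Made-up sample shaped like an embedding response (no completion_tokens).
sample_embedding_response = {"usage": {"prompt_tokens": 12, "total_tokens": 12}}
usage = parse_usage(sample_embedding_response)
```

Using `dict.get` for `completion_tokens` lets the same helper handle embedding responses, where that field is absent, alongside the other task types.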

Embedding task

Embedding tasks map input strings into embedding vectors. Many inputs can be batched together in each request. See POST /serving-endpoints/{name}/invocations for querying endpoint parameters.

Embedding request

| Field | Type | Description |
| --- | --- | --- |
| input | String or List[String] | Required. The input text to embed. Can be a string or a list of strings. |
| instruction | String | An optional instruction to pass to the embedding model. |

Instructions are optional and highly model-specific. For instance, the BGE authors recommend no instruction when indexing chunks and recommend the instruction "Represent this sentence for searching relevant passages:" for retrieval queries. Other models, such as Instructor-XL, support a wide range of instruction strings.
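
The following is a minimal sketch of how an embedding request with the fields above might be assembled using only the Python standard library. The workspace host, endpoint name, and token are placeholders, and the bearer-token `Authorization` header is an assumption about how the workspace authenticates requests; `build_embedding_request` is a hypothetical helper, not part of the API. The request is built but not sent.

```python
import json
import urllib.request
from typing import List, Optional, Union


def build_embedding_request(
    host: str,
    endpoint_name: str,
    token: str,
    texts: Union[str, List[str]],
    instruction: Optional[str] = None,
) -> urllib.request.Request:
    """Build (but do not send) a POST to /serving-endpoints/{name}/invocations."""
    payload = {"input": texts}  # a single string or a list of strings
    if instruction is not None:
        payload["instruction"] = instruction  # optional and model-specific
    return urllib.request.Request(
        f"{host}/serving-endpoints/{endpoint_name}/invocations",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",  # assumed auth scheme
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder values for illustration only.
req = build_embedding_request(
    host="https://example.cloud.databricks.com",
    endpoint_name="my-bge-endpoint",
    token="<access-token>",
    texts=["What is a provisioned throughput endpoint?"],
    instruction="Represent this sentence for searching relevant passages:",
)
# Passing req to urllib.request.urlopen would send it; omitted here.
```

Following the BGE guidance above, the instruction is supplied for a retrieval query; when indexing chunks, the `instruction` argument would simply be left out.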

Embedding response

| Field | Type | Description |
| --- | --- | --- |
| id | String | Unique identifier for the embedding. |
| object | String | The object type. Equal to "list". |
| model | String | The name of the embedding model used to create the embedding. |
| data | EmbeddingObject | The embedding object. |
| usage | Usage | Token usage metadata. |

EmbeddingObject

| Field | Type | Description |
| --- | --- | --- |
| object | String | The object type. Equal to "embedding". |
| index | Integer | The index of the embedding in the list of embeddings generated by the model. |
| embedding | List[Float] | The embedding vector. Each model will return a fixed size vector (1024 for BGE-Large). |
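
As a rough sketch, the embedding vectors could be pulled out of a decoded response as shown below. The `extract_embeddings` helper and the sample response are hypothetical; the sample uses 3-element vectors for brevity rather than a real model's dimension. Because `data` is described as an EmbeddingObject while `index` refers to a list of embeddings, the sketch accepts either a single object or a list, which is an assumption.

```python
from typing import List


def extract_embeddings(response: dict) -> List[List[float]]:
    """Return embedding vectors ordered by their index field."""
    data = response["data"]
    if isinstance(data, dict):  # assumed: a single EmbeddingObject is possible
        data = [data]
    ordered = sorted(data, key=lambda item: item["index"])
    return [item["embedding"] for item in ordered]


# Made-up response; ids, model name, and vector values are illustrative only.
sample_response = {
    "id": "example-id",
    "object": "list",
    "model": "example-embedding-model",
    "data": [
        {"object": "embedding", "index": 1, "embedding": [0.4, 0.5, 0.6]},
        {"object": "embedding", "index": 0, "embedding": [0.1, 0.2, 0.3]},
    ],
    "usage": {"prompt_tokens": 8, "total_tokens": 8},
}

vectors = extract_embeddings(sample_response)
# Every vector from one model should have the same length.
dimension = len(vectors[0])
```

Sorting by `index` keeps each vector aligned with the position of its input string in a batched request, regardless of the order in which the objects appear in the response.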