Completion
Chat Completion
Generate text completions using AI models (OpenAI Compatible)
POST
Generate natural language or code completions based on a list of messages. This endpoint is compatible with the OpenAI Chat Completions API.
Headers
Your Apollo AI API key. Alternatively, use
Authorization: Bearer <token>.Body Parameters
A list of messages comprising the conversation so far.
The ID of the model to use (e.g.,
gpt-oss-120b, llama3).
See Models for a full list.If set, partial message deltas will be sent, like in ChatGPT. Tokens will be sent as data-only server-sent events as they become available.
The maximum number of tokens to generate in the chat completion.
What sampling temperature to use, between 0 and 2. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic.
A list of tools the model may call. Currently, only functions are supported as a tool.
Response
A unique identifier for the chat completion.
A list of chat completion choices.
The Unix timestamp (in seconds) of when the chat completion was created.
The model used for the chat completion.
Usage statistics for the completion request.