Types
Data models and types for completion operations
The completion types used by any_llm.completion() and any_llm.acompletion() are re-exports from the OpenAI Python SDK, extended where needed to support additional fields like reasoning content.
Primary Types
ChatCompletion
The response object for a non-streaming completion request. Extends openai.types.chat.ChatCompletion with support for reasoning content in the message choices.
Import: from any_llm.types.completion import ChatCompletion
Key fields:
choices
list[Choice]
service_tier
str | None
ChatCompletionChunk
A single chunk in a streaming completion response. Extends openai.types.chat.ChatCompletionChunk.
Import: from any_llm.types.completion import ChatCompletionChunk
Key fields:
id
str
Completion identifier (same across all chunks).
choices
list[ChunkChoice]
Each chunk choice has a delta with incremental content, role, and optionally reasoning.
model
str
The model used.
ChatCompletionMessage
A message within a completion response. Extends openai.types.chat.ChatCompletionMessage with a reasoning field.
Import: from any_llm.types.completion import ChatCompletionMessage
role
str
Message role (e.g., "assistant").
content
str | None
Text content of the message.
reasoning
Reasoning | None
Reasoning/thinking content (when the model supports it).
tool_calls
list[ChatCompletionMessageToolCall] | None
Tool calls requested by the model.
annotations
list[dict] | None
Annotations attached to the message.
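Because content, reasoning, and tool_calls can each be None, code that reads a message usually needs to guard every access. A minimal sketch of that pattern follows; describe_message is a hypothetical helper, and the SimpleNamespace object merely stands in for a real ChatCompletionMessage.

```python
from types import SimpleNamespace
from typing import Any


def describe_message(message: Any) -> str:
    """Return a one-line summary, tolerating the optional fields."""
    text = message.content if message.content is not None else ""
    tool_call_count = len(message.tool_calls or [])
    has_reasoning = message.reasoning is not None
    return (
        f"{message.role}: {text!r} "
        f"(tool_calls={tool_call_count}, reasoning={has_reasoning})"
    )


# Hypothetical stand-in with every optional field unset.
demo_message = SimpleNamespace(
    role="assistant",
    content=None,
    reasoning=None,
    tool_calls=None,
    annotations=None,
)
summary = describe_message(demo_message)
print(summary)
```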
ParsedChatCompletion
Returned when response_format is a Pydantic BaseModel subclass or a dataclass type. Extends ChatCompletion with a generic type parameter.
Import: from any_llm import ParsedChatCompletion
Access the parsed object via response.choices[0].message.parsed, which will be an instance of the type passed as response_format.
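The access path can be sketched as below. Recipe is a hypothetical dataclass of the kind that would be passed as response_format, first_parsed is an illustrative helper, and the response object is a hand-built stand-in rather than a real ParsedChatCompletion.

```python
from dataclasses import dataclass
from types import SimpleNamespace
from typing import Any


@dataclass
class Recipe:
    # Hypothetical type that a caller would pass as response_format.
    title: str
    minutes: int


def first_parsed(response: Any) -> Any:
    """Return the parsed object of the first choice."""
    return response.choices[0].message.parsed


# Stand-in shaped like a ParsedChatCompletion whose parsed type is Recipe.
demo_response = SimpleNamespace(
    choices=[SimpleNamespace(message=SimpleNamespace(parsed=Recipe("Soup", 30)))]
)
recipe = first_parsed(demo_response)
print(recipe.title, recipe.minutes)
```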
CreateEmbeddingResponse
Response object for embedding requests. Re-exported directly from openai.types.CreateEmbeddingResponse.
Import: from any_llm.types.completion import CreateEmbeddingResponse
data
list[Embedding]
List of embedding objects, each with an embedding vector and index.
model
str
The model used.
usage
Usage
Token usage with prompt_tokens and total_tokens.
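To show how the data field is typically consumed, the sketch below orders the embedding objects by index and compares two vectors. Both helpers are hypothetical, and the response is a stand-in with the documented field names, not output from a real embedding call.

```python
import math
from types import SimpleNamespace
from typing import Any, Sequence


def vectors_in_order(response: Any) -> list[list[float]]:
    """Return the embedding vectors sorted by their index field."""
    items = sorted(response.data, key=lambda item: item.index)
    return [list(item.embedding) for item in items]


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


# Stand-in shaped like CreateEmbeddingResponse, deliberately out of order.
demo_embeddings = SimpleNamespace(
    data=[
        SimpleNamespace(index=1, embedding=[0.0, 1.0]),
        SimpleNamespace(index=0, embedding=[1.0, 0.0]),
    ],
    model="demo-embedding-model",
    usage=SimpleNamespace(prompt_tokens=4, total_tokens=4),
)
vectors = vectors_in_order(demo_embeddings)
similarity = cosine_similarity(vectors[0], vectors[1])
print(vectors, similarity)
```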
ReasoningEffort
A literal type controlling reasoning depth for models that support it.
Import: from any_llm.types.completion import ReasoningEffort
The value "auto" (the default) maps to each provider's own default reasoning level.
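A literal type can be checked at runtime with typing.get_args. The sketch below uses a local alias, EffortValue, that mirrors the values listed for reasoning_effort under CompletionParams; it assumes ReasoningEffort carries the same values, and real code would import ReasoningEffort instead. check_effort is a hypothetical helper.

```python
from typing import Literal, get_args

# Local mirror of the documented values (assumption: ReasoningEffort matches them).
EffortValue = Literal[
    "none", "minimal", "low", "medium", "high", "xhigh", "max", "auto"
]


def check_effort(value: str) -> str:
    """Return value unchanged if it is a documented effort level, else raise."""
    allowed = get_args(EffortValue)
    if value not in allowed:
        raise ValueError(
            f"unsupported reasoning effort {value!r}; expected one of {allowed}"
        )
    return value


print(check_effort("auto"))
```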
Internal Types
CompletionParams
Normalized parameters for chat completions, used internally to pass structured parameters from the public API to provider implementations.
Import: from any_llm.types.completion import CompletionParams
model_id
str
Model identifier (e.g., 'mistral-small-latest')
messages
list[dict[str, Any]]
List of messages for the conversation
tools
list[dict[str, Any] | Any] | None
List of tools for tool calling. Entries should be converted to OpenAI tool-format dicts.
tool_choice
str | dict[str, Any] | None
Controls which tools the model can call
temperature
float | None
Controls randomness in the response (0.0 to 2.0)
top_p
float | None
Controls diversity via nucleus sampling (0.0 to 1.0)
max_tokens
int | None
Maximum number of tokens to generate
response_format
dict[str, Any] | type | None
Format specification for the response. Accepts Pydantic BaseModel subclasses, dataclass types, or dicts.
stream
bool | None
Whether to stream the response
n
int | None
Number of completions to generate
stop
str | list[str] | None
Stop sequences for generation
presence_penalty
float | None
Penalize new tokens based on presence in text
frequency_penalty
float | None
Penalize new tokens based on frequency in text
seed
int | None
Random seed for reproducible results
user
str | None
Unique identifier for the end user
parallel_tool_calls
bool | None
Whether to allow parallel tool calls
logprobs
bool | None
Include token-level log probabilities in the response
top_logprobs
int | None
Number of top alternatives to return when logprobs are requested
logit_bias
dict[str, float] | None
Bias the likelihood of specified tokens during generation
stream_options
dict[str, Any] | None
Additional options controlling streaming behavior
max_completion_tokens
int | None
Maximum number of tokens for the completion (provider-dependent)
reasoning_effort
Literal['none', 'minimal', 'low', 'medium', 'high', 'xhigh', 'max', 'auto'] | None
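The documented ranges for temperature and top_p can be checked before a request is built. The sketch below is an illustration only: SamplingSettings is a hypothetical stand-in carrying two of the fields above, not CompletionParams itself, and nothing here is claimed to reflect validation that the library performs.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SamplingSettings:
    # Hypothetical stand-in with two of the documented fields.
    temperature: Optional[float] = None
    top_p: Optional[float] = None


def range_errors(settings: SamplingSettings) -> list[str]:
    """Collect messages for values outside the documented ranges."""
    errors: list[str] = []
    if settings.temperature is not None and not 0.0 <= settings.temperature <= 2.0:
        errors.append("temperature must be within 0.0 to 2.0")
    if settings.top_p is not None and not 0.0 <= settings.top_p <= 1.0:
        errors.append("top_p must be within 0.0 to 1.0")
    return errors


print(range_errors(SamplingSettings(temperature=2.5, top_p=0.9)))
```

Unset (None) values are skipped, matching the optional typing of both fields.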
Additional Re-exports
The following types are also available from any_llm.types.completion:
CompletionUsage
openai.types.CompletionUsage
Token usage counts.
Function
openai.types.chat
Function definition within a tool call.
Embedding
openai.types.Embedding
Single embedding vector with index.
ChoiceDeltaToolCall
openai.types.chat
Tool call delta in streaming chunks.
For full field-level documentation of the base OpenAI types, see the OpenAI Python SDK reference.