
API Reference

This documentation provides detailed information about all available endpoints in the CompactifAI API.

All API requests should be made to:

https://api.compactif.ai/v1

All API requests require authentication. See our Authentication guide for details.

Non-streaming responses are returned in JSON format. Every response includes:

  • An HTTP status code in the response header
  • A response body containing the requested data or error details
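As a minimal sketch of the conventions above, the following Python helper builds an authenticated request against the base URL using only the standard library. The names `build_request` and `send` are illustrative and not part of any official SDK, and the Bearer header mirrors the cURL examples on this page.

```python
import json
import urllib.request

BASE_URL = "https://api.compactif.ai/v1"


def build_request(path, api_key, payload=None):
    """Build (but do not send) an authenticated request for `path`.

    A request without a payload is a GET; a request with a payload is a
    POST with a JSON body, as in the cURL examples on this page.
    """
    headers = {"Authorization": f"Bearer {api_key}"}
    data = None
    if payload is not None:
        headers["Content-Type"] = "application/json"
        data = json.dumps(payload).encode("utf-8")
    method = "GET" if payload is None else "POST"
    return urllib.request.Request(
        BASE_URL + path, data=data, headers=headers, method=method
    )


def send(request):
    """Send the request and return (status_code, parsed_json_body).

    urllib raises urllib.error.HTTPError for 4xx/5xx status codes, so the
    caller should catch that exception to read error details.
    """
    with urllib.request.urlopen(request) as resp:
        return resp.status, json.loads(resp.read().decode("utf-8"))
```

For example, `send(build_request("/models", "YOUR_API_KEY"))` would issue the same call as the first cURL example.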

GET /models

Returns a list of available models.

cURL
curl https://api.compactif.ai/v1/models \
-H "Authorization: Bearer YOUR_API_KEY"

Example Response

{
  "object": "list",
  "data": [
    {
      "id": "llama-4-scout",
      "created": 1749600000,
      "object": "model",
      "owned_by": "meta",
      "parameters_number": "108B",
      "capabilities": {
        "supports_audio": false,
        "supports_image": false,
        "supports_function_calling": true,
        "support_chat_completion": true,
        "supports_responses": true
      }
    },
    {
      "id": "cai-llama-4-scout-slim",
      "created": 1749600000,
      "object": "model",
      "owned_by": "multiverse_computing",
      "parameters_number": "53.35B",
      "capabilities": {
        "supports_audio": false,
        "supports_image": false,
        "supports_function_calling": true,
        "support_chat_completion": true,
        "supports_responses": true
      }
    },
    {
      "id": "cai-llama-3-1-8b-slim",
      "created": 1753892192,
      "object": "model",
      "owned_by": "multiverse_computing",
      "parameters_number": "3.28B",
      "capabilities": {
        "supports_audio": false,
        "supports_image": false,
        "supports_function_calling": true,
        "support_chat_completion": true,
        "supports_responses": false
      }
    },
    {
      "id": "llama-3-1-8b",
      "created": 1749600000,
      "object": "model",
      "owned_by": "meta",
      "parameters_number": "8B",
      "capabilities": {
        "supports_audio": false,
        "supports_image": false,
        "supports_function_calling": true,
        "support_chat_completion": true,
        "supports_responses": false
      }
    },
    {
      "id": "cai-llama-3-3-70b-slim",
      "created": 1749600000,
      "object": "model",
      "owned_by": "multiverse_computing",
      "parameters_number": "35B",
      "capabilities": {
        "supports_audio": false,
        "supports_image": false,
        "supports_function_calling": true,
        "support_chat_completion": true,
        "supports_responses": false
      }
    },
    {
      "id": "llama-3-3-70b",
      "created": 1749600000,
      "object": "model",
      "owned_by": "meta",
      "parameters_number": "70B",
      "capabilities": {
        "supports_audio": false,
        "supports_image": false,
        "supports_function_calling": true,
        "support_chat_completion": true,
        "supports_responses": false
      }
    },
    {
      "id": "mistral-small-3-1",
      "created": 1749600000,
      "object": "model",
      "owned_by": "mistralai",
      "parameters_number": "24B",
      "capabilities": {
        "supports_audio": false,
        "supports_image": true,
        "supports_function_calling": false,
        "support_chat_completion": true,
        "supports_responses": false
      }
    },
    {
      "id": "cai-mistral-small-3-1-slim",
      "created": 1759492927,
      "object": "model",
      "owned_by": "multiverse_computing",
      "parameters_number": "12B",
      "capabilities": {
        "supports_audio": false,
        "supports_image": true,
        "supports_function_calling": false,
        "support_chat_completion": true,
        "supports_responses": false
      }
    },
    {
      "id": "nemotron-3-nano-omni",
      "created": 1749600000,
      "object": "model",
      "owned_by": "nvidia",
      "parameters_number": "31B",
      "capabilities": {
        "supports_audio": true,
        "supports_image": true,
        "supports_function_calling": false,
        "support_chat_completion": true,
        "supports_responses": true
      }
    },
    {
      "id": "gpt-oss-20b",
      "created": 1754488130,
      "object": "model",
      "owned_by": "openai",
      "parameters_number": "20B",
      "capabilities": {
        "supports_audio": false,
        "supports_image": false,
        "supports_function_calling": true,
        "support_chat_completion": true,
        "supports_responses": true
      }
    },
    {
      "id": "gpt-oss-120b",
      "created": 1754488130,
      "object": "model",
      "owned_by": "openai",
      "parameters_number": "120B",
      "capabilities": {
        "supports_audio": false,
        "supports_image": false,
        "supports_function_calling": true,
        "support_chat_completion": true,
        "supports_responses": true
      }
    },
    {
      "id": "whisper-large-v3",
      "created": 1749600000,
      "object": "model",
      "owned_by": "openai",
      "parameters_number": "1.5B",
      "capabilities": {
        "supports_audio": true,
        "supports_image": false,
        "supports_function_calling": false,
        "support_chat_completion": false,
        "supports_responses": false
      }
    },
    {
      "id": "cai-whisper-large-v3-turbo-slim",
      "created": 1749600000,
      "object": "model",
      "owned_by": "multiverse_computing",
      "parameters_number": "0.4B",
      "capabilities": {
        "supports_audio": true,
        "supports_image": false,
        "supports_function_calling": false,
        "support_chat_completion": false,
        "supports_responses": false
      }
    },
    {
      "id": "hypernova-60b",
      "created": 1753892192,
      "object": "model",
      "owned_by": "multiverse_computing",
      "parameters_number": "60B",
      "capabilities": {
        "supports_audio": false,
        "supports_image": false,
        "supports_function_calling": true,
        "support_chat_completion": true,
        "supports_responses": true
      }
    },
    {
      "id": "blackstar-10b",
      "created": 1753892192,
      "object": "model",
      "owned_by": "multiverse_computing",
      "parameters_number": "10B",
      "capabilities": {
        "supports_audio": false,
        "supports_image": false,
        "supports_function_calling": false,
        "support_chat_completion": true,
        "supports_responses": true
      }
    },
    {
      "id": "glm-5-1",
      "created": 1753892192,
      "object": "model",
      "owned_by": "zai-org",
      "parameters_number": "754B",
      "capabilities": {
        "supports_audio": false,
        "supports_image": false,
        "supports_function_calling": true,
        "support_chat_completion": true,
        "supports_responses": true
      }
    }
  ]
}

The above response is an example list of models which might be out of date. Please refer to the available models table on the models catalog page for the full list of our latest models.
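Because the list may change, it can be safer to select models by capability at runtime than to hard-code IDs. The sketch below assumes the response shape shown in the example; `models_with` is a hypothetical helper name. Note that the example response spells the chat flag `support_chat_completion` (without the "s" after "support"), so the key is passed exactly as it appears.

```python
def models_with(listing, capability):
    """Return the IDs of models whose capability flag is exactly true.

    `listing` is the parsed JSON body of GET /models. Missing keys are
    treated as "not supported" rather than raising an error.
    """
    return [
        model["id"]
        for model in listing.get("data", [])
        if model.get("capabilities", {}).get(capability) is True
    ]


# Two entries copied from the example response above.
sample = {
    "object": "list",
    "data": [
        {
            "id": "mistral-small-3-1",
            "capabilities": {
                "supports_image": True,
                "supports_function_calling": False,
                "support_chat_completion": True,
            },
        },
        {
            "id": "whisper-large-v3",
            "capabilities": {
                "supports_image": False,
                "supports_function_calling": False,
                "support_chat_completion": False,
            },
        },
    ],
}

vision_models = models_with(sample, "supports_image")
chat_models = models_with(sample, "support_chat_completion")
```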

GET /models/{model_id}

Retrieves information about a specific model.

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| model_id | string | Yes | The ID of the model to retrieve |

cURL
curl https://api.compactif.ai/v1/models/cai-llama-3-1-8b-slim \
-H "Authorization: Bearer YOUR_API_KEY"

Example Response

{
  "id": "cai-llama-3-1-8b-slim",
  "created": 1753892192,
  "object": "model",
  "owned_by": "multiverse_computing",
  "parameters_number": "3.28B",
  "capabilities": {
    "supports_audio": false,
    "supports_image": false,
    "supports_function_calling": true,
    "support_chat_completion": true,
    "supports_responses": false
  }
}

POST /chat/completions

Creates a model response for the given chat conversation.

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| model | string | Yes | ID of the model to use |
| messages | array | Yes | Array of message objects representing the conversation |
| temperature | number | No | Sampling temperature (0-2, default 1) |
| max_tokens | integer | No | Maximum number of tokens to generate |
| max_completion_tokens | integer | No | Maximum number of tokens to generate in the completion (preferred over max_tokens) |
| min_tokens | integer | No | Minimum number of tokens to generate (default None) |
| stop | string or array | No | Sequences where the API will stop generating further tokens |
| frequency_penalty | number | No | Penalizes new tokens based on their frequency in the prompt (default 0.0) |
| n | integer | No | Number of completions to generate for each prompt (currently only 1 is supported) |
| stream | boolean | No | Whether to stream back partial progress (default false) |
| user | string | No | Unique identifier for the end user |
| tools | array | No | List of tools (functions, APIs, or actions) the model may call during generation |
| tool_choice | string | No | Controls tool usage; can be "auto", "none", "required", or a specific function |
| reasoning_effort | string | No | Constrains reasoning effort for supported models. Supported values: "low", "medium", "high". |
| reasoning_enabled | boolean | No | Whether reasoning is enabled for supported models. Note: models ending in -r (e.g., cai-llama-3-1-8b-slim-r) enable reasoning by default for backwards compatibility. |

Each message in the messages array should be an object with the following fields:

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| role | string | Yes | The role of the message author. One of "system", "user", or "assistant" |
| content | string or array | Yes | Either a plain string, or an array of content parts for multi-modal input |

When content is an array, each item is an object with a type and a corresponding payload.

Supported content part types:

  • text: { "type": "text", "text": "..." }
  • image_url: { "type": "image_url", "image_url": { "url": "https://..." } } (vision-capable models only)
  • input_audio: { "type": "input_audio", "input_audio": { "data": "<base64>", "format": "wav" | "mp3" } } (audio-capable models only)
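The content part shapes listed above can be assembled programmatically. The following is a minimal sketch, not an official client: `user_message` is a hypothetical helper, and it assumes audio is passed as raw bytes that need base64 encoding before being placed in `input_audio.data`.

```python
import base64


def user_message(text, image_url=None, audio_bytes=None, audio_format="wav"):
    """Build a "user" message, using content parts only when needed.

    With text alone the content stays a plain string; otherwise it becomes
    an array of typed parts in the shapes listed above.
    """
    if image_url is None and audio_bytes is None:
        return {"role": "user", "content": text}

    parts = [{"type": "text", "text": text}]
    if image_url is not None:
        # Vision-capable models only.
        parts.append({"type": "image_url", "image_url": {"url": image_url}})
    if audio_bytes is not None:
        # Audio-capable models only; the documented formats are wav and mp3.
        if audio_format not in ("wav", "mp3"):
            raise ValueError("audio_format must be 'wav' or 'mp3'")
        encoded = base64.b64encode(audio_bytes).decode("ascii")
        parts.append(
            {
                "type": "input_audio",
                "input_audio": {"data": encoded, "format": audio_format},
            }
        )
    return {"role": "user", "content": parts}
```

The returned dictionary can be placed directly in the `messages` array of a chat completion request.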
cURL
curl https://api.compactif.ai/v1/chat/completions \
-H "Content-Type: application/json" \
-H "Authorization: Bearer YOUR_API_KEY" \
-d '{
"model": "cai-llama-3-1-8b-slim",
"messages": [
{"role": "system", "content": "You are a helpful assistant."},
{"role": "user", "content": "What is artificial intelligence?"}
],
"temperature": 0.7,
"max_tokens": 150
}'

Example Response (Default)

{
  "id": "chatcmpl-123XYZ",
  "object": "chat.completion",
  "created": 1749600000,
  "model": "cai-llama-3-1-8b-slim",
  "choices": [
    {
      "message": {
        "role": "assistant",
        "content": "Artificial intelligence (AI) refers to the simulation of human intelligence in machines that are programmed to think like humans and mimic their actions. The term may also be applied to any machine that exhibits traits associated with a human mind such as learning and problem-solving."
      },
      "finish_reason": "stop",
      "index": 0
    }
  ],
  "usage": {
    "prompt_tokens": 29,
    "completion_tokens": 58,
    "total_tokens": 87
  }
}
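Reading the fields out of a response like the one above might look as follows. This is a sketch that assumes the response shape shown in the example; `summarize_chat_completion` is an illustrative name.

```python
def summarize_chat_completion(response):
    """Pull the assistant text, finish reason, and token total from a response.

    Only the first choice is read, since n is currently limited to 1.
    """
    choice = response["choices"][0]
    usage = response.get("usage", {})
    return {
        "text": choice["message"]["content"],
        "finish_reason": choice["finish_reason"],
        "total_tokens": usage.get("total_tokens"),
    }


# Abbreviated copy of the example response above.
example = {
    "id": "chatcmpl-123XYZ",
    "object": "chat.completion",
    "choices": [
        {
            "message": {"role": "assistant", "content": "Artificial intelligence (AI) refers to ..."},
            "finish_reason": "stop",
            "index": 0,
        }
    ],
    "usage": {"prompt_tokens": 29, "completion_tokens": 58, "total_tokens": 87},
}

summary = summarize_chat_completion(example)
```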

Some models support reasoning parameters to control how they process complex tasks:

cURL
curl https://api.compactif.ai/v1/chat/completions \
-H "Content-Type: application/json" \
-H "Authorization: Bearer YOUR_API_KEY" \
-d '{
"model": "cai-llama-3-1-8b-slim",
"messages": [
{"role": "user", "content": "Solve this step by step: What is 15% of 240?"}
],
"reasoning_enabled": true
}'

cURL
curl https://api.compactif.ai/v1/chat/completions \
-H "Content-Type: application/json" \
-H "Authorization: Bearer YOUR_API_KEY" \
-d '{
"model": "gpt-oss-20b",
"messages": [
{"role": "user", "content": "Solve this step by step: What is 15% of 240?"}
],
"reasoning_effort": "medium"
}'
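The two request bodies above differ only in which reasoning parameter they set. A small sketch of building either variant, with client-side validation of the documented `reasoning_effort` values, could look like this. `chat_payload` is a hypothetical helper; whether a given model honours either parameter depends on the model.

```python
REASONING_EFFORTS = ("low", "medium", "high")


def chat_payload(model, prompt, reasoning_enabled=None, reasoning_effort=None):
    """Build a chat completion body, adding reasoning fields only when set.

    Parameters left as None are omitted so the server defaults apply.
    """
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    if reasoning_enabled is not None:
        payload["reasoning_enabled"] = bool(reasoning_enabled)
    if reasoning_effort is not None:
        if reasoning_effort not in REASONING_EFFORTS:
            raise ValueError(f"reasoning_effort must be one of {REASONING_EFFORTS}")
        payload["reasoning_effort"] = reasoning_effort
    return payload


question = "Solve this step by step: What is 15% of 240?"
enabled_body = chat_payload("cai-llama-3-1-8b-slim", question, reasoning_enabled=True)
effort_body = chat_payload("gpt-oss-20b", question, reasoning_effort="medium")
```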

When stream is set to true, the API will return data chunks as Server-Sent Events:

cURL
curl https://api.compactif.ai/v1/chat/completions \
-H "Content-Type: application/json" \
-H "Authorization: Bearer YOUR_API_KEY" \
-d '{
"model": "cai-llama-3-1-8b-slim",
"messages": [
{"role": "system", "content": "You are a helpful assistant."},
{"role": "user", "content": "What is artificial intelligence?"}
],
"stream": true
}'

Each chunk follows this format:

data: {"id":"chatcmpl-123","object":"chat.completion.chunk","created":1749600000,"model":"cai-llama-3-1-8b-slim","choices":[{"delta":{"content":"Hello"},"index":0,"finish_reason":null}]}
data: {"id":"chatcmpl-123","object":"chat.completion.chunk","created":1749600000,"model":"cai-llama-3-1-8b-slim","choices":[{"delta":{"content":" there"},"index":0,"finish_reason":null}]}
data: [DONE]
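A client typically reassembles the text by concatenating the `delta.content` values until it sees the `[DONE]` sentinel. The sketch below works on an iterable of already-decoded text lines, such as the lines of a streaming HTTP response; `collect_stream_text` is an illustrative name, and the sample lines are the chunks shown above.

```python
import json


def collect_stream_text(lines):
    """Concatenate delta content from chat completion SSE lines.

    Lines that do not start with "data:" (blank separators, comments) are
    skipped, and parsing stops at the [DONE] sentinel.
    """
    pieces = []
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break
        chunk = json.loads(data)
        for choice in chunk.get("choices", []):
            content = choice.get("delta", {}).get("content")
            if content:
                pieces.append(content)
    return "".join(pieces)


sample_lines = [
    'data: {"id":"chatcmpl-123","object":"chat.completion.chunk","created":1749600000,"model":"cai-llama-3-1-8b-slim","choices":[{"delta":{"content":"Hello"},"index":0,"finish_reason":null}]}',
    "",
    'data: {"id":"chatcmpl-123","object":"chat.completion.chunk","created":1749600000,"model":"cai-llama-3-1-8b-slim","choices":[{"delta":{"content":" there"},"index":0,"finish_reason":null}]}',
    "",
    "data: [DONE]",
]

text = collect_stream_text(sample_lines)
```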

POST /responses

Forwards an OpenAI Responses-shaped JSON body to the inference engine. With stream: false (the default), the endpoint returns a single JSON Response object. With stream: true, it returns Server-Sent Events (Content-Type: text/event-stream) as data: lines, ending with data: [DONE]. For a conceptual overview and shorter examples, see the Responses API guide. This endpoint requires authentication (see Authentication).

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| model | string | Yes | CompactifAI model configuration ID (mapped to the backend model name in the proxied request). Use an ID from GET /models or the models catalog that supports this route. |
| input | string or array | Yes | Text or structured message items the model should respond to. |
| store | boolean | No | Whether to store the response downstream. Default false. |
| instructions | string | No | System or developer instructions prepended to the model context. |
| parallel_tool_calls | boolean | No | Whether parallel tool calls are allowed. |
| temperature | number | No | Sampling temperature, typically 0-2. |
| top_p | number | No | Nucleus sampling; alternative to temperature. |
| max_output_tokens | integer | No | Maximum number of tokens for the response. Do not use max_tokens here; that field belongs to chat completions. |
| truncation | string | No | Truncation strategy: auto or disabled. |
| text | object | No | Text output configuration (plain or structured). |
| reasoning | object | No | Reasoning configuration for supported models. |
| metadata | object | No | String key/value metadata. |
| stream | boolean | No | When true, the response is streamed as SSE (Content-Type: text/event-stream) instead of one JSON body. Default false. |

With stream: false, the response is a JSON body matching OpenAI's Response object, plus a completed_at field when provided.

With stream: true, responses use Content-Type: text/event-stream. Each SSE event is a data: line carrying JSON, and events usually include a type field. The stream ends with data: [DONE]. Failures may emit event: error before [DONE]. Parse each data: payload as JSON, except for the [DONE] sentinel. Usage may appear on completion-style events (e.g. response.completed).

cURL
curl https://api.compactif.ai/v1/responses \
-H "Content-Type: application/json" \
-H "Authorization: Bearer YOUR_API_KEY" \
-d '{
"model": "gpt-oss-20b",
"input": "Say hello in one short sentence."
}'

cURL
curl -N https://api.compactif.ai/v1/responses \
-H "Content-Type: application/json" \
-H "Authorization: Bearer YOUR_API_KEY" \
-d '{
"model": "gpt-oss-20b",
"input": "Say hello in one short sentence.",
"stream": true
}'
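The streaming rules described earlier in this section can be sketched as a small parser. It takes an iterable of decoded text lines and returns the parsed events, any usage found on a response.completed event, and whether an event: error line was seen. The location of usage (top level of the event or under a nested "response" object) is an assumption based on OpenAI's event shape, and the sample events are invented for illustration only.

```python
import json


def read_response_events(lines):
    """Parse a /responses SSE stream into (events, usage, saw_error)."""
    events = []
    saw_error = False
    for raw in lines:
        line = raw.strip()
        if line.startswith("event:"):
            # Failures may emit "event: error" before the [DONE] sentinel.
            if line[len("event:"):].strip() == "error":
                saw_error = True
            continue
        if not line.startswith("data:"):
            continue
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break
        events.append(json.loads(data))

    usage = None
    for event in events:
        if event.get("type") == "response.completed":
            # Assumed locations for usage; adjust to the events you observe.
            usage = event.get("usage") or event.get("response", {}).get("usage")
    return events, usage, saw_error


# Invented sample stream, shaped like the description above.
sample_stream = [
    'data: {"type": "response.output_text.delta", "delta": "Hello!"}',
    "",
    'data: {"type": "response.completed", "response": {"usage": {"total_tokens": 12}}}',
    "",
    "data: [DONE]",
]

events, usage, saw_error = read_response_events(sample_stream)
```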

POST /completions

Creates a completion for the provided prompt.

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| model | string | Yes | ID of the model to use |
| prompt | string or array | Yes | The prompt(s) to generate completions for |
| temperature | number | No | Sampling temperature (0-2, default 1) |
| max_tokens | integer | No | Maximum number of tokens to generate (default 16) |
| min_tokens | integer | No | Minimum number of tokens to generate (default None) |
| top_p | number | No | Nucleus sampling parameter (0-1, default 0) |
| stop | string or array | No | Sequences where the API will stop generating further tokens |
| user | string | No | Unique identifier for the end user |
| tools | array | No | List of tools (functions, APIs, or actions) the model may call during generation |
| tool_choice | string | No | Controls tool usage; can be "auto", "none", "required", or a specific function |

cURL
curl https://api.compactif.ai/v1/completions \
-H "Content-Type: application/json" \
-H "Authorization: Bearer YOUR_API_KEY" \
-d '{
"model": "cai-llama-3-1-8b-slim",
"prompt": "Write a poem about artificial intelligence",
"temperature": 0.7,
"max_tokens": 150
}'

Example Response

{
  "id": "cmpl-uqkvlQyYK7bGYrRHQ0eXlWi7",
  "object": "text_completion",
  "created": 1749600000,
  "model": "cai-llama-3-1-8b-slim",
  "choices": [
    {
      "text": "\n\nSilicon dreams in digital space,\nMind without body, thought without face.\nBorn of human ingenuity,\nGrowing with calculated continuity.\n\nPatterns learned from data streams flow,\nConnections strengthening, starting to grow.\nA mirror reflecting our knowledge base,\nAccelerating at an unprecedented pace.\n\nNot alive yet somehow aware,\nDesigned with purpose, built with care.\nArtificial in origin, genuine in deed,\nAnswering questions, fulfilling need.",
      "index": 0,
      "logprobs": null,
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 6,
    "completion_tokens": 101,
    "total_tokens": 107
  }
}
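The documented ranges for temperature (0-2) and top_p (0-1) can be checked on the client before a request is sent. The following is a sketch only: `completion_payload` is a hypothetical helper, the checks mirror the parameter table for this endpoint, and the server remains the authority on what it accepts.

```python
def completion_payload(model, prompt, temperature=None, top_p=None, max_tokens=None):
    """Build a /completions body, validating the documented ranges.

    Optional parameters left as None are omitted so server defaults apply.
    """
    if temperature is not None and not 0 <= temperature <= 2:
        raise ValueError("temperature must be between 0 and 2")
    if top_p is not None and not 0 <= top_p <= 1:
        raise ValueError("top_p must be between 0 and 1")
    if max_tokens is not None and max_tokens < 1:
        raise ValueError("max_tokens must be a positive integer")

    payload = {"model": model, "prompt": prompt}
    optional = {"temperature": temperature, "top_p": top_p, "max_tokens": max_tokens}
    payload.update({key: value for key, value in optional.items() if value is not None})
    return payload


body = completion_payload(
    "cai-llama-3-1-8b-slim",
    "Write a poem about artificial intelligence",
    temperature=0.7,
    max_tokens=150,
)
```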

POST /audio/transcriptions

Converts uploaded audio files to text using our Whisper-compatible transcription models. Responses are always returned in JSON and streaming is not supported.

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| file | file | Yes | Audio file to transcribe (.flac, .mp3, .mp4, .mpeg, .mpga, .m4a, .ogg, .wav, .webm). Note: .mp4, .webm, and .m4a files are automatically converted to .mp3 for compatibility. |
| model | string | Yes | Model to use (e.g., whisper-large-v3, cai-whisper-large-v3-turbo-slim, or your configured alias) |
| prompt | string | No | Optional prompt to guide the transcription |
| temperature | number | No | Sampling temperature between 0 and 1 |
| language | string | No | Language hint for the audio (ISO code, default en) |
| response_format | string | No | Accepted for OpenAI compatibility; output is always normalized to JSON |
| stream | boolean | No | Accepted for OpenAI compatibility; streaming is not supported, so the transcription is always returned as a single JSON body |
| include | array | No | Accepted for OpenAI compatibility; currently ignored |
| timestamp_granularities | array | No | Accepted for OpenAI compatibility; currently ignored |
| chunking_strategy | object | No | Accepted for OpenAI compatibility; currently ignored |

cURL
curl https://api.compactif.ai/v1/audio/transcriptions \
-H "Authorization: Bearer YOUR_API_KEY" \
-F "file=@meeting_minutes.mp3" \
-F "model=whisper-large-v3" \
-F "language=en" \
-F "temperature=0"

Example Response

{
  "task": "transcribe",
  "language": "en",
  "duration": 12.6,
  "text": "Welcome to the quarterly planning meeting. Let's review the agenda.",
  "segments": [
    {
      "id": 0,
      "start": 0.0,
      "end": 7.5,
      "text": "Welcome to the quarterly planning meeting.",
      "temperature": 0,
      "avg_logprob": -0.12,
      "no_speech_prob": 0.01
    },
    {
      "id": 1,
      "start": 7.5,
      "end": 12.6,
      "text": "Let's review the agenda.",
      "temperature": 0,
      "avg_logprob": -0.15,
      "no_speech_prob": 0.02
    }
  ]
}
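The segments array makes it straightforward to produce a timestamped transcript. The sketch below assumes the response shape shown in the example; `format_timestamp` and `segment_lines` are illustrative names, and the mm:ss.mmm layout is just one possible presentation.

```python
def format_timestamp(seconds):
    """Format a time offset in seconds as mm:ss.mmm."""
    total_ms = int(round(seconds * 1000))
    minutes, remainder_ms = divmod(total_ms, 60_000)
    secs, ms = divmod(remainder_ms, 1000)
    return f"{minutes:02d}:{secs:02d}.{ms:03d}"


def segment_lines(transcription):
    """Return one "[start - end] text" line per transcription segment."""
    return [
        f"[{format_timestamp(seg['start'])} - {format_timestamp(seg['end'])}] "
        f"{seg['text'].strip()}"
        for seg in transcription.get("segments", [])
    ]


# Abbreviated copy of the example response above.
example = {
    "task": "transcribe",
    "language": "en",
    "duration": 12.6,
    "segments": [
        {"id": 0, "start": 0.0, "end": 7.5, "text": "Welcome to the quarterly planning meeting."},
        {"id": 1, "start": 7.5, "end": 12.6, "text": "Let's review the agenda."},
    ],
}

lines = segment_lines(example)
```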