Chat Completion

The CompactifAI API's chat completion endpoint lets you build multi-turn conversations with our compressed language models, which are designed to deliver strong performance at significantly reduced cost.

# Send a single chat completion request and print the model's reply.
import requests

API_URL = "https://api.compactif.ai/v1/chat/completions"
API_KEY = "your_api_key_here"  # replace with your own API key

headers = {
    "Authorization": f"Bearer {API_KEY}",
    "Content-Type": "application/json",
}

data = {
    "model": "cai-llama-4-scout-slim",
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Tell me about quantum computing."},
    ],
    "temperature": 0.7,
}

response = requests.post(API_URL, headers=headers, json=data, timeout=30)
response.raise_for_status()  # raise an exception on HTTP error status codes
print(response.json()["choices"][0]["message"]["content"])

| Field | Type | Description |
| --- | --- | --- |
| model | string | ID of the compressed model to use |
| messages | array | Array of message objects |
| temperature | number | Controls randomness (0-2) |
| top_p | number | Controls diversity via nucleus sampling |
| n | integer | Number of completions to generate |
| max_tokens | integer | Maximum number of tokens to generate |
| stream | boolean | Whether to stream back partial progress |
| stop | string or array | Sequences where the API will stop generating |
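
The optional fields listed above can be combined in a single request body. The following is a minimal sketch, assuming these fields are sent as top-level JSON keys next to `model` and `messages`, as `temperature` is in the request example. `build_payload` is a hypothetical helper written for illustration and is not part of any CompactifAI SDK; its temperature check simply mirrors the 0-2 range stated in the table.

```python
from typing import Any, Optional, Union


def build_payload(
    model: str,
    messages: list[dict[str, str]],
    temperature: Optional[float] = None,
    top_p: Optional[float] = None,
    n: Optional[int] = None,
    max_tokens: Optional[int] = None,
    stream: Optional[bool] = None,
    stop: Optional[Union[str, list[str]]] = None,
) -> dict[str, Any]:
    """Assemble a request body, leaving out optional fields that were not set.

    Hypothetical helper: the field names come from the table, the function
    itself is an illustration only.
    """
    if temperature is not None and not 0 <= temperature <= 2:
        raise ValueError("temperature must be between 0 and 2")

    payload: dict[str, Any] = {"model": model, "messages": messages}
    optional = {
        "temperature": temperature,
        "top_p": top_p,
        "n": n,
        "max_tokens": max_tokens,
        "stream": stream,
        "stop": stop,
    }
    # Only include the fields the caller actually provided.
    payload.update({k: v for k, v in optional.items() if v is not None})
    return payload


payload = build_payload(
    model="cai-llama-4-scout-slim",
    messages=[{"role": "user", "content": "Tell me about quantum computing."}],
    temperature=0.7,
    max_tokens=256,
    stop=["\n\n"],
)
print(payload)
```

The resulting dictionary can be passed as the `json=` argument of `requests.post`. Omitting unset fields, rather than sending them as `null`, leaves the choice of defaults to the API.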

The messages field must be an array of objects, each with the following structure:

{
  "role": "system" | "user" | "assistant",
  "content": "The message content"
}
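
To illustrate how a multi-turn conversation fits this structure, here is a minimal sketch of growing the message list between requests. It assumes the endpoint is stateless, as OpenAI-style chat completion endpoints generally are, so the full history is sent again with each request. `add_message` and `VALID_ROLES` are hypothetical names introduced for this example, and the assistant text is a placeholder rather than real model output.

```python
VALID_ROLES = {"system", "user", "assistant"}


def add_message(messages, role, content):
    """Return a new message list with one more message appended.

    Hypothetical helper: it only enforces the three roles shown in the
    message structure.
    """
    if role not in VALID_ROLES:
        raise ValueError(f"unknown role: {role!r}")
    return messages + [{"role": role, "content": content}]


history = [{"role": "system", "content": "You are a helpful assistant."}]
history = add_message(history, "user", "Tell me about quantum computing.")

# In a real exchange this content would be taken from the API response.
history = add_message(history, "assistant", "(placeholder for the model's reply)")

# The follow-up question is appended after the earlier turns.
history = add_message(history, "user", "How does it differ from classical computing?")

print([m["role"] for m in history])
```

The `history` list is what would be sent as the `messages` field of the next request.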

Please refer to our API Reference or the OpenAI API reference for more information on the fields.