Chat Completion

The CompactifAI API chat completion endpoint lets you build multi-turn conversations with our compressed language models, which offer strong performance at reduced cost.

Basic Usage

import requests

API_URL = "https://api.compactif.ai/v1/chat/completions"
API_KEY = "your_api_key_here"

headers = {
    "Authorization": f"Bearer {API_KEY}",
    "Content-Type": "application/json"
}

data = {
    "model": "cai-llama-4-scout-slim",
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Tell me about quantum computing."}
    ],
    "temperature": 0.7
}

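# Send the request; the 30-second timeout is an illustrative choice.
response = requests.post(API_URL, headers=headers, json=data, timeout=30)
response.raise_for_status()  # Raise an exception on HTTP 4xx/5xx responses
print(response.json()["choices"][0]["message"]["content"])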

Fields

Field	Type	Description
`model`	string	ID of the compressed model to use
`messages`	array	Array of message objects
`temperature`	number	Controls randomness (0-2)
`top_p`	number	Controls diversity via nucleus sampling
`n`	integer	Number of completions to generate
`max_tokens`	integer	Maximum number of tokens to generate
`stream`	boolean	Whether to stream back partial progress
`stop`	string or array	Sequences where the API will stop generating
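
The optional fields in the table can be combined in a single request body. The following is a minimal sketch, assuming the endpoint accepts these fields with the OpenAI-style semantics this page points to. `build_payload` is a hypothetical helper, not part of the CompactifAI API, and the values passed to it are illustrative only.

```python
def build_payload(model, messages, **options):
    """Assemble a chat completion request body (hypothetical helper).

    Only the fields listed in the table above are accepted. Options set to
    None are dropped so that the server-side defaults apply.
    """
    allowed = {"temperature", "top_p", "n", "max_tokens", "stream", "stop"}
    unknown = set(options) - allowed
    if unknown:
        raise ValueError(f"Unsupported fields: {sorted(unknown)}")

    # The table documents a 0-2 range for temperature; check it client-side.
    temperature = options.get("temperature")
    if temperature is not None and not 0 <= temperature <= 2:
        raise ValueError("temperature must be between 0 and 2")

    payload = {"model": model, "messages": messages}
    payload.update({k: v for k, v in options.items() if v is not None})
    return payload


payload = build_payload(
    "cai-llama-4-scout-slim",
    [{"role": "user", "content": "List three uses of quantum computing."}],
    temperature=0.2,   # lower value for less random output
    max_tokens=200,    # cap the length of the reply
    stop=["\n\n"],     # stop at the first blank line
)
```

The resulting dictionary can be passed as `json=payload` to `requests.post`, exactly as `data` is in Basic Usage.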

Message Format

Messages must be an array of objects with the following structure:

{
  "role": "system" | "user" | "assistant",
  "content": "The message content"
}
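
A multi-turn conversation is expressed by growing this array one message at a time. The sketch below assumes that, as with OpenAI-style chat endpoints, the server keeps no state between requests, so each request resends the full history. It also assumes `content` is a plain string, as in the structure above. `VALID_ROLES`, `validate_messages`, and `add_turn` are hypothetical names introduced for illustration, and the assistant text is a placeholder rather than real model output.

```python
VALID_ROLES = {"system", "user", "assistant"}


def validate_messages(messages):
    """Check that every message has a known role and string content."""
    for i, message in enumerate(messages):
        if message.get("role") not in VALID_ROLES:
            raise ValueError(f"messages[{i}] has an invalid role")
        if not isinstance(message.get("content"), str):
            raise ValueError(f"messages[{i}] content must be a string")
    return messages


def add_turn(history, role, content):
    """Return a new, validated history with one more message appended."""
    return validate_messages(history + [{"role": role, "content": content}])


history = [{"role": "system", "content": "You are a helpful assistant."}]
history = add_turn(history, "user", "Tell me about quantum computing.")
# In a real exchange, this text would come from
# response.json()["choices"][0]["message"]["content"].
history = add_turn(history, "assistant", "Quantum computing uses qubits...")
history = add_turn(history, "user", "How is a qubit different from a bit?")
```

Sending `history` as the `messages` field of the next request gives the model the earlier turns as context.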

Please refer to our API Reference or the OpenAI API reference for more information on the fields.