Chat Completion
CompactifAI API’s chat completion endpoint enables you to create dynamic, multi-turn conversations with our advanced compressed language models, offering exceptional performance at significantly reduced costs.
Basic Usage
```python
import requests

API_URL = "https://api.compactif.ai/v1/chat/completions"
API_KEY = "your_api_key_here"

headers = {
    "Authorization": f"Bearer {API_KEY}",
    "Content-Type": "application/json"
}

data = {
    "model": "cai-llama-4-scout-slim",
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Tell me about quantum computing."}
    ],
    "temperature": 0.7
}

response = requests.post(API_URL, headers=headers, json=data)
print(response.json()["choices"][0]["message"]["content"])
```
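In production code you will usually want to check the HTTP status before indexing into the response body. Below is a minimal sketch that reuses `API_URL`, `headers`, and `data` from the example above; the shape of error responses is not documented here, so the fallback simply prints the raw body:

```python
import requests

# Reuses API_URL, headers, and data from the basic example above.
response = requests.post(API_URL, headers=headers, json=data, timeout=60)

if response.ok:
    print(response.json()["choices"][0]["message"]["content"])
else:
    # Error payload format is not specified here; dump it as-is for debugging.
    print(f"Request failed ({response.status_code}): {response.text}")
```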
Fields
| Field | Type | Description |
|---|---|---|
| model | string | ID of the compressed model to use |
| messages | array | Array of message objects |
| temperature | number | Controls randomness (0-2) |
| top_p | number | Controls diversity via nucleus sampling |
| n | integer | Number of completions to generate |
| max_tokens | integer | Maximum number of tokens to generate |
| stream | boolean | Whether to stream back partial progress |
| stop | string or array | Sequences where the API will stop generating |
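These fields are all passed in the same JSON request body. The sketch below combines several of the optional parameters; the values are illustrative only, and the response is assumed to follow the same `choices` array shown in the basic example:

```python
import requests

API_URL = "https://api.compactif.ai/v1/chat/completions"
headers = {
    "Authorization": "Bearer your_api_key_here",
    "Content-Type": "application/json"
}

# Illustrative values only; field names come from the table above.
data = {
    "model": "cai-llama-4-scout-slim",
    "messages": [
        {"role": "user", "content": "List three applications of quantum computing."}
    ],
    "temperature": 0.3,   # lower randomness
    "top_p": 0.9,         # nucleus sampling
    "n": 1,               # number of completions to generate
    "max_tokens": 256,    # cap on generated tokens
    "stop": ["\n\n"]      # stop generating at a blank line
}

response = requests.post(API_URL, headers=headers, json=data)
for choice in response.json()["choices"]:
    print(choice["message"]["content"])
```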
Message Format
Messages must be an array of objects with the following structure:

```json
{
  "role": "system" | "user" | "assistant",
  "content": "The message content"
}
```
Please refer to our API Reference or the OpenAI API reference for more information on the fields.