
OpenAI API Compatibility (Beta)

CompactifAI API provides OpenAI-compatible endpoints that let you use existing applications and libraries built for OpenAI’s API with minimal changes. This compatibility layer makes it easy to quickly evaluate CompactifAI models and switch from other providers with just a few code changes:

  1. Replace OpenAI API key with your CompactifAI key
  2. Update base URL to your CompactifAI endpoint
  3. Use CompactifAI model names (e.g., “cai-llama-3-1-8b-slim”)
```python
from openai import OpenAI

# Initialize the client with your CompactifAI API endpoint
client = OpenAI(
    api_key="your-compactifai-api-key",                  # CompactifAI API key
    base_url="https://your-compactifai-api-endpoint/v1"  # Replace with your endpoint
)

# Chat completions
chat_completion = client.chat.completions.create(
    model="cai-llama-3-1-8b-slim",  # Use any of the available CompactifAI models
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello, what can you tell me about CompactifAI?"}
    ],
    temperature=0.7,
    max_tokens=256
)
print(chat_completion.choices[0].message.content)

# Streaming chat completions
stream = client.chat.completions.create(
    model="cai-llama-3-1-8b-slim",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Tell me about artificial intelligence"}
    ],
    temperature=0.7,
    max_tokens=256,
    stream=True
)
for chunk in stream:
    if chunk.choices[0].delta.content is not None:
        print(chunk.choices[0].delta.content, end="")
```
| Endpoint | Description |
| --- | --- |
| /v1/chat/completions | Chat conversations |
| /v1/completions | Text completions |
| /v1/models | List available models |
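The /v1/models endpoint can be used to discover which model names are available before making completion calls. A minimal sketch of constructing that request (the base URL and API key below are placeholders, and the `build_models_request` helper is illustrative, not part of any SDK):

```python
# Illustrative helper for building a GET /v1/models request.
# The URL and key are placeholders, not real values.

def build_models_request(base_url: str, api_key: str) -> tuple[str, dict]:
    """Return the URL and headers for a GET /v1/models request."""
    url = f"{base_url.rstrip('/')}/models"
    headers = {"Authorization": f"Bearer {api_key}"}
    return url, headers

url, headers = build_models_request(
    "https://your-compactifai-api-endpoint/v1", "your-compactifai-api-key"
)
# Sending this with any HTTP client (e.g. requests.get(url, headers=headers))
# should return a JSON list of the available models.
print(url)  # https://your-compactifai-api-endpoint/v1/models
```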
Example /v1/chat/completions request:

```json
{
  "model": "cai-llama-3-1-8b-slim",
  "messages": [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello, I'm Camillo"},
    {"role": "user", "content": "What is my name? What is the capital of Colombia?"}
  ],
  "temperature": 0.7,
  "max_tokens": 128,
  "stop": ["###"],
  "n": 1,
  "user": "user-123"
}
```
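Because the API is stateless, multi-turn conversations work by resending the full messages list with every call, appending each reply as you go. A minimal sketch of accumulating that history (the `append_turn` helper and the example replies are illustrative, not part of the API):

```python
# Illustrative helper for building up a multi-turn messages list.
# The API is stateless, so the client resends the whole history each request.

def append_turn(messages: list, role: str, content: str) -> list:
    """Append one turn to the conversation history and return it."""
    messages.append({"role": role, "content": content})
    return messages

history = [{"role": "system", "content": "You are a helpful assistant."}]
append_turn(history, "user", "Hello, I'm Camillo")
# ... send `history` as the "messages" field, then record the model's reply:
append_turn(history, "assistant", "Nice to meet you, Camillo!")
append_turn(history, "user", "What is my name?")

print(len(history))  # 4
```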
Example streaming chat request:

```json
{
  "model": "cai-llama-3-1-8b-slim",
  "messages": [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Tell me about artificial intelligence"}
  ],
  "temperature": 0.7,
  "max_tokens": 128,
  "stream": true,
  "user": "user-123"
}
```
Example /v1/completions request:

```json
{
  "model": "cai-llama-3-1-8b-slim",
  "prompt": "What is the capital of France?",
  "temperature": 0.7,
  "max_tokens": 128,
  "stop": ["###"],
  "user": "user-123"
}
```

Responses are structured to be compatible with OpenAI’s format:

Chat completion response:

```json
{
  "id": "6a172d30ce8e4f34b4b830f8347c3911",
  "created": 1749600000,
  "model": "cai-llama-3-1-8b-slim",
  "object": "chat.completion",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello Camillo. It's nice to meet you.\n\nYour name is Camillo.\n\nThe capital of Colombia is Bogotá."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 25,
    "completion_tokens": 35,
    "total_tokens": 60
  }
}
```
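Because the response shape matches OpenAI's, standard parsing code carries over unchanged. A small sketch of pulling the useful fields out of a chat completion response received as a plain dict (the literal below abridges the example response above):

```python
# Extract the useful fields from an OpenAI-style chat completion response.
response = {
    "id": "6a172d30ce8e4f34b4b830f8347c3911",
    "object": "chat.completion",
    "choices": [
        {
            "index": 0,
            "message": {"role": "assistant", "content": "The capital of Colombia is Bogotá."},
            "finish_reason": "stop",
        }
    ],
    "usage": {"prompt_tokens": 25, "completion_tokens": 35, "total_tokens": 60},
}

content = response["choices"][0]["message"]["content"]
finish_reason = response["choices"][0]["finish_reason"]
total_tokens = response["usage"]["total_tokens"]

print(finish_reason)  # stop
print(total_tokens)   # 60
```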
Text completion response:

```json
{
  "id": "1861edc39ce648e3862a0b6ae9b7687b",
  "object": "text_completion",
  "created": 1749600000,
  "model": "cai-llama-3-1-8b-slim",
  "choices": [
    {
      "text": "The capital of France is Paris.",
      "index": 0,
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 5,
    "completion_tokens": 7,
    "total_tokens": 12
  }
}
```

The CompactifAI OpenAI compatibility layer supports the following request and response fields on each endpoint:

Chat completions request fields:

| Field | Support Status |
| --- | --- |
| model | Full (use CompactifAI model names) |
| messages | Full (system, user, assistant roles) |
| max_tokens | Full |
| temperature | Full |
| top_p | Full |
| n | Partial (must be 1) |
| stream | Full |
| stop | Full |
| user | Full |
| presence_penalty | Full |
| frequency_penalty | Full |
| logit_bias | Ignored |
| logprobs | Ignored |
| top_logprobs | Ignored |
| seed | Ignored |
| tools | Ignored |
| tool_choice | Ignored |
| function_call | Ignored (deprecated) |
| functions | Ignored (deprecated) |
| parallel_tool_calls | Ignored |
| response_format | Ignored |
| max_completion_tokens | Ignored |
| data_sources | Ignored (Azure-specific) |
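Since ignored fields are silently dropped and `n` has only partial support, it can be useful to sanitize requests client-side before sending them, so behavior is explicit. A hypothetical helper (the field set mirrors the table above; nothing here is part of the API itself):

```python
# Hypothetical request sanitizer based on the support table above.
# Ignored fields are dropped client-side so nothing is silently discarded.

IGNORED_CHAT_FIELDS = {
    "logit_bias", "logprobs", "top_logprobs", "seed", "tools", "tool_choice",
    "function_call", "functions", "parallel_tool_calls", "response_format",
    "max_completion_tokens", "data_sources",
}

def sanitize_chat_request(request: dict) -> dict:
    """Drop ignored fields and enforce the n == 1 restriction."""
    clean = {k: v for k, v in request.items() if k not in IGNORED_CHAT_FIELDS}
    if clean.get("n", 1) != 1:
        raise ValueError("n must be 1 (only partial support)")
    return clean

req = {"model": "cai-llama-3-1-8b-slim", "messages": [], "seed": 42, "n": 1}
print(sorted(sanitize_chat_request(req)))  # ['messages', 'model', 'n']
```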
Chat completions response fields:

| Field | Support Status |
| --- | --- |
| id | Full |
| created | Full |
| model | Full |
| object | Full |
| choices | Full |
| usage | Full |

For more details, see our API reference or have a look at the OpenAI chat completions documentation.

Text completions request fields:

| Field | Support Status |
| --- | --- |
| model | Full (use CompactifAI model names) |
| prompt | Full |
| max_tokens | Full |
| temperature | Full |
| top_p | Full |
| stop | Full |
| user | Full |
| best_of | Ignored |
| echo | Ignored |
| logit_bias | Ignored |
| logprobs | Ignored |
| seed | Ignored |
| suffix | Ignored |
Text completions response fields:

| Field | Support Status |
| --- | --- |
| id | Full |
| created | Full |
| model | Full |
| object | Full |
| choices | Full |
| usage | Full |

For more details, see our API reference or have a look at the OpenAI text completions documentation.

Key differences from OpenAI’s API:

  • Authentication: Uses CompactifAI API keys
  • Models: Different model names (e.g., “cai-llama-3-1-8b-slim” instead of “gpt-4”)
  • Endpoint fields: Some OpenAI-specific fields are not supported (see the tables above)

Tested with:

For other SDKs and libraries built to work with OpenAI’s API, changing the base URL to point to your CompactifAI API endpoint should generally be enough.

Python
```python
from openai import OpenAI

client = OpenAI(
    api_key='your-compactifai-api-key',
    base_url='https://your-compactifai-api-endpoint/v1',
)

def main():
    # Regular completion
    completion = client.chat.completions.create(
        model='cai-llama-3-1-8b-slim',
        messages=[
            {'role': 'system', 'content': 'You are a helpful assistant.'},
            {'role': 'user', 'content': 'Tell me about CompactifAI.'}
        ],
        temperature=0.7,
        max_tokens=256
    )
    print(completion.choices[0].message.content)

    # Streaming completion
    stream = client.chat.completions.create(
        model='cai-llama-3-1-8b-slim',
        messages=[
            {'role': 'system', 'content': 'You are a helpful assistant.'},
            {'role': 'user', 'content': 'Tell me about artificial intelligence.'}
        ],
        temperature=0.7,
        max_tokens=256,
        stream=True
    )
    for chunk in stream:
        if chunk.choices[0].delta.content is not None:
            print(chunk.choices[0].delta.content, end="")

if __name__ == "__main__":
    main()
```

CompactifAI API maintains error formats consistent with the OpenAI API, although the detailed error messages may differ. We recommend using the error messages primarily for logging and debugging rather than branching on their exact text.
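Because message text may differ between providers, branching on HTTP status codes is more robust than matching message strings. An illustrative helper (the status-code meanings follow common HTTP conventions, not a documented CompactifAI error contract):

```python
# Illustrative status-code handling; the mapping follows common HTTP
# conventions and is not a documented CompactifAI error contract.

def classify_error(status_code: int) -> str:
    """Map an HTTP error status code to a coarse category for logging."""
    if status_code == 401:
        return "auth"        # bad or missing API key
    if status_code == 404:
        return "not_found"   # unknown model or endpoint
    if status_code == 429:
        return "rate_limit"  # back off and retry
    if status_code >= 500:
        return "server"      # transient server-side failure
    return "client"          # other 4xx: inspect the request

print(classify_error(429))  # rate_limit
```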