Speech to Text

CompactifAI’s speech-to-text endpoint produces transcripts from common audio formats while staying compatible with the request shape of the OpenAI Whisper API. It accepts multipart form uploads, always returns JSON, and suits meeting notes, support calls, and media captioning.

Basic Usage (Python)

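import requests

API_URL = "https://api.compactif.ai/v1/audio/transcriptions"
API_KEY = "your_api_key_here"

# Do not set Content-Type here; requests generates the multipart boundary itself.
headers = {
    "Authorization": f"Bearer {API_KEY}"
}

# Non-file form fields, sent alongside the audio upload.
payload = {
    "model": "whisper-large-v3",
    "language": "en",
    "temperature": 0
}
file_name = "meeting_minutes.mp3"
file_content_type = "audio/mpeg"

with open(file_name, "rb") as audio_file:
    response = requests.post(
        API_URL,
        headers=headers,
        data=payload,
        files={"file": (file_name, audio_file, file_content_type)},
        timeout=120,  # illustrative value; raise it for long recordings
    )

# Fail on HTTP errors here rather than with a KeyError on "text" below.
response.raise_for_status()
print(response.json()["text"])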

Accepted Parameters

Field	Type	Description
`file`	file upload	Required audio file (`.mp3`, `.mp4`, `.mpeg`, `.mpga`, `.wav`, `.webm`)
`model`	string	Required model alias such as `whisper-large-v3`
`prompt`	string	Optional text primer to bias the transcription
`temperature`	number	Optional float between 0 and 1 (defaults to provider setting)
`language`	string	Optional ISO language code hint (`en` by default)
`response_format`	string	Accepted for compatibility but ignored; the response is always JSON
`stream`	boolean	Accepted for compatibility but ignored; responses are non-streaming
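
Building the Form Fields (Python)

The optional fields above can be assembled programmatically before being passed as `data=` in the request. The following is a minimal sketch, not part of any CompactifAI SDK: `build_transcription_form` is a hypothetical helper name, and the range check simply mirrors the 0 to 1 bound listed in the table. Unset fields are left out so that the server-side defaults apply.

```python
from typing import Optional


def build_transcription_form(
    model: str,
    language: Optional[str] = None,
    prompt: Optional[str] = None,
    temperature: Optional[float] = None,
) -> dict:
    """Assemble the non-file form fields, omitting anything left unset."""
    if not model:
        raise ValueError("model is required")
    if temperature is not None and not 0 <= temperature <= 1:
        raise ValueError("temperature must be between 0 and 1")

    fields = {
        "model": model,
        "language": language,
        "prompt": prompt,
        "temperature": temperature,
    }
    # Drop unset fields so the server applies its own defaults.
    return {name: value for name, value in fields.items() if value is not None}


# Example: bias the transcript toward names and terms expected in the audio.
form = build_transcription_form(
    "whisper-large-v3",
    prompt="Attendees: Priya, Tomasz. Topics: OKRs, Q3 roadmap.",
    temperature=0.2,
)
print(form)
```

The resulting dictionary takes the place of `payload` in the basic usage example; the audio file is still sent separately through `files=`.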

Example Response

{
  "task": "transcribe",
  "language": "en",
  "duration": 12.6,
  "text": "Welcome to the quarterly planning meeting. Let's review the agenda.",
  "segments": [
    {"id": 0, "start": 0.0, "end": 7.5, "text": "Welcome to the quarterly planning meeting."},
    {"id": 1, "start": 7.5, "end": 12.6, "text": "Let's review the agenda."}
  ]
}
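
Converting Segments to Captions (Python)

For media captioning, the `segments` array in the response above already carries the start and end times a subtitle file needs. The sketch below converts it to SubRip (SRT) text. The helper names `to_srt_timestamp` and `segments_to_srt` are illustrative, and the code assumes every segment has the `start`, `end`, and `text` keys shown in the example response.

```python
def to_srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = int(round(seconds * 1000))
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def segments_to_srt(segments: list) -> str:
    """Turn the response's `segments` list into SRT caption text."""
    blocks = []
    for index, segment in enumerate(segments, start=1):
        start = to_srt_timestamp(segment["start"])
        end = to_srt_timestamp(segment["end"])
        blocks.append(f"{index}\n{start} --> {end}\n{segment['text'].strip()}")
    return "\n\n".join(blocks) + "\n"


# The segments from the example response above.
result = {
    "segments": [
        {"id": 0, "start": 0.0, "end": 7.5,
         "text": "Welcome to the quarterly planning meeting."},
        {"id": 1, "start": 7.5, "end": 12.6,
         "text": "Let's review the agenda."},
    ]
}
srt = segments_to_srt(result["segments"])
print(srt)
```

In practice `result` would be `response.json()` from the transcription request, and `srt` would be written to a `.srt` file next to the media.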

Tips

Keep uploads under 25 MB for best performance; large files benefit from client-side compression.
Provide the language hint when you know the spoken language to reduce warm-up time.
Use the same headers and authentication flow as other CompactifAI endpoints; only the form payload differs.
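
Checking a File Before Upload (Python)

The size and format guidance above can be enforced client-side before any bytes are sent. This is a rough sketch under stated assumptions: `check_upload` is a hypothetical helper, "25 MB" is interpreted as 25 MiB, and the extension list is copied from the accepted-parameters table. The content type is guessed with the standard-library `mimetypes` module and falls back to a generic value when the guess fails.

```python
import mimetypes
import os

MAX_UPLOAD_BYTES = 25 * 1024 * 1024  # assumption: "25 MB" read as 25 MiB
ACCEPTED_EXTENSIONS = {".mp3", ".mp4", ".mpeg", ".mpga", ".wav", ".webm"}


def check_upload(path: str) -> str:
    """Validate an audio file before upload and return a content-type guess."""
    extension = os.path.splitext(path)[1].lower()
    if extension not in ACCEPTED_EXTENSIONS:
        raise ValueError(f"unsupported audio format: {extension or 'none'}")

    size = os.path.getsize(path)
    if size >= MAX_UPLOAD_BYTES:
        raise ValueError(f"{path} is {size} bytes; compress it below 25 MB first")

    # guess_type returns (type, encoding); type is None when unknown.
    content_type, _ = mimetypes.guess_type(path)
    return content_type or "application/octet-stream"
```

The returned string can be used as `file_content_type` in the basic usage example, so the upload tuple does not need a hard-coded MIME type.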