Model ID
whisper-large-v3
Base Architecture
Whisper Large V3
| Specification | Value |
|---|---|
| Parameters | ~1.55B |
| Encoder-Decoder Layers | 32 encoder / 32 decoder |
| Audio Frontend | 128-channel log-Mel spectrogram (25 ms window, 10 ms stride) |
| Max Audio Duration | ~30 seconds per chunk (longer audio handled with a sliding window) |
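
The window, stride, and chunk figures above translate into concrete sample and frame counts. The sketch below works through that arithmetic and shows one way a longer recording could be split into 30-second chunks. It assumes 16 kHz input audio, which the table does not state, and `frontend_shape` and `chunk_bounds` are hypothetical helpers written for illustration, not part of any Whisper API.

```python
SAMPLE_RATE_HZ = 16_000  # assumption: sample rate is not given in the table
WINDOW_MS = 25           # analysis window from the table
STRIDE_MS = 10           # frame stride from the table
CHUNK_SECONDS = 30       # chunk length from the table


def frontend_shape(sample_rate_hz=SAMPLE_RATE_HZ):
    """Return (window_samples, stride_samples, frames_per_chunk)."""
    window_samples = sample_rate_hz * WINDOW_MS // 1000
    stride_samples = sample_rate_hz * STRIDE_MS // 1000
    # Roughly one spectrogram frame per stride across a full chunk.
    frames_per_chunk = CHUNK_SECONDS * sample_rate_hz // stride_samples
    return window_samples, stride_samples, frames_per_chunk


def chunk_bounds(duration_s, chunk_s=CHUNK_SECONDS, overlap_s=0.0):
    """Hypothetical helper: (start, end) times in seconds covering the audio."""
    if not 0 <= overlap_s < chunk_s:
        raise ValueError("overlap_s must be in [0, chunk_s)")
    step = chunk_s - overlap_s
    bounds = []
    start = 0.0
    while start < duration_s:
        bounds.append((start, min(start + chunk_s, duration_s)))
        if start + chunk_s >= duration_s:
            break  # this chunk already reaches the end of the audio
        start += step
    return bounds


if __name__ == "__main__":
    print(frontend_shape())
    print(chunk_bounds(75, overlap_s=5))
```

The overlap between neighbouring chunks is a design choice rather than a property of the model: a few seconds of overlap can help avoid cutting words at chunk boundaries, at the cost of transcribing some audio twice.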