Supported File Types

This page details the audio and video formats supported by the transcription service. While the API generally refers to "audio files," both audio-only and video files with audio tracks are accepted.

Supported Containers and Codecs

The API supports a wide range of common audio and video containers and codecs. The following table lists the tested and verified combinations:

File Extension	Container	Audio Codec	Video Codec	Notes
`.mp3`	MP3	MP3 (`libmp3lame`)	N/A	Standard MP3 audio.
`.m4a`	MP4	AAC (Built-in)	N/A	AAC audio, typically using FFmpeg's built-in encoder.
`.opus`	Ogg	Opus	N/A	Opus audio, often using libopus.
`.flac`	FLAC	FLAC	N/A	Lossless audio.
`.wav`	WAV	PCM (`s16le`)	N/A	Uncompressed PCM audio, 16-bit signed little-endian.
`.mp4`	MP4	AAC	H.264	Video with AAC audio.
`.mp4`	MP4	AAC	H.265	Video with AAC audio.
`.webm`	WebM	Opus	VP9	Video with Opus audio.
`.webm`	WebM	Opus	AV1 (SVT-AV1)	Video with Opus audio, using the SVT-AV1 encoder.
`.ogg`	Ogg	Vorbis	Theora	Video with Vorbis audio.

Note: While other combinations of containers and codecs may work, the ones listed above are explicitly tested and guaranteed to be supported. If you encounter issues with a format not listed here, please contact support.

General Guidelines

The service attempts to automatically detect the container and codec of the input file.
For best results, use a standard, widely supported codec such as MP3.
If your file uses a less common codec, consider transcoding it to a more common format (like MP3 for audio or H.264/AAC in MP4 for video) before attempting to transcribe it.
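The transcoding step suggested above can be done locally with FFmpeg before uploading. The sketch below builds the FFmpeg command line for either target (MP3 for audio, H.264/AAC in MP4 for video). It assumes `ffmpeg` is installed and on the `PATH`, and that the build includes the `libmp3lame` and `libx264` encoders; the function names and the `_transcoded` output naming are illustrative choices, not anything the service requires.

```python
import subprocess
from pathlib import Path


def build_transcode_command(src: str, keep_video: bool = False) -> list[str]:
    """Build an ffmpeg command that converts `src` to a commonly supported format.

    keep_video=False -> MP3 audio only (any video stream is dropped with -vn).
    keep_video=True  -> H.264 video with AAC audio in an MP4 container.
    """
    source = Path(src)
    if keep_video:
        # Illustrative output name: <stem>_transcoded.mp4 next to the source.
        target = source.with_name(source.stem + "_transcoded.mp4")
        codec_args = ["-c:v", "libx264", "-c:a", "aac"]
    else:
        target = source.with_name(source.stem + "_transcoded.mp3")
        codec_args = ["-vn", "-c:a", "libmp3lame"]
    return ["ffmpeg", "-i", str(source), *codec_args, str(target)]


def transcode(src: str, keep_video: bool = False) -> None:
    """Run the command; raises CalledProcessError if ffmpeg exits with an error."""
    subprocess.run(build_transcode_command(src, keep_video), check=True)
```

For example, `transcode("interview.wma")` would write `interview_transcoded.mp3` alongside the original, which can then be submitted for transcription.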
