Supported File Types

This page details the audio and video formats supported by the transcription service. While the API generally refers to "audio files," both audio-only and video files with audio tracks are accepted.

Supported Containers and Codecs

The API supports a wide range of common audio and video containers and codecs. The following table lists the tested and verified combinations:

| File Extension | Container | Audio Codec | Video Codec | Notes |
|---|---|---|---|---|
| .mp3 | MP3 | libmp3lame | N/A | Standard MP3 audio. |
| .m4a | MP4 | AAC (Built-in) | N/A | AAC audio, typically using FFmpeg's built-in encoder. |
| .opus | Ogg | Opus | N/A | Opus audio, often using libopus. |
| .flac | FLAC | FLAC | N/A | Lossless audio. |
| .wav | WAV | PCM (s16le) | N/A | Uncompressed PCM audio, 16-bit signed little-endian. |
| .mp4 | MP4 | AAC | H.264 | Video with AAC audio. |
| .mp4 | MP4 | AAC | H.265 | Video with AAC audio. |
| .webm | WebM | Opus | VP9 | Video with Opus audio. |
| .webm | WebM | Opus | AV1 (SVT-AV1) | Video with Opus audio, using the SVT-AV1 encoder. |
| .ogg | Ogg | Vorbis | Theora | Video with Vorbis audio. |

Note: While other combinations of containers and codecs may work, the ones listed above are explicitly tested and guaranteed to be supported. If you encounter issues with a format not listed here, please contact support.
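As a rough illustration, the table above can be mirrored in a small client-side preflight check. This is only a sketch and not part of the service: the names `TESTED_COMBINATIONS` and `is_tested_combination` are hypothetical, and the lowercase codec identifiers (for example `hevc` for H.265 and `pcm_s16le` for 16-bit PCM) are an assumed mapping onto the names FFmpeg tools typically report.

```python
from typing import Optional

# (extension, audio codec, video codec) triples copied from the table above.
# Codec identifiers are an assumed mapping to FFmpeg-style names;
# None means the file has no video stream.
TESTED_COMBINATIONS = {
    (".mp3", "mp3", None),
    (".m4a", "aac", None),
    (".opus", "opus", None),
    (".flac", "flac", None),
    (".wav", "pcm_s16le", None),
    (".mp4", "aac", "h264"),
    (".mp4", "aac", "hevc"),
    (".webm", "opus", "vp9"),
    (".webm", "opus", "av1"),
    (".ogg", "vorbis", "theora"),
}


def is_tested_combination(
    extension: str, audio_codec: str, video_codec: Optional[str] = None
) -> bool:
    """Return True if the combination appears in the tested list (hypothetical helper)."""
    key = (
        extension.lower(),
        audio_codec.lower(),
        video_codec.lower() if video_codec else None,
    )
    return key in TESTED_COMBINATIONS
```

A result of `False` does not mean the file will be rejected; it only means the combination is not among the tested ones listed above.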

General Guidelines

  • The service attempts to automatically detect the container and codec of the input file.
  • For best results, use a widely supported codec, such as MP3 for audio.
  • If your file uses a less common codec, consider transcoding it to a more common format (like MP3 for audio or H.264/AAC in MP4 for video) before attempting to transcribe it.
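The transcoding step suggested above could be sketched as follows, assuming FFmpeg is installed locally and, for the video case, was built with the `libx264` encoder. The function name `build_transcode_command` and the `_transcoded` output naming are hypothetical choices for illustration; the function only builds the argument list and does not run anything.

```python
from pathlib import Path


def build_transcode_command(src: str, keep_video: bool = False) -> list[str]:
    """Build an ffmpeg argument list that converts `src` to a common format.

    Hypothetical helper: audio-only output is MP3 (libmp3lame); video output
    is H.264 video with AAC audio in an MP4 container.
    """
    source = Path(src)
    if keep_video:
        target = source.with_name(source.stem + "_transcoded.mp4")
        # -c:v / -c:a select the video and audio encoders.
        codec_args = ["-c:v", "libx264", "-c:a", "aac"]
    else:
        target = source.with_name(source.stem + "_transcoded.mp3")
        # -vn drops any video stream so only audio is written.
        codec_args = ["-vn", "-c:a", "libmp3lame"]
    return ["ffmpeg", "-i", str(source), *codec_args, str(target)]
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` before the transcoded file is uploaded for transcription.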