Create a New Transcription Job

This endpoint creates a new synchronous transcription job from an audio file that you upload directly. The Edge v2 API is optimized for short audio files and runs on the edge.


Creating a Job via File Upload

This API accepts standard multipart/form-data. Because the Edge v2 API is synchronous, the HTTP connection remains open until the file is fully verified, processed, and transcribed.

Endpoint

POST https://edge.speechischeap.com/v2/jobs/

Authentication

Requests must be authenticated with a Bearer token in the Authorization HTTP header:

Authorization: Bearer YOUR_API_KEY

Request Body

The request body must be sent as multipart/form-data.

Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| input_file | file | - | The audio file to transcribe (required). There is currently no file size or duration limit. |
| can_parse_words | boolean | false | When enabled, includes a timecode for every word in the transcription. See add-ons for pricing. |
| is_private | boolean | false | When enabled, returns the transcriptions with a redacted file name and saves the redacted job output instead of the original file. |
| language | string | "" | Two-letter ISO 639-1 language code (e.g., "en"). If not provided, the language will be auto-detected from the first segment. |
| minimum_confidence | number | 0.5 | Filter out segments that fall below this confidence threshold. Must be between 0.0 and 1.0. |

Example Request

curl --location 'https://edge.speechischeap.com/v2/jobs/' \
--header 'Authorization: Bearer YOUR_API_KEY' \
--form 'input_file=@"/path/to/your/audio-file.mp3"' \
--form 'can_parse_words="true"' \
--form 'language="en"' \
--form 'minimum_confidence="0.5"' \
--form 'is_private="true"'
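The same request can be sketched in Python using only the standard library. This is a minimal illustration, not an official client: the helper names `encode_multipart` and `create_job` and the `timeout_s` default are assumptions made here, and booleans are sent as the strings "true"/"false" to mirror the curl example. Because the connection stays open until transcription finishes, the sketch uses a generous timeout.

```python
import io
import json
import os
import urllib.request
import uuid

API_URL = "https://edge.speechischeap.com/v2/jobs/"


def encode_multipart(fields, file_name, file_bytes):
    """Encode text fields plus one input_file part as multipart/form-data.

    Returns (body_bytes, content_type_header_value).
    """
    boundary = uuid.uuid4().hex
    body = io.BytesIO()
    for name, value in fields.items():
        body.write(f"--{boundary}\r\n".encode())
        body.write(f'Content-Disposition: form-data; name="{name}"\r\n\r\n'.encode())
        body.write(f"{value}\r\n".encode())
    body.write(f"--{boundary}\r\n".encode())
    body.write(
        f'Content-Disposition: form-data; name="input_file"; filename="{file_name}"\r\n'.encode()
    )
    body.write(b"Content-Type: application/octet-stream\r\n\r\n")
    body.write(file_bytes)
    body.write(f"\r\n--{boundary}--\r\n".encode())
    return body.getvalue(), f"multipart/form-data; boundary={boundary}"


def create_job(api_key, path, timeout_s=300, **options):
    """Upload one audio file and block until the synchronous job returns.

    `options` are the optional form parameters (hypothetical helper signature).
    urllib raises urllib.error.HTTPError for 4xx/5xx responses.
    """
    fields = {
        key: str(value).lower() if isinstance(value, bool) else str(value)
        for key, value in options.items()
    }
    with open(path, "rb") as audio:
        data, content_type = encode_multipart(fields, os.path.basename(path), audio.read())
    request = urllib.request.Request(
        API_URL,
        data=data,
        method="POST",
        headers={"Authorization": f"Bearer {api_key}", "Content-Type": content_type},
    )
    with urllib.request.urlopen(request, timeout=timeout_s) as response:
        return json.loads(response.read())


# Example call (requires a real key and file):
# job = create_job("YOUR_API_KEY", "/path/to/your/audio-file.mp3",
#                  can_parse_words=True, language="en")
```

Building the body by hand keeps the sketch dependency-free; a third-party HTTP library would shorten it considerably.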

Response

The API uses standard HTTP status codes to indicate the outcome of your request.

| Status Code | Description |
| --- | --- |
| 200 OK | The job was successfully created and transcribed |
| 400 Bad Request | Missing or invalid parameters |
| 401 Unauthorized | Missing or invalid authentication token |
| 500 Internal Server Error | Server-side error occurred |

Success Response

When the job completes successfully, the response will be a JSON object containing the result:

{
  "id": "00000000-1111-7222-b333-444444444444-sic",
  "status": "COMPLETED",
  "output": {
    "duration": 12.345,
    "request": {
      "can_parse_words": true,
      "is_private": false,
      "input_file": "mic-test.wav",
      "language": "en",
      "minimum_confidence": 0.5
    },
    "segments": [
      {
        "confidence": 0.987,
        "end": 12.345,
        "id": 1,
        "language": "en (99.95%)",
        "processing_duration_in_s": 0.321,
        "start": 1.234,
        "text": "This is an example of some transcribed text output.",
        "words": [
          {
            "start": 1.234,
            "end": 1.456,
            "text": "This"
          },
          ...
        ]
      }
    ]
  }
}
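As an illustration of consuming this response, the following sketch joins the segment texts into a single transcript. The function name `transcript_text` is hypothetical, and sorting by `start` is a defensive assumption rather than documented behavior; the field names come from the example response.

```python
def transcript_text(job):
    """Join the text of every segment in a job response, ordered by start time."""
    segments = job.get("output", {}).get("segments", [])
    ordered = sorted(segments, key=lambda segment: segment["start"])
    return " ".join(segment["text"].strip() for segment in ordered)


# A trimmed copy of the example response shown in this section.
example_job = {
    "status": "COMPLETED",
    "output": {
        "segments": [
            {
                "confidence": 0.987,
                "start": 1.234,
                "end": 12.345,
                "text": "This is an example of some transcribed text output.",
            }
        ]
    },
}

print(transcript_text(example_job))
# prints: This is an example of some transcribed text output.
```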

Error Responses

Error responses may be in plain text or JSON format depending on the exact error.

Unparseable request body, for example when it is not sent as multipart/form-data (Plain Text):

HTTP/1.1 400 Bad Request
Content-Type: text/plain

Failed to parse request body. Please use multipart/form-data.

Missing input_file (Plain Text):

HTTP/1.1 400 Bad Request
Content-Type: text/plain

Missing or invalid "input_file". Please upload an audio file.

Invalid audio format (JSON):

{
  "id": "00000000-1111-7222-b333-444444444444-sic",
  "status": "FAILED",
  "output": {
    "error": "Failed to decode audio file. Ensure it is a valid audio format."
  }
}
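Because an error body may be plain text or JSON, a client has to handle both shapes. The sketch below is one possible approach, under stated assumptions: `TranscriptionError` and `interpret_response` are hypothetical names, and treating any JSON body whose `status` is not "COMPLETED" as a failure is an assumption, since the HTTP status code that accompanies a "FAILED" job is not stated here.

```python
import json


class TranscriptionError(Exception):
    """Raised when a response does not contain a completed job (hypothetical type)."""

    def __init__(self, status_code, message):
        super().__init__(f"{status_code}: {message}")
        self.status_code = status_code
        self.message = message


def interpret_response(status_code, body):
    """Return the job dict on success, otherwise raise TranscriptionError.

    `body` is the raw response bytes; JSON is attempted first and the
    plain-text body is used as the error message if parsing fails.
    """
    text = body.decode("utf-8", errors="replace").strip()
    try:
        payload = json.loads(text)
    except json.JSONDecodeError:
        payload = None

    if isinstance(payload, dict):
        if status_code == 200 and payload.get("status") == "COMPLETED":
            return payload
        output = payload.get("output")
        message = output.get("error", text) if isinstance(output, dict) else text
        raise TranscriptionError(status_code, message)

    # Plain-text errors, such as a missing input_file, end up here.
    raise TranscriptionError(status_code, text)
```

Attempting JSON first avoids relying on the Content-Type header, which keeps the helper usable with any HTTP client.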

General Information

Add-on Responses

The output may include additional values depending on the add-ons used.

Parse Words

Returns an array of words within each segment. Includes the start and end timestamps and text contents:

{
  "words": [
    {
      "start": 2.345,
      "end": 2.567,
      "text": "hi"
    }
  ]
}
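To show how the word timings might be used, here is a small sketch that renders each word with its time range. The helper names `format_timestamp` and `word_lines` and the output layout are illustrative choices, and the timestamps are assumed to be seconds, as the example values suggest.

```python
def format_timestamp(seconds):
    """Format a time in seconds as MM:SS.mmm."""
    total_ms = round(seconds * 1000)
    minutes, remainder_ms = divmod(total_ms, 60_000)
    return f"{minutes:02d}:{remainder_ms // 1000:02d}.{remainder_ms % 1000:03d}"


def word_lines(segment):
    """Return one '[start -> end] text' line per word in a segment.

    Segments without a "words" array (can_parse_words disabled) yield no lines.
    """
    return [
        f"[{format_timestamp(word['start'])} -> {format_timestamp(word['end'])}] {word['text']}"
        for word in segment.get("words", [])
    ]


segment = {"words": [{"start": 2.345, "end": 2.567, "text": "hi"}]}
print(word_lines(segment))
# prints: ['[00:02.345 -> 00:02.567] hi']
```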

Notes

  • For best results, omit all optional parameters and let the system auto-configure based on the input
  • When language is not provided, the system will auto-detect the audio language of each segment
  • If segments are missing from the output, they fell below the minimum_confidence threshold and were filtered out
  • You are only charged for successfully completed transcriptions