Create a New Transcription Job

This endpoint allows you to create a new transcription job by providing an audio file URL, a webhook callback URL, and optional configuration parameters.

Endpoint

POST https://api.speechischeap.com/v2/jobs/

Authentication

Requests must be authenticated with a Bearer token in the Authorization HTTP header:

Authorization: Bearer YOUR_API_KEY

Request Body

The request body should be a JSON object with the following parameters:

Parameters

Parameter	Type	Default	Description
`input_url`	string	-	The URL of the audio file to transcribe (required). The audio must be between six seconds and 24 hours long.
`can_label_audio`	boolean	`false`	When enabled, includes an audio classification label in the transcription. See add-ons for pricing.
`can_parse_speakers`	boolean	`false`	When enabled, adds `speaker_id` to each segment based on the speaker's voice. See add-ons for pricing.
`can_parse_words`	boolean	`false`	When enabled, includes a timecode for every word in the transcription. See add-ons for pricing.
`hotwords`	string	`""`	Specific words or phrases to help improve transcription accuracy.
`is_private`	boolean	`false`	When enabled, redacts the original `input_url` for privacy.
`language`	string	`""`	Two-letter ISO 639-1 language code (e.g., "en"). If not provided, the language will be auto-detected from the first segment.
`minimum_confidence`	number	`0.5`	Filter out segments that fall below this confidence threshold. Applies both to transcriptions and to non-speech audio labels when `can_label_audio` is `true`.
`prompt`	string	`""`	Custom prompt to adjust transcription style. Should match the audio language.
`segment_duration`	number	`30`	Duration of each transcription segment in seconds. Must be between six and 30 seconds.
`user_agent`	string	[internal]	Custom user agent header to fetch the audio file.
`webhook_url`	string	`""`	The URL to which the transcription results are sent via a POST request.
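
The constraints in the table above can be checked on the client before a request is sent. The following is a minimal sketch, not part of the API: the helper name build_job_payload and the OPTIONAL_PARAMETERS set are hypothetical, and the checks cover only the limits stated in the table (required input_url, two-letter language code, segment_duration between six and 30 seconds).

```python
# Optional parameter names copied from the table above.
OPTIONAL_PARAMETERS = {
    "can_label_audio", "can_parse_speakers", "can_parse_words", "hotwords",
    "is_private", "language", "minimum_confidence", "prompt",
    "segment_duration", "user_agent", "webhook_url",
}


def build_job_payload(input_url: str, **options) -> dict:
    """Return a request body, raising ValueError on documented violations.

    Hypothetical client-side helper: the server remains the authority on
    what it accepts.
    """
    if not input_url:
        raise ValueError('Missing required "input_url" parameter')

    unknown = set(options) - OPTIONAL_PARAMETERS
    if unknown:
        raise ValueError(f"Unknown parameters: {sorted(unknown)}")

    segment_duration = options.get("segment_duration")
    if segment_duration is not None and not 6 <= segment_duration <= 30:
        raise ValueError("segment_duration must be between 6 and 30 seconds")

    language = options.get("language")
    if language and (len(language) != 2 or not language.isalpha()):
        raise ValueError("language must be a two-letter ISO 639-1 code")

    return {"input_url": input_url, **options}
```

Omitted options are left out of the payload so that the server-side defaults apply.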

Example Request

Using cURL

curl --request POST \
     --url https://api.speechischeap.com/v2/jobs/ \
     --header 'Authorization: Bearer YOUR_API_KEY' \
     --header 'Content-Type: application/json' \
     --data '{
       "input_url": "https://example.com/audio-file.mp3",
       "webhook_url": "https://your-domain.com/webhook"
     }'
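
Using Python

The same request can be made from Python with only the standard library. This is a sketch under the assumption that the endpoint behaves as described on this page; the function names build_job_request and create_job are illustrative and not part of any official client.

```python
import json
import urllib.request

API_URL = "https://api.speechischeap.com/v2/jobs/"


def build_job_request(
    api_key: str, input_url: str, webhook_url: str = ""
) -> urllib.request.Request:
    """Build (but do not send) the POST request that creates a job."""
    payload = {"input_url": input_url}
    if webhook_url:
        payload["webhook_url"] = webhook_url
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def create_job(api_key: str, input_url: str, webhook_url: str = "") -> dict:
    """Send the request and return the decoded JSON response body."""
    request = build_job_request(api_key, input_url, webhook_url)
    # urlopen raises urllib.error.HTTPError for 4xx and 5xx responses.
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

Building the request separately from sending it makes the headers and body easy to inspect without touching the network.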

Including all optional parameters

{
  "input_url": "https://example.com/audio-file.mp3",
  "webhook_url": "https://your-domain.com/webhook",
  "can_label_audio": true,
  "can_parse_speakers": true,
  "can_parse_words": true,
  "hotwords": "AI, ML, Neural Networks",
  "language": "en",
  "minimum_confidence": 0,
  "prompt": "This is a technical discussion about artificial intelligence",
  "segment_duration": 15
}

Response

Synchronous Job Responses

The API uses standard HTTP status codes to indicate the outcome of your request:

Status Code	Description
202 Accepted	The job was successfully created and is being processed
400 Bad Request	Missing or invalid parameters
401 Unauthorized	Invalid or missing authentication token
500 Internal Server Error	Server-side error occurred

New Job Response

When a new transcription job is created successfully, you'll receive a 202 Accepted status code with the following response:

{
  "id": "00000000-1111-7222-b333-444444444444-sic",
  "output": {},
  "status": "PENDING"
}

Error Responses

Error responses include a message explaining what went wrong. For example:

HTTP/1.1 400 Bad Request
Content-Type: text/plain

Missing required "input_url" parameter
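
A client might separate the two synchronous outcomes like this. The sketch assumes, based on the examples on this page, that a 202 response carries a JSON body and that error responses carry a plain-text message. JobCreationError, interpret_job_response, and the retryable flag are hypothetical names, and treating 5xx codes as retryable is an illustrative policy rather than documented behavior.

```python
import json


class JobCreationError(Exception):
    """Raised when the API answers with anything other than 202 Accepted."""

    def __init__(self, status_code: int, message: str):
        super().__init__(f"{status_code}: {message}")
        self.status_code = status_code
        self.message = message
        # Assumption: only server-side errors are worth retrying unchanged.
        self.retryable = status_code >= 500


def interpret_job_response(status_code: int, body: str) -> dict:
    """Return the new job on 202, otherwise raise with the error text."""
    if status_code == 202:
        return json.loads(body)
    raise JobCreationError(status_code, body.strip())
```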

Asynchronous Webhook Responses

Success

When the job completes successfully, the webhook will receive a completion response similar to the following:

{
  "id": "00000000-1111-7222-b333-444444444444-sic",
  "output": {
    "segments": [
      {
        "confidence": 0.987,
        "end": 12.345,
        "id": 1,
        "language": "en (99.95%)",
        "processing_duration_in_s": 0.321,
        "seek": 1234.5,
        "start": 1.234,
        "text": "This is an example of some transcribed text output.",
        "words": null
      }
    ]
  },
  "status": "COMPLETED"
}
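
To turn a completed payload into plain text, the segment texts can be joined in order. This is a small sketch; transcript_from_output is a hypothetical helper, and sorting by start is a precaution that assumes segments could arrive out of order, which this page does not state.

```python
def transcript_from_output(output: dict) -> str:
    """Join the text of all segments in chronological order."""
    segments = sorted(output.get("segments", []), key=lambda s: s["start"])
    return " ".join(s["text"].strip() for s in segments if s.get("text"))
```

Segments with empty text (for example, non-speech segments) are skipped.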

Failure

If the job fails to complete, the webhook will receive the following error response:

{
  "id": "00000000-1111-7222-b333-444444444444-sic",
  "output": {
    "error": "Some error message"
  },
  "status": "FAILED"
}

Cancelation

If the job is canceled, the webhook will receive the following confirmation response:

{
  "id": "00000000-1111-7222-b333-444444444444-sic",
  "output": {},
  "status": "CANCELED"
}
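
A webhook receiver can branch on the status field to tell the three outcomes apart. The sketch below handles only the decoded JSON body; summarize_webhook is a hypothetical name, and wiring it into an HTTP server is left to the framework in use.

```python
def summarize_webhook(payload: dict) -> str:
    """Return a one-line summary of a webhook payload based on its status."""
    job_id = payload.get("id", "")
    status = payload.get("status")
    output = payload.get("output") or {}

    if status == "COMPLETED":
        count = len(output.get("segments", []))
        return f"{job_id}: completed with {count} segment(s)"
    if status == "FAILED":
        return f"{job_id}: failed ({output.get('error', 'no error message')})"
    if status == "CANCELED":
        return f"{job_id}: canceled"
    # Any other value (for example PENDING) is not expected in a webhook.
    return f"{job_id}: unexpected status {status!r}"
```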

Add-ons Responses

The output may include additional values depending on the add-ons used.

Parse Speakers

Returns one segment per speaker and includes the speaker_id string in each segment. The speaker_id may be empty ("") when this add-on is used together with the Label Audio add-on and the segment has no speech:

{ "speaker_id": "A" }

Parse Words

Returns an array of words within each segment. Each word includes its start and end timestamps and its text:

{
  "words": [
    {
      "start": 2.345,
      "end": 2.567,
      "text": "hi"
    }
  ]
}

Label Audio

Returns the audio classification label for each segment:

{ "label": "music" }
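
Because the add-on fields are optional, code that reads a segment should tolerate their absence. The following sketch assumes that speaker_id, label, and words appear as keys of each segment object alongside start, end, and text; describe_segment is a hypothetical helper.

```python
def describe_segment(segment: dict) -> str:
    """Format one segment, including add-on fields only when present."""
    parts = [f"[{segment['start']:.3f}-{segment['end']:.3f}]"]

    speaker_id = segment.get("speaker_id")
    if speaker_id:  # may be missing or "" for segments without speech
        parts.append(f"speaker {speaker_id}:")

    label = segment.get("label")
    if label:
        parts.append(f"({label})")

    parts.append(segment.get("text", ""))

    words = segment.get("words") or []  # null when Parse Words is off
    if words:
        parts.append(f"<{len(words)} word(s)>")

    return " ".join(part for part in parts if part)
```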

Notes

The input_url audio file must be publicly accessible
For best results, omit all optional parameters and let the system auto-configure based on the input
When language is not provided, the system will auto-detect the audio language of each segment
Segments missing from the output fell below the minimum_confidence threshold
The confidence value in non-speech segments refers to the confidence of the audio classifier
Enabling is_private may limit our ability to troubleshoot since input_url will be masked
You are only charged for successfully completed transcriptions
