This is the documentation for a previous version of our product.

Speechmatics ASR REST API

Overview 3.5.0

The Speechmatics Automatic Speech Recognition REST API is used to submit ASR jobs and receive the results. The supported job types are transcription of audio files, and alignment of audio files with existing transcripts to add word or line timings.

Version information

Version: 1.1.0

Contact information

Contact Email: support@speechmatics.com

License information

Terms of service: https://www.speechmatics.com/terms-and-conditions/

URI scheme

BasePath: /v1
Schemes: HTTP

Paths

The base URL http://${APPLIANCE_HOST}:8082/v1/user/1/ is used for REST Speech API requests. If you are using the secure Speech API, the base URL is: https://${APPLIANCE_HOST}/v1/user/1/. The user ID component can be any positive integer; by convention we use 1 for all requests.
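
As a minimal sketch of the URL rules above, the following Python helpers build the base URL for either the plain or the secure Speech API. The function names and the example host are hypothetical and not part of the API; only the URL layout comes from this document.

```python
def base_url(appliance_host: str, secure: bool = False, user_id: int = 1) -> str:
    """Return the REST Speech API base URL for an appliance.

    The user ID may be any positive integer; 1 is used by convention.
    """
    if user_id < 1:
        raise ValueError("user_id must be a positive integer")
    if secure:
        # Secure Speech API: HTTPS with no explicit port.
        return f"https://{appliance_host}/v1/user/{user_id}/"
    # Plain HTTP on port 8082.
    return f"http://{appliance_host}:8082/v1/user/{user_id}/"


def jobs_url(appliance_host: str, secure: bool = False) -> str:
    """Return the URL of the /jobs resource (hypothetical helper)."""
    return base_url(appliance_host, secure) + "jobs"
```

For example, `jobs_url("asr.example.com")` yields `http://asr.example.com:8082/v1/user/1/jobs`.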

/jobs

Requests without a job ID component are used to create a new job, or to return a list of all submitted jobs.

POST

Summary: Create a new job.

Parameters

| Name | Located in | Description | Required | Schema |
| --- | --- | --- | --- | --- |
| config | formData | JSON containing a JobConfig model indicating the type and parameters for the recognition job. | Yes | string |
| data_file | formData | The data file to be processed. Alternatively the data file can be fetched from a url specified in JobConfig. | No | file |
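
The two form fields above can be assembled into a multipart/form-data request. The sketch below uses only the Python standard library and builds the request without sending it. The helper name, the default file name, the `application/octet-stream` part type and the `"transcription"` type string are assumptions made for illustration.

```python
import json
import urllib.request
import uuid


def build_create_job_request(jobs_url, config, audio_bytes=None, filename="audio.wav"):
    """Build (but do not send) a POST /jobs request with multipart form data."""
    boundary = uuid.uuid4().hex
    parts = []
    # 'config' form field: the JobConfig serialized as a JSON string (required).
    parts.append(
        (
            f"--{boundary}\r\n"
            'Content-Disposition: form-data; name="config"\r\n\r\n'
            f"{json.dumps(config)}\r\n"
        ).encode("utf-8")
    )
    # 'data_file' form field: the audio to process (optional).
    if audio_bytes is not None:
        parts.append(
            (
                f"--{boundary}\r\n"
                f'Content-Disposition: form-data; name="data_file"; filename="{filename}"\r\n'
                "Content-Type: application/octet-stream\r\n\r\n"
            ).encode("utf-8")
            + audio_bytes
            + b"\r\n"
        )
    # Closing boundary terminates the multipart body.
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))
    return urllib.request.Request(
        jobs_url,
        data=b"".join(parts),
        method="POST",
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
    )
```

Passing the returned object to `urllib.request.urlopen` would submit the job; on success the appliance is expected to answer with status 201 and a CreateJobResponse body.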

Responses

| Code | Description | Schema |
| --- | --- | --- |
| 201 | OK | CreateJobResponse |
| 400 | Bad request | ErrorResponse |
| 401 | Unauthorized | ErrorResponse |
| 403 | Forbidden | ErrorResponse |
| 500 | Internal Server Error | ErrorResponse |

GET

Summary: List all jobs.

Responses

| Code | Description | Schema |
| --- | --- | --- |
| 200 | OK | RetrieveJobsResponse |
| 401 | Unauthorized | ErrorResponse |
| 500 | Internal Server Error | ErrorResponse |

/jobs/{jobid}

Requests with a job ID component are used to view the status, transcript or audio data for a job, or remove a given job from the system.

GET

Summary: Get job details, including progress and any error reports.

Parameters

| Name | Located in | Description | Required | Schema |
| --- | --- | --- | --- | --- |
| jobid | path | ID of the job. | Yes | string |

Responses

| Code | Description | Schema |
| --- | --- | --- |
| 200 | OK | RetrieveJobResponse |
| 401 | Unauthorized | ErrorResponse |
| 404 | Not found | ErrorResponse |
| 500 | Internal Server Error | ErrorResponse |

DELETE

Summary: Delete a job and remove all associated resources.

Parameters

| Name | Located in | Description | Required | Schema |
| --- | --- | --- | --- | --- |
| jobid | path | ID of the job to delete. | Yes | string |

Responses

| Code | Description | Schema |
| --- | --- | --- |
| 200 | The job that was deleted. | DeleteJobResponse |
| 401 | Unauthorized | ErrorResponse |
| 404 | Not found | ErrorResponse |
| 500 | Internal Server Error | ErrorResponse |

/jobs/{jobid}/data

GET

Summary: Get the data file used as input to a job.

Parameters

| Name | Located in | Description | Required | Schema |
| --- | --- | --- | --- | --- |
| jobid | path | ID of the job. | Yes | string |

Responses

| Code | Description | Schema |
| --- | --- | --- |
| 200 | OK | file |
| 401 | Unauthorized | ErrorResponse |
| 404 | Not found | ErrorResponse |
| 500 | Internal Server Error | ErrorResponse |

/jobs/{jobid}/transcript

GET

Summary: Get the transcript for a transcription job.

Parameters

| Name | Located in | Description | Required | Schema |
| --- | --- | --- | --- | --- |
| jobid | path | ID of the job. | Yes | string |
| format | query | The transcription format (by default the json-v2 format is returned). | No | string |
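
As an illustration of how the path and query parameters combine, the sketch below builds a transcript URL. The helper name is hypothetical; `json-v2` is the only format value named in this section, so other values should be checked against your appliance before use.

```python
from urllib.parse import quote, urlencode


def transcript_url(base_url, job_id, fmt=None):
    """Return the transcript URL for a job, optionally with a format query."""
    # Percent-encode the job ID so it is safe to use as a single path segment.
    url = f"{base_url.rstrip('/')}/jobs/{quote(job_id, safe='')}/transcript"
    if fmt is not None:
        # Omitting 'format' leaves the server default (json-v2) in effect.
        url += "?" + urlencode({"format": fmt})
    return url
```

Calling `transcript_url("http://host:8082/v1/user/1/", "abc123", "json-v2")` gives `http://host:8082/v1/user/1/jobs/abc123/transcript?format=json-v2`.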

Responses

| Code | Description | Schema |
| --- | --- | --- |
| 200 | OK | RetrieveTranscriptResponse |
| 401 | Unauthorized | ErrorResponse |
| 404 | Not found | ErrorResponse |
| 500 | Internal Server Error | ErrorResponse |

Models

ErrorResponse

| Name | Type | Description | Required |
| --- | --- | --- | --- |
| code | integer | The HTTP status code. | Yes |
| error | string | The error message. | Yes |
| detail | string | The details of the error. | No |
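
A client might turn an ErrorResponse body into an exception along these lines. This is a sketch only: the `ApiError` class and `parse_error_response` function are hypothetical names, and the sample message text used with them is invented for illustration.

```python
import json


class ApiError(Exception):
    """Client-side wrapper for an ErrorResponse body (hypothetical class)."""

    def __init__(self, code, error, detail=None):
        message = f"{code} {error}"
        if detail:
            message += f": {detail}"
        super().__init__(message)
        self.code = code
        self.error = error
        self.detail = detail


def parse_error_response(body):
    """Parse a JSON ErrorResponse; 'code' and 'error' are required, 'detail' is optional."""
    payload = json.loads(body)
    return ApiError(int(payload["code"]), str(payload["error"]), payload.get("detail"))
```

A caller can then `raise parse_error_response(body)` whenever the HTTP status is 400 or above.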

TranscriptionConfig

| Name | Type | Description | Required |
| --- | --- | --- | --- |
| language | string | Language model to process the audio input, normally specified as an ISO language code. | Yes |
| additional_vocab | [object] | List of custom words or phrases that should be recognized. Alternative pronunciations can be specified to aid recognition. | No |
| punctuation_overrides | [object] | Control punctuation settings. | No |
| diarization | string | Specify whether speaker or channel labels are added to the transcript. The default is none. | No |
| channel_diarization_labels | [string] | Transcript labels to use when collating separate input channels. | No |

For the diarization parameter, the following values are valid:

| Value | Description |
| --- | --- |
| none | No speaker or channel labels are added. |
| speaker | Speaker attribution is performed based on acoustic matching; all input channels are mixed into a single stream for processing. |
| channel | Multiple input channels are processed individually and collated into a single transcript. |
| speaker_change | The output indicates when the speaker in the audio changes. No speaker attribution is performed. This is a faster method than speaker. The reported speaker changes may not agree with speaker. |
| channel_and_speaker_change | Both channel and speaker_change are switched on. A speaker change is indicated if more than one speaker is recorded in one channel. |
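
Because diarization accepts only the values listed above, a client can reject typos before submitting a job. The following is a minimal sketch; the constant and function names are hypothetical, and the returned dictionary is meant to be merged into a transcription_config.

```python
VALID_DIARIZATION = frozenset(
    {"none", "speaker", "channel", "speaker_change", "channel_and_speaker_change"}
)


def diarization_settings(mode="none", channel_labels=None):
    """Return the diarization-related keys for a transcription_config."""
    if mode not in VALID_DIARIZATION:
        raise ValueError(
            f"diarization must be one of {sorted(VALID_DIARIZATION)}, got {mode!r}"
        )
    settings = {"diarization": mode}
    if channel_labels is not None:
        # Labels applied to the separate input channels when they are collated.
        settings["channel_diarization_labels"] = list(channel_labels)
    return settings
```

For example, `diarization_settings("channel", ["Agent", "Caller"])` produces a fragment that labels a two-channel recording.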

OutputConfig

If you want the transcription output to be in the SubRip Title (SRT) format and want to alter the default parameters that Speechmatics provides, you must provide output_config within the config object.

output_config

| Name | Type | Description | Required |
| --- | --- | --- | --- |
| srt_overrides | object | Parameters to override the default parameters for SubRip (srt) subtitle format. max_line_length: sets maximum count of characters per subtitle line including white space (default: 37). max_lines: sets maximum number of lines per subtitle segment (default: 2). | No |
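
To show where output_config sits, the sketch below adds srt_overrides to an existing config object. The helper name is hypothetical, and placing output_config at the top level of the config object is a reading of the sentence above rather than a confirmed schema.

```python
def with_srt_overrides(config, max_line_length=37, max_lines=2):
    """Return a copy of a job config with SRT overrides added.

    The defaults mirror the documented defaults (37 characters, 2 lines).
    """
    if max_line_length < 1 or max_lines < 1:
        raise ValueError("max_line_length and max_lines must be positive")
    updated = dict(config)  # shallow copy, so the caller's dict is not modified
    updated["output_config"] = {
        "srt_overrides": {
            "max_line_length": max_line_length,
            "max_lines": max_lines,
        }
    }
    return updated
```

Calling `with_srt_overrides(config, max_line_length=42)` keeps two lines per subtitle segment and widens each line to 42 characters.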

JobConfig

JSON object that contains various groups of job configuration parameters. Based on the value of type, a type-specific object such as transcription_config is required to be present to specify all configuration settings or parameters needed to process the job inputs as expected.

| Name | Type | Description | Required |
| --- | --- | --- | --- |
| type | string | | Yes |
| transcription_config | TranscriptionConfig | | Yes |
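
The pairing of type with its type-specific object can be checked before a job is submitted. In this sketch the `"transcription"` type string is an assumption (this section does not list the allowed values of type), and the function names are hypothetical.

```python
import json

# Assumption: "transcription" is the type string for transcription jobs and it
# pairs with the transcription_config object documented in this reference.
SECTION_FOR_TYPE = {"transcription": "transcription_config"}


def make_job_config(language, **transcription_options):
    """Build a JobConfig dict for a transcription job."""
    transcription_config = {"language": language}
    transcription_config.update(transcription_options)
    return {"type": "transcription", "transcription_config": transcription_config}


def validate_job_config(config):
    """Check that the type-specific object matching 'type' is present."""
    job_type = config.get("type")
    if job_type not in SECTION_FOR_TYPE:
        raise ValueError(f"unsupported or missing job type: {job_type!r}")
    section = SECTION_FOR_TYPE[job_type]
    if section not in config:
        raise ValueError(f"{section} is required when type is {job_type!r}")
    if not config[section].get("language"):
        raise ValueError("language is required in transcription_config")
    return config


if __name__ == "__main__":
    # Print the JSON string that would be sent as the 'config' form field.
    print(json.dumps(validate_job_config(make_job_config("en"))))
```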

CreateJobResponse

In the job response you will see balance and cost values returned, but these are not used by the appliance; they are only maintained for backwards compatibility with the legacy V1 SaaS, and should be ignored by clients.

| Name | Type | Description | Required |
| --- | --- | --- | --- |
| id | string | The unique ID assigned to the job. Keep a record of this for later retrieval of your completed job. | Yes |
| balance | integer | Not used | No |
| cost | integer | Not used | No |

JobDetails

| Name | Type | Description | Required |
| --- | --- | --- | --- |
| created_at | dateTime | The UTC date time the job was created. | Yes |
| data_name | string | Name of the data file submitted for job. | Yes |
| duration | integer | The file duration (in seconds). May be missing for fetch URL jobs. | No |
| id | string | The unique id assigned to the job. | Yes |
| status | string | The status of the job. running: the job is actively running. done: the job completed successfully. rejected: the job was accepted at first, but later could not be processed by the transcriber. deleted: the user deleted the job. expired: the system deleted the job, usually because the job was in the done state for a very long time. | Yes |
| config | JobConfig | | Yes |
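
The status values above suggest a simple polling loop: keep checking while the job is running, and stop on any other status. The sketch below assumes a caller-supplied `fetch_job` callable (hypothetical) that returns the JobDetails dictionary from GET /jobs/{jobid}; the polling interval and limit are arbitrary choices.

```python
import time

# Every documented status other than "running" means the job will not progress.
TERMINAL_STATUSES = frozenset({"done", "rejected", "deleted", "expired"})


def is_finished(job_details):
    """Return True once the job has left the running state."""
    return job_details["status"] in TERMINAL_STATUSES


def wait_for_job(fetch_job, interval=5.0, max_polls=120, sleep=time.sleep):
    """Poll until the job is done; raise if it ends in any other state."""
    for _ in range(max_polls):
        job = fetch_job()
        status = job["status"]
        if status == "running":
            sleep(interval)
            continue
        if status == "done":
            return job
        raise RuntimeError(f"job ended with status {status!r}")
    raise TimeoutError("job still running after the maximum number of polls")
```

The `sleep` parameter exists so that the loop can be exercised without real delays.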

RetrieveJobsResponse

| Name | Type | Description | Required |
| --- | --- | --- | --- |
| jobs | [JobDetails] | | Yes |
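
Since the list response wraps an array of JobDetails objects, a client can filter it locally. The helper below is a hypothetical sketch that returns the IDs of jobs in a given status from a RetrieveJobsResponse body.

```python
import json


def job_ids_with_status(body, status):
    """Return the IDs of all jobs in a RetrieveJobsResponse with the given status."""
    payload = json.loads(body)
    return [job["id"] for job in payload["jobs"] if job["status"] == status]
```

For instance, `job_ids_with_status(body, "done")` lists the jobs whose transcripts are ready to fetch.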

RetrieveJobResponse

| Name | Type | Description | Required |
| --- | --- | --- | --- |
| job | JobDetails | | Yes |

DeleteJobResponse

| Name | Type | Description | Required |
| --- | --- | --- | --- |
| job | JobDetails | | Yes |

JobInfo

Summary information about an ASR job, to support identification and tracking.

| Name | Type | Description | Required |
| --- | --- | --- | --- |
| created_at | dateTime | The UTC date time the job was created. | Yes |
| data_name | string | Name of data file submitted for job. | Yes |
| duration | integer | The data file audio duration (in seconds). | Yes |
| id | string | The unique id assigned to the job. | Yes |

RecognitionMetadata

Summary information about the output from an ASR job, comprising the job type and configuration parameters used when generating the output.

| Name | Type | Description | Required |
| --- | --- | --- | --- |
| created_at | dateTime | The UTC date time the transcription output was created. | Yes |
| type | string | | Yes |
| transcription_config | TranscriptionConfig | | No |

RecognitionDisplay

| Name | Type | Description | Required |
| --- | --- | --- | --- |
| direction | string | | Yes |

RecognitionAlternative

List of possible job output item values, ordered by likelihood.

| Name | Type | Description | Required |
| --- | --- | --- | --- |
| content | string | | Yes |
| confidence | float | | Yes |
| language | string | | Yes |
| display | RecognitionDisplay | | No |
| speaker | string | | No |

RecognitionResult

An ASR job output item. The primary item types are word and punctuation. Other item types may be present, for example to provide semantic information of different forms.

| Name | Type | Description | Required |
| --- | --- | --- | --- |
| channel | string | | No |
| start_time | float | | Yes |
| end_time | float | | Yes |
| type | string | New types of items may appear without being requested; unrecognized item types can be ignored. | Yes |
| alternatives | [RecognitionAlternative] | | Yes |

RetrieveTranscriptResponse

| Name | Type | Description | Required |
| --- | --- | --- | --- |
| format | string | Speechmatics JSON transcript format version number. | Yes |
| job | JobInfo | | Yes |
| metadata | RecognitionMetadata | | Yes |
| results | [RecognitionResult] | | Yes |
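
To illustrate how the results array can be consumed, the sketch below flattens a RetrieveTranscriptResponse into plain text. It takes the first (most likely) alternative of each item, attaches punctuation to the preceding word, and skips item types it does not recognize. The function name and the spacing rules are assumptions for this example, not behaviour defined by the API.

```python
def transcript_to_text(transcript):
    """Join word and punctuation items of a transcript into one string."""
    pieces = []
    for item in transcript["results"]:
        alternatives = item.get("alternatives") or []
        if not alternatives:
            continue
        # Alternatives are ordered by likelihood, so the first is the best guess.
        content = alternatives[0]["content"]
        if item["type"] == "punctuation" and pieces:
            # Attach punctuation directly to the previous word.
            pieces[-1] += content
        elif item["type"] in ("word", "punctuation"):
            pieces.append(content)
        # Any other item type is ignored, as the RecognitionResult model allows.
    return " ".join(pieces)
```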