The Speechmatics Automatic Speech Recognition REST API is used to submit ASR jobs and receive the results. The supported job types are transcription of audio files, and alignment of audio files with existing transcripts to add word or line timings.
Version: 1.1.0
Contact Email: support@speechmatics.com
Terms of service: https://www.speechmatics.com/terms-and-conditions/
BasePath: /v1
Schemes: HTTP
The base URL http://${APPLIANCE_HOST}:8082/v1/user/1/ is used for REST Speech API requests. If you are using the secure Speech API, the base URL is https://${APPLIANCE_HOST}/v1/user/1/ instead. The user ID component can be any positive integer; by convention we use 1 for all requests.
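As a rough illustration, the two base URL forms described above could be assembled as follows. The function name and its parameters are hypothetical; only port 8082, the /v1/user/ prefix, and the convention of using user ID 1 come from the text.

```python
def base_url(appliance_host: str, secure: bool = False, user_id: int = 1) -> str:
    """Build the Speech API base URL for an appliance (illustrative helper)."""
    if user_id < 1:
        raise ValueError("the user ID component must be a positive integer")
    if secure:
        # Secure Speech API: HTTPS, no explicit port in the URL.
        return f"https://{appliance_host}/v1/user/{user_id}/"
    # Plain HTTP Speech API: port 8082.
    return f"http://{appliance_host}:8082/v1/user/{user_id}/"
```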
Requests without a job ID component are used to create a new job, or to return a list of all submitted jobs.
Summary: Create a new job.
Parameters
Name | Located in | Description | Required | Schema |
---|---|---|---|---|
config | formData | JSON containing a JobConfig model indicating the type and parameters for the recognition job. | Yes | string |
data_file | formData | The data file to be processed. Alternatively, the data file can be fetched from a URL specified in JobConfig. | No | file |
Responses
Code | Description | Schema |
---|---|---|
201 | OK | CreateJobResponse |
400 | Bad request | ErrorResponse |
401 | Unauthorized | ErrorResponse |
403 | Forbidden | ErrorResponse |
500 | Internal Server Error | ErrorResponse |
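The two form fields above could be packed into a multipart/form-data body along the following lines. This is a minimal sketch using only the Python standard library: the helper name is hypothetical, the "transcription" type value and "en" language code are assumed example values, and nothing is sent over the network.

```python
import json
import uuid


def encode_create_job_form(config, data_name=None, data_bytes=None):
    """Encode the create-job form fields; returns (body, content_type).

    Hypothetical helper for illustration, not part of any Speechmatics SDK.
    """
    boundary = uuid.uuid4().hex
    delimiter = b"--" + boundary.encode("ascii")
    parts = []

    # Required field: the JobConfig model serialized as a JSON string.
    parts.append(delimiter)
    parts.append(b'Content-Disposition: form-data; name="config"')
    parts.append(b"")
    parts.append(json.dumps(config).encode("utf-8"))

    # Optional field: the data file (omit when JobConfig points at a URL).
    if data_bytes is not None:
        disposition = (
            'Content-Disposition: form-data; name="data_file"; '
            f'filename="{data_name}"'
        )
        parts.append(delimiter)
        parts.append(disposition.encode("utf-8"))
        parts.append(b"Content-Type: application/octet-stream")
        parts.append(b"")
        parts.append(data_bytes)

    # Closing boundary, followed by a final CRLF.
    parts.append(delimiter + b"--")
    parts.append(b"")

    body = b"\r\n".join(parts)
    return body, f"multipart/form-data; boundary={boundary}"


# Assumed example values for the config.
example_config = {"type": "transcription", "transcription_config": {"language": "en"}}
body, content_type = encode_create_job_form(example_config, "example.wav", b"\x00\x01")
```

The resulting body and content type can be passed to any HTTP client as a POST request; per the table above, a 201 response carries a CreateJobResponse.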
Summary: List all jobs.
Responses
Code | Description | Schema |
---|---|---|
200 | OK | RetrieveJobsResponse |
401 | Unauthorized | ErrorResponse |
500 | Internal Server Error | ErrorResponse |
Requests with a job ID component are used to view the status, transcript or audio data for a job, or remove a given job from the system.
Summary: Get job details, including progress and any error reports.
Parameters
Name | Located in | Description | Required | Schema |
---|---|---|---|---|
jobid | path | ID of the job. | Yes | string |
Responses
Code | Description | Schema |
---|---|---|
200 | OK | RetrieveJobResponse |
401 | Unauthorized | ErrorResponse |
404 | Not found | ErrorResponse |
500 | Internal Server Error | ErrorResponse |
Summary: Delete a job and remove all associated resources.
Parameters
Name | Located in | Description | Required | Schema |
---|---|---|---|---|
jobid | path | ID of the job to delete. | Yes | string |
Responses
Code | Description | Schema |
---|---|---|
200 | The job that was deleted. | DeleteJobResponse |
401 | Unauthorized | ErrorResponse |
404 | Not found | ErrorResponse |
500 | Internal Server Error | ErrorResponse |
Summary: Get the data file used as input to a job.
Parameters
Name | Located in | Description | Required | Schema |
---|---|---|---|---|
jobid | path | ID of the job. | Yes | string |
Responses
Code | Description | Schema |
---|---|---|
200 | OK | file |
401 | Unauthorized | ErrorResponse |
404 | Not found | ErrorResponse |
500 | Internal Server Error | ErrorResponse |
Summary: Get the transcript for a transcription job.
Parameters
Name | Located in | Description | Required | Schema |
---|---|---|---|---|
jobid | path | ID of the job. | Yes | string |
format | query | The transcription format (by default the json-v2 format is returned). | No | string |
Responses
Code | Description | Schema |
---|---|---|
200 | OK | RetrieveTranscriptResponse |
401 | Unauthorized | ErrorResponse |
404 | Not found | ErrorResponse |
500 | Internal Server Error | ErrorResponse |
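A transcript request URL with the optional format query parameter might be built as sketched below. The jobs/<jobid>/transcript path layout is an assumption (the path is not stated in the text above), and the function name is hypothetical.

```python
from urllib.parse import quote, urlencode


def transcript_url(base_url, job_id, fmt=None):
    """Build a transcript URL for a job (illustrative helper)."""
    # ASSUMPTION: the transcript lives under jobs/<jobid>/transcript.
    url = f"{base_url}jobs/{quote(str(job_id), safe='')}/transcript"
    if fmt is not None:
        # When format is omitted, the json-v2 format is returned by default.
        url += "?" + urlencode({"format": fmt})
    return url
```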
Name | Type | Description | Required |
---|---|---|---|
code | integer | The HTTP status code. | Yes |
error | string | The error message. | Yes |
detail | string | The details of the error. | No |
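As an illustration of the fields above, a client could turn an error body into a readable message roughly as follows. The function name is hypothetical, and the sketch assumes the body is a JSON object with the listed fields.

```python
import json


def describe_error(response_body: str) -> str:
    """Format an ErrorResponse JSON body as a one-line message."""
    err = json.loads(response_body)
    # code and error are required fields.
    message = f"{err['code']}: {err['error']}"
    # detail is optional, so only append it when present and non-empty.
    if err.get("detail"):
        message += f" ({err['detail']})"
    return message
```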
Name | Type | Description | Required |
---|---|---|---|
language | string | Language model to process the audio input, normally specified as an ISO language code | Yes |
additional_vocab | [object] | List of custom words or phrases that should be recognized. Alternative pronunciations can be specified to aid recognition. | No |
punctuation_overrides | [object] | Control punctuation settings. | No |
diarization | string | Specify whether speaker or channel labels are added to the transcript. The default is none. | No |
channel_diarization_labels | [string] | Transcript labels to use when collating separate input channels. | No |
For the diarization parameter, the following values are valid:
Value | Description |
---|---|
none | no speaker or channel labels are added. |
speaker | speaker attribution is performed based on acoustic matching; all input channels are mixed into a single stream for processing. |
channel | multiple input channels are processed individually and collated into a single transcript. |
speaker_change | the output indicates when the speaker in the audio changes. No speaker attribution is performed. This is a faster method than speaker. The reported speaker changes may not agree with speaker. |
channel_and_speaker_change | both channel and speaker_change are switched on. A speaker change is indicated if more than one speaker is recorded in a single channel. |
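A client-side check of the diarization value before submitting a job might look like the sketch below. The set of values is taken from the table above; the function name and the choice to fill in the default explicitly are assumptions made for illustration.

```python
VALID_DIARIZATION = {
    "none",
    "speaker",
    "channel",
    "speaker_change",
    "channel_and_speaker_change",
}


def check_transcription_config(cfg: dict) -> dict:
    """Return a copy of cfg with diarization validated and defaulted."""
    if "language" not in cfg:
        raise ValueError("language is required")
    # The documented default is "none".
    diarization = cfg.get("diarization", "none")
    if diarization not in VALID_DIARIZATION:
        raise ValueError(f"unsupported diarization value: {diarization!r}")
    return {**cfg, "diarization": diarization}
```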
If you want the transcription output in the SubRip Title (SRT) format and want to alter the default parameters that Speechmatics provides, you must provide output_config within the config object.
output_config
Name | Type | Description | Required |
---|---|---|---|
srt_overrides | object | Parameters that override the defaults for the SubRip (srt) subtitle format. max_line_length: maximum number of characters per subtitle line, including white space (default: 37). max_lines: maximum number of lines per subtitle segment (default: 2). | No |
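A config object carrying SRT overrides could be assembled as in the sketch below. The helper name is hypothetical, and the "transcription" type value and "en" language code are assumed example values; the defaults of 37 and 2 are the ones listed in the table above.

```python
import copy
import json


def with_srt_overrides(job_config, max_line_length=37, max_lines=2):
    """Return a copy of job_config with output_config.srt_overrides set."""
    if max_line_length < 1 or max_lines < 1:
        raise ValueError("max_line_length and max_lines must be positive")
    updated = copy.deepcopy(job_config)  # leave the caller's dict untouched
    updated.setdefault("output_config", {})["srt_overrides"] = {
        "max_line_length": max_line_length,
        "max_lines": max_lines,
    }
    return updated


# Assumed example values for the base config.
base_config = {"type": "transcription", "transcription_config": {"language": "en"}}
srt_config = with_srt_overrides(base_config, max_line_length=42, max_lines=1)
config_field = json.dumps(srt_config)  # string value for the config form field
```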
JSON object that contains various groups of job configuration parameters. Based on the value of type, a type-specific object such as transcription_config is required to be present to specify all configuration settings or parameters needed to process the job inputs as expected.
Name | Type | Description | Required |
---|---|---|---|
type | string | | Yes |
transcription_config | TranscriptionConfig | | Yes |
In the job response you will see balance and cost values returned, but these are not used by the appliance; they are only maintained for backwards compatibility with the legacy V1 SaaS, and should be ignored by clients.
Name | Type | Description | Required |
---|---|---|---|
id | string | The unique ID assigned to the job. Keep a record of this for later retrieval of your completed job. | Yes |
balance | integer | Not used | No |
cost | integer | Not used | No |
Name | Type | Description | Required |
---|---|---|---|
created_at | dateTime | The UTC date time the job was created. | Yes |
data_name | string | Name of the data file submitted for job. | Yes |
duration | integer | The file duration (in seconds). May be missing for fetch URL jobs. | No |
id | string | The unique id assigned to the job. | Yes |
status | string | The status of the job. running: the job is actively running. done: the job completed successfully. rejected: the job was accepted at first, but later could not be processed by the transcriber. deleted: the user deleted the job. expired: the system deleted the job, usually because it was in the done state for a very long time. | Yes |
config | JobConfig | | Yes |
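The status values above suggest a simple polling loop: keep checking while the job is running and stop on any other listed state. The sketch below is one possible shape, with the HTTP call abstracted behind a caller-supplied fetch_status function; all names, the polling interval, and the poll limit are hypothetical.

```python
import time

# Every documented status other than "running" ends the wait.
TERMINAL_STATUSES = {"done", "rejected", "deleted", "expired"}


def wait_for_job(fetch_status, poll_interval=5.0, max_polls=120, sleep=time.sleep):
    """Poll fetch_status() until the job leaves the running state.

    fetch_status is any callable returning the job's current status string.
    """
    for attempt in range(max_polls):
        status = fetch_status()
        if status in TERMINAL_STATUSES:
            return status
        if attempt < max_polls - 1:
            sleep(poll_interval)  # wait before the next check
    raise TimeoutError("job did not reach a terminal status in time")
```

Only a returned status of done means a transcript is available; rejected, deleted, and expired should be treated as failures by the caller.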
Name | Type | Description | Required |
---|---|---|---|
jobs | [JobDetails] | | Yes |
Name | Type | Description | Required |
---|---|---|---|
job | JobDetails | | Yes |
Name | Type | Description | Required |
---|---|---|---|
job | JobDetails | | Yes |
Summary information about an ASR job, to support identification and tracking.
Name | Type | Description | Required |
---|---|---|---|
created_at | dateTime | The UTC date time the job was created. | Yes |
data_name | string | Name of data file submitted for job. | Yes |
duration | integer | The data file audio duration (in seconds). | Yes |
id | string | The unique id assigned to the job. | Yes |
Summary information about the output from an ASR job, comprising the job type and configuration parameters used when generating the output.
Name | Type | Description | Required |
---|---|---|---|
created_at | dateTime | The UTC date time the transcription output was created. | Yes |
type | string | | Yes |
transcription_config | TranscriptionConfig | | No |
Name | Type | Description | Required |
---|---|---|---|
direction | string | | Yes |
List of possible job output item values, ordered by likelihood.
Name | Type | Description | Required |
---|---|---|---|
content | string | | Yes |
confidence | float | | Yes |
language | string | | Yes |
display | RecognitionDisplay | | No |
speaker | string | | No |
An ASR job output item. The primary item types are word and punctuation. Other item types may be present, for example to provide semantic information of different forms.
Name | Type | Description | Required |
---|---|---|---|
channel | string | | No |
start_time | float | | Yes |
end_time | float | | Yes |
type | string | New types of items may appear without being requested; unrecognized item types can be ignored. | Yes |
alternatives | [RecognitionAlternative] | | Yes |
Name | Type | Description | Required |
---|---|---|---|
format | string | Speechmatics JSON transcript format version number. | Yes |
job | JobInfo | | Yes |
metadata | RecognitionMetadata | | Yes |
results | [RecognitionResult] | | Yes |
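To show how the results list might be consumed, the sketch below flattens a transcript into plain text. It takes the first alternative of each item (alternatives are ordered by likelihood), attaches punctuation to the preceding word, and skips unrecognized item types. The function name and the spacing rule are assumptions made for illustration.

```python
def transcript_text(transcript: dict) -> str:
    """Join the most likely content of each word/punctuation item."""
    pieces = []
    for item in transcript["results"]:
        alternatives = item["alternatives"]
        if not alternatives:
            continue
        content = alternatives[0]["content"]  # most likely alternative
        if item["type"] == "punctuation" and pieces:
            pieces[-1] += content  # no space before punctuation
        elif item["type"] in ("word", "punctuation"):
            pieces.append(content)
        # Any other item type is ignored, as the model description allows.
    return " ".join(pieces)
```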