This release provides substantially improved speaker diarization and updates the language packs for Swedish and Arabic. It also adds metadata tags for English disfluencies in the JSON output and updates the Ubuntu Linux OS on which the container runs.
The legacy V1 API and related output formats are no longer supported. V1 examples have been removed from all batch container documentation and will no longer work. We recommend using the V2 API and the config.json object for all supported features, as documented in the Speech API guide for 8.1.0.
We recommend running the container on processors that support Advanced Vector Extensions 2 (AVX2) in order to take advantage of the latest performance optimisations. AVX is still supported.
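To check which of the two instruction sets a Linux host offers before deploying, the CPU flags can be read from /proc/cpuinfo. The following is a minimal sketch, not part of the product: the function names `cpu_flags` and `simd_support` are hypothetical, and it assumes a Linux host that exposes /proc/cpuinfo in the usual format.

```python
def cpu_flags(cpuinfo_text):
    """Collect CPU feature flags from text in /proc/cpuinfo format."""
    flags = set()
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        # x86 kernels label this line "flags"; some architectures use "Features".
        if sep and key.strip().lower() in ("flags", "features"):
            flags.update(value.lower().split())
    return flags


def simd_support(cpuinfo_text):
    """Return "avx2", "avx" or "none" for the given cpuinfo text."""
    flags = cpu_flags(cpuinfo_text)
    if "avx2" in flags:
        return "avx2"
    if "avx" in flags:
        return "avx"
    return "none"


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo", encoding="utf-8") as handle:
            print(simd_support(handle.read()))
    except OSError:
        print("could not read /proc/cpuinfo (is this a Linux host?)")
```

A result of "avx" means the container is still expected to run, but without the AVX2 optimisations mentioned above.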
Speaker gender is no longer included in json-v2 and txt output. Speaker gender identification is no longer a supported feature.
When using the txt format and requesting no diarization, there will be no Speaker:UU at the start of a transcript.
The json-v2 API schema has been updated from v2.4 to v2.6.
The following issues are addressed since the previous release:
Issue ID | Summary | Resolution Description |
---|---|---|
REQ-11135 | Bellini release introduced unwanted hesitations in transcripts | Where disfluencies ('hmm' or 'um') now exist in the English language pack, they carry an additional "disfluency" tag, which allows customers to identify them and then remove or analyze them if they wish |
REQ-17771 | Wide-space Unicode characters in Custom Dictionary cause jobs to fail | This is now fixed and wide-space characters should be accepted |
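The "disfluency" tag from REQ-11135 can be used to filter hesitations out of a transcript after it has been produced. The sketch below is illustrative only. It assumes that the json-v2 output holds a top-level "results" list, that each result has an "alternatives" list, and that the tag appears in a "tags" list on the first alternative; check these field names against the schema in the Speech API guide before relying on them. The function names are hypothetical.

```python
def is_disfluency(result):
    """True if the first alternative of a result is tagged as a disfluency."""
    alternatives = result.get("alternatives") or []
    if not alternatives:
        return False
    # Assumed layout: tags live in a "tags" list on each alternative.
    return "disfluency" in alternatives[0].get("tags", [])


def split_disfluencies(transcript):
    """Split results into (kept, removed) lists, preserving their order."""
    kept, removed = [], []
    for result in transcript.get("results", []):
        (removed if is_disfluency(result) else kept).append(result)
    return kept, removed


# Hand-written sample in the assumed layout, not real container output.
sample = {
    "results": [
        {"type": "word", "alternatives": [{"content": "so"}]},
        {"type": "word", "alternatives": [{"content": "um", "tags": ["disfluency"]}]},
        {"type": "word", "alternatives": [{"content": "yes"}]},
    ]
}

kept, removed = split_disfluencies(sample)
print(" ".join(r["alternatives"][0]["content"] for r in kept))  # so yes
```

Returning the removed items as well as the kept ones leaves room for analysis, for example counting hesitations per speaker, rather than only deleting them.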
Issue ID | Summary | Detailed Description and Possible Workarounds |
---|---|---|
REQ-1409 | Proteus HCL with <unk> causes out of memory error | A custom dictionary list that contains the word ' |
REQ-10160 | Advanced punctuation for Spanish (es) does not contain inverted marks. | Inverted marks [ ¿ ¡ ] are not currently available for Spanish advanced punctuation. |
REQ-10627 | Double full stops when acronym is at the end of the sentence | If there is an acronym at the end of the sentence, then a double full stop will be output, for example: "team G.B.." |
REQ-10634 | Putting "-" as an item in additional vocab configuration will cause the container to fail | Do not enter a "-" on its own in Custom Dictionary, either as an additional vocab item or in the sounds_like property. Hyphens are still supported when entered as part of phrases or words. |
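Two of the known issues above can be guarded against on the client side. The following is a rough sketch under stated assumptions, not an official workaround: the function names are hypothetical, the Custom Dictionary is assumed to be a list whose items are either plain strings or objects with a "content" value and an optional "sounds_like" list, and the full-stop clean-up is assumed to run on plain-text output.

```python
import re

# Exactly two consecutive full stops; an ellipsis ("...") is left alone.
_DOUBLE_STOP = re.compile(r"(?<!\.)\.\.(?!\.)")


def collapse_double_full_stops(text):
    """Replace a doubled full stop (REQ-10627) with a single one."""
    return _DOUBLE_STOP.sub(".", text)


def bare_hyphen_entries(additional_vocab):
    """Return the indexes of entries that contain a lone "-" (REQ-10634)."""
    bad = []
    for index, item in enumerate(additional_vocab):
        if isinstance(item, str):
            values = [item]
        else:
            # Assumed object layout: {"content": ..., "sounds_like": [...]}
            values = [item.get("content", "")] + list(item.get("sounds_like", []))
        if any(value.strip() == "-" for value in values):
            bad.append(index)
    return bad


print(collapse_double_full_stops("team G.B.."))  # team G.B.
print(bare_hyphen_entries(["-", "well-known", {"content": "CEO", "sounds_like": ["-"]}]))  # [0, 2]
```

Running the hyphen check before submitting a job lets the offending entries be removed up front instead of the container failing.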
Docker (17.06.0+) running on Ubuntu, Debian, Fedora or CentOS.
Pull the Batch Container Docker image from the Speechmatics Docker repository.
You have a login (URL, username and password) for the Speechmatics Docker repository, and have a Docker environment (version 17.06.0 or above) running.
Below is the complete list of languages supported by Speechmatics:
Container images are labelled using the following scheme, where language codes adhere to the ISO 639 standard:
batch-asr-transcriber-<language>:<version>
For example,
batch-asr-transcriber-en:8.1.0
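When pulling several language packs, the image names can be generated from the scheme above rather than typed by hand. This is a small sketch: `image_reference` is a hypothetical helper, the two-or-three-letter check on the language code is an assumption about which codes are in use, and registry.example.com is a placeholder, not the real Speechmatics repository address.

```python
import re


def image_reference(language, version, registry=""):
    """Build an image name of the form batch-asr-transcriber-<language>:<version>."""
    # Assumption: language codes are two or three lowercase letters.
    if not re.fullmatch(r"[a-z]{2,3}", language):
        raise ValueError(f"unexpected language code: {language!r}")
    name = f"batch-asr-transcriber-{language}:{version}"
    # Prefix the registry host only when one is supplied.
    return f"{registry.rstrip('/')}/{name}" if registry else name


for code in ("en", "sv", "ar"):
    # registry.example.com is a placeholder; substitute the URL from your login details.
    print("docker pull " + image_reference(code, "8.1.0", "registry.example.com"))
```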