Release Notes

Batch Container

High-Level Summary

This release provides substantially improved speaker diarization and updated language packs for Swedish and Arabic. It also adds metadata tags for English disfluencies in the JSON output and updates the Ubuntu Linux OS on which the Container runs.

Important Notices

The legacy V1 API and its related output formats are no longer supported. V1 examples have been removed from all batch container documentation and will no longer work. We recommend using the V2 API and the config.json configuration object for all supported features, as documented in the Speech API guide for 8.1.0.

We recommend running the container on processors that support Advanced Vector Extensions 2 (AVX2) in order to take advantage of the latest performance optimisations. AVX is still supported.

What's New

8.1.0

  • Improved speaker diarization
    • Speaker diarization has been completely redesigned internally and should now be significantly more accurate
    • Instead of gendered speaker labels (M1, F2), speaker labels are now gender-neutral (S1, S2, etc.) in the json-v2 and txt output. Speaker gender identification is no longer a supported feature
    • If you request txt output without diarization, the transcript will not start with Speaker:UU
    • Users may still request speaker diarization as before via the configuration object
    • The beta sensitivity parameters are being removed. They are still accepted by the API but no longer have any effect
    • This update to the speaker diarization feature means that the turnaround time for your transcript will be longer (see the documentation section on "Speaker Diarization" for further details)
  • Improved Swedish and Arabic language packs. Both now have advanced punctuation enabled (Swedish supports . ? , ! and Arabic supports . ؟ ، !)
  • Disfluency tagging in English. Certain English words that imply hesitation (e.g. 'hmm') now have a metadata tag of disfluency in the json-v2 output. This applies to English only
  • The Container has been updated to run on Ubuntu 20.04 (Focal)
  • The json-v2 API schema has been updated from v2.4 to v2.6
  • Speechmatics containers are now built using Docker BuildKit. If you host the Speechmatics container in an internal registry that uses both Nexus and self-signed certificates, please make sure you are on Nexus version 3.15 or above, or you may encounter errors.
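
As a rough illustration of requesting speaker diarization through the configuration object, the sketch below builds a config.json payload in Python. The field names (`type`, `transcription_config`, `language`, `diarization`) and the value `"speaker"` are assumptions based on the V2 API; the helper `build_config` is hypothetical. Check the Speech API guide for 8.1.0 for the authoritative schema.

```python
import json


def build_config(language, diarize=True):
    """Build a V2-style job configuration (field names are assumptions)."""
    transcription_config = {"language": language}
    if diarize:
        # Assumed value for speaker diarization in the V2 config object.
        transcription_config["diarization"] = "speaker"
    return {"type": "transcription", "transcription_config": transcription_config}


# Serialise the configuration so it can be saved as config.json.
config_text = json.dumps(build_config("en"), indent=2)
```

With diarization enabled, speaker labels in the json-v2 and txt output are expected to take the form S1, S2 and so on, as described above.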

Issues Fixed

The following issues have been addressed since the previous release:

| Issue ID | Summary | Resolution Description |
|---|---|---|
| REQ-11135 | Bellini release introduced unwanted hesitations in transcripts | Where disfluencies ('hmm' or 'um') now exist in the English language pack, they have an additional "disfluency" tag, which allows customers to spot them and remove or analyze them if they wish |
| REQ-17771 | Wide-space Unicode characters in Custom Dictionary cause jobs to fail | This is now fixed and wide-space characters should be accepted |
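
The "disfluency" tag makes it possible to filter hesitations out of a transcript after the job completes. The following is a minimal sketch only: it assumes that each entry in the json-v2 `results` list has an `alternatives` list whose first item carries `content` and an optional `tags` list. Those field names are assumptions, and `strip_disfluencies` is a hypothetical helper, so verify them against the json-v2 schema (v2.6) before relying on this.

```python
def strip_disfluencies(transcript):
    """Return the transcript tokens, skipping any tagged as a disfluency."""
    tokens = []
    for item in transcript.get("results", []):
        alternatives = item.get("alternatives") or []
        if not alternatives:
            continue
        best = alternatives[0]
        # Assumed layout: tags is a list of strings on the alternative.
        if "disfluency" in best.get("tags", []):
            continue
        tokens.append(best["content"])
    return tokens
```

To analyze hesitations rather than remove them, the same loop can collect the tagged items instead of skipping them.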

Known Limitations

| Issue ID | Summary | Detailed Description and Possible Workarounds |
|---|---|---|
| REQ-1409 | Proteus HCL with <unk> causes out of memory error | A custom dictionary list that contains the word '<unk>' causes the worker to crash. |
| REQ-10160 | Advanced punctuation for Spanish (es) does not contain inverted marks | Inverted marks [ ¿ ¡ ] are not currently available for Spanish advanced punctuation. |
| REQ-10627 | Double full stops when an acronym is at the end of the sentence | If there is an acronym at the end of the sentence, a double full stop will be output, for example: "team G.B.." |
| REQ-10634 | Putting "-" as an item in additional vocab configuration will cause the container to fail | Do not enter a "-" on its own in Custom Dictionary, either as an additional vocab item or in the sounds_like property. Hyphens are still supported when entered as part of phrases or words. |
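
Until these limitations are resolved, a client can guard against some of them before submitting a job or after receiving a transcript. The sketch below shows two possible workarounds under stated assumptions: Custom Dictionary entries are taken to be either plain strings or objects with `content` and an optional `sounds_like` list, and both helper names are hypothetical. It drops lone "-" and "<unk>" entries (REQ-10634, REQ-1409) and collapses a trailing double full stop (REQ-10627) while leaving a three-dot ellipsis alone.

```python
import re

# Entries known from the limitations above to crash or fail the container.
FORBIDDEN = {"-", "<unk>"}


def clean_additional_vocab(vocab):
    """Drop forbidden entries from an additional vocab list."""
    cleaned = []
    for entry in vocab:
        if isinstance(entry, str):
            entry = {"content": entry}
        content = entry.get("content", "").strip()
        if content in FORBIDDEN:
            continue
        item = {"content": content}
        sounds = [s for s in entry.get("sounds_like", []) if s.strip() not in FORBIDDEN]
        if sounds:
            item["sounds_like"] = sounds
        cleaned.append(item)
    return cleaned


def collapse_double_full_stop(text):
    """Replace exactly two consecutive full stops with one."""
    return re.sub(r"(?<!\.)\.\.(?!\.)", ".", text)
```

Normalising every entry to the object form keeps the output uniform; if the API in use only accepts plain strings for entries without sounds_like, adapt the helper accordingly.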

Supported Platforms

Docker (17.06.0+) running on Ubuntu, Debian, Fedora or CentOS.

Installation

Pull the Batch Container Docker image from the Speechmatics Docker repository.

Pre-requisites

You have a login (URL, username and password) for the Speechmatics Docker repository, and have a Docker environment (version 17.06.0 or above) running.

Related Documentation

  • Speechmatics Batch Container Quick Start Guide version 8.1.0
  • Speechmatics Batch Container API Guide version 8.1.0

Supported Languages

Below is the complete list of languages supported by Speechmatics:

  • English (en)
  • German (de)
  • Spanish (es)
  • French (fr)
  • Portuguese (pt)
  • Japanese (ja)
  • Korean (ko)
  • Dutch (nl)
  • Italian (it)
  • Swedish (sv)
  • Danish (da)
  • Polish (pl)
  • Catalan (ca)
  • Hindi (hi)
  • Russian (ru)
  • Mandarin (cmn)
  • Norwegian (no)
  • Arabic (ar)
  • Bulgarian (bg)
  • Czech (cs)
  • Greek (el)
  • Finnish (fi)
  • Hungarian (hu)
  • Croatian (hr)
  • Lithuanian (lt)
  • Latvian (lv)
  • Romanian (ro)
  • Slovak (sk)
  • Slovenian (sl)
  • Turkish (tr)
  • Malay (ms)

Container images are labelled using the following scheme, where language codes adhere to the ISO 639 standard:

batch-asr-transcriber-<language>:<version>

For example,

batch-asr-transcriber-en:8.1.0
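
The naming scheme can be sketched in Python as follows. The helper `image_name` is hypothetical and simply checks a language code against the list above before formatting the label. The registry host is deliberately omitted, since it depends on your Speechmatics Docker repository login.

```python
# Language codes taken from the Supported Languages list above.
SUPPORTED_LANGUAGES = {
    "en", "de", "es", "fr", "pt", "ja", "ko", "nl", "it", "sv", "da",
    "pl", "ca", "hi", "ru", "cmn", "no", "ar", "bg", "cs", "el", "fi",
    "hu", "hr", "lt", "lv", "ro", "sk", "sl", "tr", "ms",
}


def image_name(language, version):
    """Format a container image label as batch-asr-transcriber-<language>:<version>."""
    if language not in SUPPORTED_LANGUAGES:
        raise ValueError(f"unsupported language code: {language}")
    return f"batch-asr-transcriber-{language}:{version}"
```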