This is the documentation for a previous version of our product.

Real-time Container

High Level Summary

This release updates the language packs for Swedish and Arabic. It also adds metadata tags for English disfluencies to the JSON output and updates the Ubuntu Linux version on which the Container runs.

Important Notice

This release changes how audio sent in AddAudio messages is buffered. An AudioAdded message now means that the corresponding audio is definitively ready for transcription. The Real-time Container has a buffer of up to 10 seconds of speech, or 500 AddAudio messages. If this buffer is exceeded, the Container returns no further AudioAdded messages until the buffer has capacity again. Please ensure that any integration with the Real-time Container can tolerate this behaviour, for example by sending and receiving messages on separate threads or by using some other mechanism that avoids blocking. For the Real-time virtual Container, where audio is sent faster than real-time, it is recommended to use a semaphore of size 500, or to limit unacknowledged audio to 10 seconds, to avoid unnecessary memory consumption.
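
The semaphore recommendation above can be sketched in Python with asyncio. This is a minimal illustration, not Speechmatics client code: `send`, `acks`, and the in-process `simulate` stand-in for the Container are hypothetical names, and a real client would send AddAudio and read AudioAdded over its own connection.

```python
import asyncio

MAX_IN_FLIGHT = 500  # buffer size stated in the notice above


async def stream_audio(chunks, send, acks, max_in_flight=MAX_IN_FLIGHT):
    """Send chunks while keeping at most max_in_flight of them unacknowledged.

    send: hypothetical coroutine that transmits one AddAudio message.
    acks: hypothetical asyncio.Queue that receives one item per AudioAdded.
    Returns the highest number of unacknowledged messages observed.
    """
    slots = asyncio.Semaphore(max_in_flight)
    in_flight = 0
    peak = 0
    chunks = list(chunks)

    async def receiver(expected):
        nonlocal in_flight
        for _ in range(expected):
            await acks.get()   # one AudioAdded has arrived
            in_flight -= 1
            slots.release()    # free a slot for the sender

    recv_task = asyncio.create_task(receiver(len(chunks)))
    for chunk in chunks:
        await slots.acquire()  # waits here once the limit is reached
        in_flight += 1
        peak = max(peak, in_flight)
        await send(chunk)
    await recv_task
    return peak


async def simulate(n_chunks=2000, limit=MAX_IN_FLIGHT):
    """Run stream_audio against an in-process stand-in for the Container."""
    pending = asyncio.Queue()
    acks = asyncio.Queue()

    async def fake_send(chunk):
        await pending.put(chunk)

    async def fake_container():
        for _ in range(n_chunks):
            await pending.get()
            await asyncio.sleep(0)        # yield, standing in for processing time
            await acks.put("AudioAdded")

    container = asyncio.create_task(fake_container())
    peak = await stream_audio([b"\x00" * 320] * n_chunks, fake_send, acks, limit)
    await container
    return peak


if __name__ == "__main__":
    print("peak unacknowledged messages:", asyncio.run(simulate()))
```

Sending and receiving run as separate tasks, so a pause in AudioAdded responses only stalls the sender at `slots.acquire()` rather than blocking the whole client.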

It is recommended to run Speechmatics containers on processors that support Advanced Vector Extensions 2 (AVX2) in order to take advantage of the latest performance optimisations.
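
One way to check a host for AVX2 before deploying is to look for the `avx2` flag in `/proc/cpuinfo`. The sketch below assumes an x86 Linux host. The function names are illustrative and are not part of any Speechmatics tooling.

```python
def cpu_supports_avx2(cpuinfo_text):
    """Return True if any 'flags' line in /proc/cpuinfo-style text lists avx2."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags" and "avx2" in value.split():
            return True
    return False


def host_supports_avx2(path="/proc/cpuinfo"):
    """Check the local Linux host; returns False if the file cannot be read."""
    try:
        with open(path, encoding="utf-8") as fh:
            return cpu_supports_avx2(fh.read())
    except OSError:
        return False
```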

What's New

1.3.0

  • Improved Swedish and Arabic language packs, both of which now have advanced punctuation enabled (Swedish supports . ? , ! and Arabic supports . ؟ ، !)
  • Disfluency tagging in English
  • The API version has been updated to 2.6
  • The Container now runs on Ubuntu Focal (OS 20.04)
  • Speechmatics containers are now built using Docker BuildKit. If you host the Speechmatics container in an internal registry that uses both Nexus and self-signed certificates, please make sure you are on Nexus version 3.15 or above; otherwise you may encounter errors.

Known Limitations

The following are known issues in this release:

| Issue ID | Summary | Detailed Description and Possible Workarounds |
|---|---|---|
| REQ-10634 | Putting "-" as an item in additional vocab configuration will cause the container to fail | Do not enter just a "-" on its own in Custom Dictionary, either as an additional vocab item or in the sounds_like property. Hyphens are still supported when entered as part of phrases or words |
| REQ-13240 | Chinese (cmn) container crashes occasionally when using certain additional vocabulary | Do not use whitespace characters in additional vocabulary sounds_like |
| REQ-16256 | Audio swapping between 8kHz and 16kHz causes memory leak | Repeatedly swapping audio between 8kHz and 16kHz files can increase memory usage over very long periods, eventually causing the container to crash. If memory usage in this scenario becomes excessive, it is recommended to restart the container |
| REQ-17771 | Wide-space Unicode characters in Custom Dictionary cause jobs to fail | This is now fixed and wide-spaced characters should be accepted |
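
The workarounds for REQ-10634 and REQ-13240 can be applied on the client before a configuration is sent. The following is a sketch only: it assumes each additional vocab item is a dict with a `content` string and an optional `sounds_like` list, and `check_additional_vocab` is a hypothetical helper, not a Speechmatics function.

```python
def check_additional_vocab(items, language="en"):
    """Return a list of problems found in an additional vocab list.

    items: list of dicts, each with a "content" string and an optional
    "sounds_like" list of strings (assumed layout).
    """
    problems = []
    for index, item in enumerate(items):
        content = item.get("content", "")
        if content.strip() == "-":
            problems.append(f"item {index}: content is a lone hyphen (REQ-10634)")
        for alt in item.get("sounds_like", []):
            if alt.strip() == "-":
                problems.append(f"item {index}: sounds_like is a lone hyphen (REQ-10634)")
            # The whitespace restriction is documented for the cmn container only.
            if language == "cmn" and any(ch.isspace() for ch in alt):
                problems.append(f"item {index}: whitespace in sounds_like (REQ-13240)")
    return problems
```

An empty list means none of the patterns described in the table were found. Hyphens inside words or phrases, such as "co-operate", pass the check.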

Resolved Issues

The following issues have been resolved since the previous release:

| Issue ID | Summary | Resolution Description |
|---|---|---|
| REQ-11135 | Unwanted hesitations in transcripts. | For the English language pack, Speechmatics now tags hesitation words ('umm') with a metadata tag of "disfluency". Users can use this tag in post-processing to filter or analyze such words. This work does not change how well disfluencies are recognised in transcript output |
| REQ-11136 | Transcripts are written directly to the Real-time Virtual Appliance and Container logs | Transcripts are no longer written directly to the logs or persisted to disk, even temporarily, for security reasons. |
| REQ-14795 | Configuration information was not written to logs in StartRecognitionMessage | Transcription configuration information is now logged as part of the StartRecognitionMessage. Individual custom dictionary entries are redacted |
| REQ-15515 | Internal buffer limit of 500 AddAudio messages/10 seconds of audio | The Container now has a buffer. If you send audio faster than real-time and exceed 500 AddAudio messages or 10 seconds of audio, you will not receive an AudioAdded response until there is capacity again. Please ensure your client connection is resilient to avoid audio being dropped |
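
The post-processing mentioned under REQ-11135 might look like the sketch below, which drops words tagged "disfluency". The field names `results`, `alternatives`, `content` and `tags` are assumptions about the JSON layout, and `SAMPLE` is hand-written illustrative data rather than captured Container output. Check both against the API reference for your version.

```python
import json

# Illustrative message only; the field layout is an assumption.
SAMPLE = """
{
  "results": [
    {"type": "word", "alternatives": [{"content": "so"}]},
    {"type": "word", "alternatives": [{"content": "um", "tags": ["disfluency"]}]},
    {"type": "word", "alternatives": [{"content": "hello"}]}
  ]
}
"""


def drop_disfluencies(transcript):
    """Return the content of each result, skipping those tagged "disfluency"."""
    words = []
    for result in transcript.get("results", []):
        alternatives = result.get("alternatives", [])
        if not alternatives:
            continue
        best = alternatives[0]  # use the first alternative for each result
        if "disfluency" in best.get("tags", []):
            continue
        words.append(best["content"])
    return words


if __name__ == "__main__":
    print(drop_disfluencies(json.loads(SAMPLE)))
```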

Supported Languages

These are the General Availability (GA) release notes for the Real-time ASR container images. The following languages are supported:

  • English (en)
  • German (de)
  • Spanish (es)
  • French (fr)
  • Portuguese (pt)
  • Japanese (ja)
  • Korean (ko)
  • Dutch (nl)
  • Italian (it)
  • Swedish (sv)
  • Danish (da)
  • Polish (pl)
  • Catalan (ca)
  • Hindi (hi)
  • Russian (ru)
  • Mandarin (cmn)
  • Norwegian (no)
  • Arabic (ar)
  • Bulgarian (bg)
  • Czech (cs)
  • Greek (el)
  • Finnish (fi)
  • Hungarian (hu)
  • Croatian (hr)
  • Lithuanian (lt)
  • Latvian (lv)
  • Romanian (ro)
  • Slovak (sk)
  • Slovenian (sl)
  • Turkish (tr)
  • Malay (ms)

Container images are labelled using the following scheme, where language codes adhere to the ISO 639 standard:

rt-asr-transcriber-<language>:<version>

For example,

rt-asr-transcriber-en:1.3.0
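
For deployment scripts, the labelling scheme above can be captured in a small helper. This is an illustrative sketch: `image_reference` and the optional `registry` prefix are hypothetical, and the language set is copied from the list of supported languages in these notes.

```python
SUPPORTED_LANGUAGES = {
    "en", "de", "es", "fr", "pt", "ja", "ko", "nl", "it", "sv", "da",
    "pl", "ca", "hi", "ru", "cmn", "no", "ar", "bg", "cs", "el", "fi",
    "hu", "hr", "lt", "lv", "ro", "sk", "sl", "tr", "ms",
}


def image_reference(language, version, registry=None):
    """Build rt-asr-transcriber-<language>:<version>, optionally registry-prefixed."""
    if language not in SUPPORTED_LANGUAGES:
        raise ValueError(f"unsupported language code: {language!r}")
    name = f"rt-asr-transcriber-{language}:{version}"
    return f"{registry}/{name}" if registry else name
```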

Supported Platforms

Docker 17.06.0+

Installation

Pull the container image from the Speechmatics Docker registry.

Prerequisites

  • Docker (17.06.0 or above).
  • Login credentials (URL, username and password) for the Speechmatics Docker registry.