This is the documentation for a previous version of our product. Click here to see the latest version.

Real-time Virtual Appliance

High-Level Summary

This release updates the language packs for Swedish and Arabic. It also adds metadata tags for English disfluencies to the JSON output and updates the Ubuntu Linux OS on which the appliance runs.

Speechmatics currently supports two Python libraries for use with our Real-time products. smwebsocket-py is recommended for the Real-time Virtual Appliance only, and speechmatics-python is recommended for both our Real-time Container and our Real-time Virtual Appliance. In a future release we will exclusively support speechmatics-python as our preferred Python library. We recommend that you familiarise yourself with this library and use it wherever possible. Please contact support@speechmatics.com if you require access to this library.

This release changes how the buffer behaves when you send AddAudio messages. When an AudioAdded message is sent, it now means that the audio is definitively ready for transcription. The RTVA has a buffer of up to 10 seconds of speech, or 500 AddAudio messages, per worker. If this buffer is exceeded, no further AudioAdded messages will be returned from the Appliance for that worker until the buffer has capacity. Please ensure that any integration with the Real-time Virtual Appliance can tolerate this buffer, for example by sending and receiving messages in separate threads or by using some other mechanism that avoids blocking. Where audio is sent to the Real-time Virtual Appliance faster than real-time, we recommend using a semaphore of size 500, or limiting unacknowledged audio to 10 seconds, to avoid unnecessary memory consumption.
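
The semaphore approach described above can be sketched in Python as follows. This is a minimal illustration, not code from either Speechmatics library: `AudioSendWindow`, `before_send`, `on_audio_added` and `simulate` are hypothetical names, and a local queue stands in for the appliance connection. The sketch limits only the number of unacknowledged messages; it does not track the 10-second audio limit.

```python
import queue
import threading
from typing import Optional

MAX_UNACKED = 500  # per-worker AddAudio limit quoted in these release notes


class AudioSendWindow:
    """Hypothetical helper: caps AddAudio messages awaiting an AudioAdded reply."""

    def __init__(self, limit: int = MAX_UNACKED) -> None:
        # BoundedSemaphore raises ValueError if more replies than sends are counted.
        self._slots = threading.BoundedSemaphore(limit)

    def before_send(self, timeout: Optional[float] = None) -> bool:
        """Call before each AddAudio; blocks (up to timeout) while the window is full."""
        return self._slots.acquire(timeout=timeout)

    def on_audio_added(self) -> None:
        """Call from the receiving thread for each AudioAdded message."""
        self._slots.release()


def simulate(num_chunks: int, limit: int) -> int:
    """Send chunks to a stand-in receiver; return the peak number left unacknowledged."""
    window = AudioSendWindow(limit)
    wire: "queue.Queue" = queue.Queue()  # stands in for the network connection
    lock = threading.Lock()
    in_flight = 0
    peak = 0

    def receiver() -> None:
        nonlocal in_flight
        while True:
            seq = wire.get()
            if seq is None:  # sentinel: sender has finished
                break
            with lock:
                in_flight -= 1
            window.on_audio_added()  # simulated AudioAdded frees one slot

    thread = threading.Thread(target=receiver)
    thread.start()
    for seq in range(num_chunks):
        window.before_send()  # waits here once `limit` chunks are unacknowledged
        with lock:
            in_flight += 1
            peak = max(peak, in_flight)
        wire.put(seq)  # simulated AddAudio
    wire.put(None)
    thread.join()
    return peak
```

In a real client, `before_send()` would be called in the thread that sends AddAudio and `on_audio_added()` in the thread that handles incoming AudioAdded messages, so that a full buffer pauses sending without blocking the receiver.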

If you are importing an appliance using VMware, please note that the hardware_version of the appliance has been updated from 9 to 11. This allows the appliance to automatically take advantage of performance optimisations that use Advanced Vector Extensions 2 (AVX2). This should have no effect on the appliance, assuming you are on a version of VMware ESXi supported by Speechmatics (versions 6.5 onwards). If you are importing an appliance through VirtualBox and AVX2 is not automatically enabled, you can still take advantage of the performance benefits of AVX2 by following these guidelines.

We recommend running the appliance on processors that support AVX2 in order to take advantage of the latest performance optimisations.
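
One way to confirm AVX2 support on a Linux x86 host is to look for the `avx2` flag in `/proc/cpuinfo`. The sketch below assumes that file format; the function names are illustrative and are not part of any Speechmatics tooling.

```python
def cpu_flags(cpuinfo_text: str) -> set:
    """Collect CPU feature flags from /proc/cpuinfo-style text (Linux x86 layout assumed)."""
    flags = set()
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        # Only lines whose key is exactly "flags" list CPU features.
        if sep and key.strip() == "flags":
            flags.update(value.split())
    return flags


def supports_avx2(cpuinfo_text: str) -> bool:
    """Return True if any processor entry advertises the avx2 flag."""
    return "avx2" in cpu_flags(cpuinfo_text)
```

On a Linux host this could be called as `supports_avx2(open("/proc/cpuinfo").read())`. Inside a virtual machine the result also depends on whether the hypervisor exposes AVX2 to the guest.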

What's New

Disfluency tagging in English
Improved Swedish and Arabic language packs; both now have advanced punctuation enabled (Swedish supports . ? , ! and Arabic supports . ؟ ، !)
The Host VM of the Appliance now runs on Ubuntu Bionic (OS 18.04)
The API version has been updated to 2.6

Issues Fixed

The following issues are addressed since the previous release:

Issue ID	Summary	Resolution Description
REQ-10634	Putting "-" as an item in the `additional_vocab` configuration causes the container to fail	Do not enter a "-" on its own in Custom Dictionary, either as an additional vocab item or in the `sounds_like` property. Hyphens are still supported when entered as part of phrases or words.
REQ-11136	Transcripts are written directly to the Real-time Virtual Appliance logs	Transcripts are no longer written directly to the logs or persisted to disk, even temporarily, for security reasons.
REQ-14795	Configuration information was not written to logs in StartRecognitionMessage	Transcription configuration information is now logged as part of the StartRecognitionMessage. Individual custom dictionary entries are redacted.
REQ-15515	`AudioAdded` messages sent prematurely	Previously, `AudioAdded` messages were sent immediately, which could lead to an unbounded buffer. Now, when an `AudioAdded` message is sent, it means that the audio is definitively ready for transcription. If you send audio faster than real-time and exceed 500 `AddAudio` messages or 10 seconds of audio, you will not receive an `AudioAdded` response until there is capacity again. Please ensure your client connection is resilient to avoid audio being dropped.
REQ-17771	Wide-space Unicode characters in Custom Dictionary cause jobs to fail	This is now fixed and wide-spaced characters should be accepted
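
To follow the guidance for REQ-10634 on the client side, bare "-" entries can be filtered out before the Custom Dictionary is sent. The sketch below assumes each item is either a plain string or a mapping with a `content` key and an optional `sounds_like` list; `drop_bare_hyphens` is a hypothetical helper, not part of a Speechmatics library.

```python
def drop_bare_hyphens(additional_vocab):
    """Return a cleaned copy of the vocab list with standalone "-" entries removed."""
    cleaned = []
    for item in additional_vocab:
        # Normalise plain strings to the mapping form (assumed item shape).
        entry = {"content": item} if isinstance(item, str) else dict(item)
        if entry.get("content", "").strip() == "-":
            continue  # a bare hyphen as the vocab item itself
        sounds = [s for s in entry.get("sounds_like", []) if s.strip() != "-"]
        if sounds:
            entry["sounds_like"] = sounds
        else:
            entry.pop("sounds_like", None)  # nothing valid left, drop the key
        cleaned.append(entry)
    return cleaned
```

Hyphens inside words or phrases, such as "speech-to-text", pass through unchanged.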

Known Limitations

The following are known issues in this release:

Issue ID	Summary	Detailed Description and Possible Workarounds
REQ-1409	Proteus HCL with `<unk>` causes out of memory error	A custom dictionary list that contains the word `<unk>` causes the worker to crash.
REQ-7549	Memory leak affecting gRPC	There is a small memory leak in the gRPC Python server https://github.com/grpc/grpc/issues/5913.
REQ-10160	Advanced punctuation for Spanish (es) does not contain inverted marks.	Inverted marks [ ¿ ¡ ] are not currently available for Spanish advanced punctuation.
REQ-10627	Double full stops when acronym is at the end of the sentence	If there is an acronym at the end of the sentence, then a double full stop will be output, for example: "team G.B.."
REQ-11792	Speaker change token positioning is incorrect	We are aware that the speaker change token is consistently misplaced after the first word of the new speaker's sentence rather than before it.
REQ-12202	High memory usage when using custom dictionary	It has been observed that when using custom dictionary an additional 800-1700MB of memory is required (depending on the size of the wordlist used).
REQ-16256	Heavy usage of RAM when swapping between 8kHz and 16kHz input	Where multiple persistent workers configured with Custom Dictionary swap between 8kHz and 16kHz input, a memory leak can occur that causes the container to crash. If this starts to impact services, we recommend restarting all the services with the management API, or dropping the worker count to 1 and then increasing it again.
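
Until REQ-10627 is resolved, one possible client-side workaround is to collapse the doubled full stop in the final transcript text. The sketch below is an assumption about how such post-processing could look, not a Speechmatics feature; it works on plain text only and leaves ellipses of three or more dots untouched.

```python
import re

# Exactly two consecutive full stops, not part of a longer run of dots.
_DOUBLE_STOP = re.compile(r"(?<!\.)\.\.(?!\.)")


def collapse_double_full_stops(text: str) -> str:
    """Replace each isolated '..' with a single '.'."""
    return _DOUBLE_STOP.sub(".", text)
```

For example, `collapse_double_full_stops("team G.B..")` returns `"team G.B."`.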

Supported Platforms

Virtual Appliance image (OVA) for installation on:

VMware ESXi 6.5+ or VMware Workstation Player
VirtualBox 5.2+
Amazon EC2

See the Installation and Admin Guide for details on the minimum specifications for the VM. The maximum number of concurrent jobs (maxworkers) that you can run on a single appliance is 30.

Form Factors

There are five variants of the Real-time Virtual Appliance.

Variant	Image Size	Max. Disk Space	Languages
nano	10GB	40GB	en
mini	15GB	40GB	en, de, es
midi	27GB	60GB	en, de, es, fr, ko, ja, nl, pt
maxi	44GB	80GB	en, de, es, fr, ko, ja, nl, pt, it, da, pl, ca, hi, ru, sv
plus	45GB	80GB	en, cmn, no, ar, bg, cs, el, fi, hu, hr, lt, lv, ro, sk, sl, tr, ms

Upgrade Path

Remove the license from your old appliance (see the Admin Guide), then import the new OVA and configure networking as described in the Installation and Admin Guide. Once the OVA has been imported, you will need to re-apply your existing license code.

Installation

Upload the OVA to VMware ESXi, VMware Workstation Player, or VirtualBox. See the Installation and Admin Guide for more information.
