System requirements

The Speechmatics Real-time Virtual Appliance operates on a hypervisor host system. For this version of the appliance, the following hypervisors are supported:

  • VMware®
  • VirtualBox
  • AWS EC2

For the virtual appliance to operate as required, the host must meet the requirements and have the resources available as defined below.

Host requirements

The host machine requires a processor that meets the following microarchitecture specification:

  • If using the standard model, at least the Broadwell class is required
  • If using the enhanced model, a chip of at least the Cascade Lake class is required, as this is the minimum specification that supports AVX512_VNNI (for more information, see below)
  • If using the enhanced model, it is also recommended that the hardware supports the AVX512_VNNI flag, as this will greatly improve transcription processing speed
    • Examples of this among popular hosting providers include the Microsoft Azure DSV-4 class and the Amazon M5n EC2 server class
    • If you are using the enhanced model and running on VMware, you will have to upgrade to hardware_version 18 to take advantage of the AVX512_VNNI flag. Please note this is only supported by ESXi version 7.0 onwards
    • If you are using VMware and the enhanced model and encounter performance issues, we recommend allocating dedicated memory and/or processors to the appliance. How to apply dedicated processors in VMware is documented here; setting memory is documented here
  • If you encounter performance issues when running the enhanced model, disabling hyperthreading can also improve transcription speed. How to do so when running on Amazon Web Services is shown here, and for Microsoft Azure please see here

AVX flags

The hardware you run the appliance on must support Advanced Vector Extensions (AVX). Advanced Vector Extensions are necessary to allow Speechmatics to carry out transcription:

  • For the standard model, a processor that supports at least Advanced Vector Extensions 2 (AVX2) is required.
    • You should also ensure your hypervisor is enabled to use AVX2.
  • For the enhanced model, it is recommended to run the appliance on hardware that supports the AVX512_VNNI flag in addition to AVX2, which will substantially improve transcription processing speed.

To see what AVX flags are supported by the CPU of your host system, you can run the following query via the Management API of the appliance:

curl -X GET "https://{HOSTAPPLIANCE}/v1/management/cpuinfo" -H "accept: application/json"

The response contains information about the host CPU. Supported AVX flags are listed in the flags field of the response. An example is below:

{
  "usage_percentage": 2.5,
  "architecture": "X86_64",
  "model_name": "Intel(R) Xeon(R) Silver 4116 CPU @ 2.10GHz",
  "cpus": "2",
  "vendor": "GenuineIntel",
  "hyperthreading": false,
  "flags": "3dnowprefetch abm adx aes apic arat arch_capabilities arch_perfmon avx avx2 avx512_vnni bmi1 bmi2 clflush cmov constant_tsc cpuid cpuid_fault cx16 cx8 de f16c flush_l1d fma fpu fsgsbase fxsr hypervisor ibpb ibrs invpcid invpcid_single lahf_lm lm mca mce md_clear mmx movbe msr mtrr nonstop_tsc nopl nx pae pat pcid pclmulqdq pdpe1gb pge pni popcnt pse pse36 pti rdrand rdseed rdtscp sep smap smep ss ssbd sse sse2 sse4_1 sse4_2 ssse3 stibp syscall tsc tsc_adjust tsc_deadline_timer tsc_reliable vme x2apic xsave xsaveopt xtopology"
}
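
As a rough illustration, the flags field can also be checked programmatically rather than by eye. The sketch below assumes the response has the shape shown in the example above, with flags returned as a single space-separated string. The function name avx_support and the keys it returns are illustrative only and are not part of the Management API.

```python
import json


def avx_support(cpuinfo_json: str) -> dict:
    """Report which AVX-related flags appear in a cpuinfo response body."""
    info = json.loads(cpuinfo_json)
    # Assumption: "flags" is one space-separated string, as in the example response.
    flags = set(info.get("flags", "").split())
    # Set membership avoids false matches such as "avx" inside "avx2".
    return {
        "avx": "avx" in flags,
        "avx2": "avx2" in flags,
        "avx512_vnni": "avx512_vnni" in flags,
    }


# Hypothetical, shortened response body used only to exercise the function.
sample = '{"flags": "avx avx2 fma sse4_2"}'
print(avx_support(sample))
```

For the standard model the avx2 entry should be true; for the enhanced model, a true avx512_vnni entry indicates the host can take advantage of the faster processing described above.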

Useful links

See below for the minimum Real-time Virtual Appliance VM (guest) specifications. The host machine must have enough resources (processor, memory and storage) to run the hypervisor, the guest VMs you intend to host on it, and any other processes you expect to run on it. Vendor guidelines should be followed for other host requirements and for the installation process.

For VMware, the document Performance Best Practices for VMware vSphere® 6.0 contains a comprehensive overview of hardware considerations and recommendations on how to optimize your host platform. See https://www.vmware.com/support.html for up-to-date technical information on VMware.

For VirtualBox, please consult the online documentation: https://www.virtualbox.org/wiki/Documentation

For Amazon EC2, the following link explains how to set up a VM using an Amazon S3 bucket to store the OVA file: https://docs.aws.amazon.com/vm-import/latest/userguide/vmimport-image-import.html.

Virtual Appliance system requirements

Real-time Virtual Appliance

The Speechmatics Real-time Virtual Appliance must be allocated the following minimum specification:

  • 2 vCPU
  • 8GB RAM
  • Up to 38GB hard disk space

For each concurrent input stream using the standard model, the appliance requires an additional 1 vCPU and at least 1.5GB RAM.

If you are using the custom dictionary (additional words) feature then each concurrent input stream that is configured to use it will require up to 3GB RAM.

If you are using the enhanced model, then each concurrent input stream that is configured to use it will require up to 3GB RAM. If the enhanced model is used in conjunction with other features such as Custom Dictionary and you encounter performance issues, then up to 5GB RAM may be required.
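
The figures above can be combined into a rough sizing estimate. The sketch below is a worst-case calculation under stated assumptions, not an official sizing tool: it assumes every stream needs 1 additional vCPU regardless of model (the text states this only for the standard model), that the 3GB and 5GB figures replace the 1.5GB figure rather than add to it, and that the "up to" values are used as upper bounds. All names in it are hypothetical.

```python
from dataclasses import dataclass

# Minimum allocation for the Real-time Virtual Appliance, from the list above.
BASE_VCPUS = 2
BASE_RAM_GB = 8.0


@dataclass
class StreamConfig:
    enhanced: bool = False
    custom_dictionary: bool = False


def ram_per_stream_gb(stream: StreamConfig) -> float:
    """Upper-bound RAM for one concurrent stream (assumed, see lead-in)."""
    if stream.enhanced and stream.custom_dictionary:
        return 5.0  # enhanced model combined with Custom Dictionary
    if stream.enhanced or stream.custom_dictionary:
        return 3.0  # enhanced model, or custom dictionary on its own
    return 1.5      # standard model


def estimate_resources(streams: list) -> tuple:
    """Return (vCPUs, RAM in GB) for the appliance plus the given streams."""
    # Assumption: 1 additional vCPU per stream, whatever the model.
    vcpus = BASE_VCPUS + len(streams)
    ram_gb = BASE_RAM_GB + sum(ram_per_stream_gb(s) for s in streams)
    return vcpus, ram_gb


# Example: four standard streams and two enhanced streams.
streams = [StreamConfig()] * 4 + [StreamConfig(enhanced=True)] * 2
print(estimate_resources(streams))
```

Treat the result as a starting point and confirm it against observed usage, since actual memory consumption per stream may be lower than the upper bounds used here.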

Batch Virtual Appliance

For operation in batch mode, the following minimum specifications are required:

  • 2 vCPUs
  • 8GB RAM
  • Up to 44GB hard disk space

Important Message on IOPS

Heavy usage of the appliance at scale can sometimes result in a very high percentage usage of volume throughput. If this is the case, we recommend increasing the maximum IOPS supported by your hardware to a value between 8,000 and 12,000. This is not necessary in all circumstances, but may result in better performance if you are running more than 10 concurrent workers. Increasing the IOPS will also increase the cost of resource usage. If you use AWS, setting the volume type to io2 is also recommended in this scenario. How to change the maximum IOPS supported by your hardware is documented here for AWS, here for Microsoft Azure, and here for VMware. You may need to do this if:

  • You are using the maximum number of workers supported by that appliance size, or close to it
  • The jobs being processed are all long files, and diarization is requested