Services

The virtual appliance has internal services that are required for operation.

There are system-wide services, and services specific to transcription workers for a given language.

Batch Virtual Appliance

For the Batch Virtual Appliance, this table lists the services:

Service Name (Begins with)	Description	Required Status
batch_bja...	V2 REST API	Running
batch_rpc_gateway...	RPC endpoint	Running
batch_license...	Licensing service	Running
batch_linkerd...	Internal Networking	Running
batch_management...	Management functions	Running
batch_ba_worker...	Job Queue management	Running
batch_monitoring_ui...	Monitoring Web GUI	Running
batch_batch-cron...	Completed job clean-up	Running
batch_v1compatibility...	V1 REST API	Running
jobs...	Used to perform ASR and transcription	Running
batch_swaggerui...	Swagger UI for certain APIs	Running
batch_nginxlb...	HTTP gateway	Running
batch_postgres...	Jobs Database	Running

Each service always has a current state. The possible states are:

Service Status	Description
running	Service has started and is running
created	Service is in the process of starting
exited	Service has stopped and is no longer running

Service status

The Management API can be used to check that all services have the required status to operate (see the table above). Example: GET request to list the services and their corresponding status:

curl -L -X GET "http://${APPLIANCE_HOST}:8080/v1/management/services" \
    -H 'Accept: application/json' \
    | jq

If the appliance has been licensed, you will see a response like this (for the Batch Virtual Appliance):

{
  "service_status": [
    {
      "service": "job-50",
      "status": "running"
    },
    {
      "service": "batch_bja.1.qegys910pamsduryf9tujm2db",
      "status": "running"
    },
    {
      "service": "batch_swaggerui.1.0limj506dokkscu4mvy00gt70",
      "status": "running"
    },
    {
      "service": "batch_rpc_gateway.1.l0aoi8f9cvkcko8s5jhrio8b6",
      "status": "running"
    },
    {
      "service": "batch_batch-cron.1.uahr5xz4edjx11fm06bflhthx",
      "status": "running"
    },
    {
      "service": "batch_v1compatibility.1.5t9hbwk30zqt2cnx5xzjf9zkt",
      "status": "running"
    },
    {
      "service": "batch_nginxlb.1.p2mq6ho4k5hho180zkog2maej",
      "status": "running"
    },
    {
      "service": "batch_license.1.urx4q1zru7430lhv9669h9xxy",
      "status": "running"
    },
    {
      "service": "batch_management.1.5r92dvzwu0021g7mc9pb7qtg0",
      "status": "running"
    },
    {
      "service": "batch_postgres.1.yvef8y8g8tq8nt62bc6ow987z",
      "status": "running"
    },
    {
      "service": "batch_monitoring_ui.1.m29c6ne7621y6dapq5fjojxj3",
      "status": "running"
    },
    {
      "service": "batch_linkerd.1.30ng6rrqiar7fqgkb9tesn9uw",
      "status": "running"
    },
    {
      "service": "batch_ba_worker.1.yliwg0uynenv2jcno9x423brc",
      "status": "running"
    }
  ]
}
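
A response of this shape can also be checked automatically rather than by eye. The following is a minimal sketch, assuming jq is installed on the machine making the request; the function name not_running is illustrative and not part of the appliance. It reads the JSON from standard input and prints every service whose status is not running.

```shell
#!/bin/sh
# Print "<service> <status>" for every service that is not in the "running" state.
# Reads the JSON returned by GET /v1/management/services from standard input.
not_running() {
    jq -r '.service_status[] | select(.status != "running") | "\(.service) \(.status)"'
}

# Example usage (requires network access to the appliance):
#   curl -s -L "http://${APPLIANCE_HOST}:8080/v1/management/services" \
#       -H 'Accept: application/json' | not_running
```

Empty output means that every listed service reports running.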

Real-time Virtual Appliance

For the Real-time Virtual Appliance, this table lists the services:

Service Name (Begins with)	Description	Required Status
rt_rt-server...	Load balancing and handling of job requests	Running
rt_linkerd...	Proxy	Running
rt_management...	Management API calls	Running
appliance_autoscaler...	Required only during OVA build	Exited
rt_redis...	Handles worker availability	Running
rt_rpc_gateway...	Internal service management	Running
rt_monitoring_ui...	Monitoring Web GUI	Running
rt_nginx...	Proxying requests	Running
rt_rt-janitor...	Completed job clean-up	Running
rt_license...	Licensing	Running
rt_autoscaler...	Used to perform ASR and transcription	Running

Each service always has a current state. The possible states are:

Service Status	Description
running	Service has started and is running
created	Service is in the process of starting
exited	Service has stopped and is no longer running

Service status

curl -L -X GET "http://${APPLIANCE_HOST}:8080/v1/management/services" \
    -H 'Accept: application/json' \
    | jq

This request can be used to check that all services have the required status. If successful, you will see the following response:

{
  "service_status": [
    {
      "service": "rt_rt-server.1.jgwwfsybbxmdq8205dqdzb2r4", 
      "status": "running"
    },
    {
      "service": "rt_linkerd.1.tetkusm9u3iowqn2w71ok2nfp", 
      "status": "running"
    },
    {
      "service": "rt_management.1.wk2kse9inpaie5nnby57zgjck", 
      "status": "running"
    },
    {
      "service": "appliance_autoscaler-bootstrap-task_run_f92039b26280", 
      "status": "exited"
    },
    {
      "service": "rt_redis.1.osd52r5esip3cvpsa3bsyfa3o", 
      "status": "running"
    },
    {
      "service": "rt_rpc_gateway.1.mhb1yk8i50qxqs50jmu573u2o", 
      "status": "running"
    },
    {
      "service": "rt_monitoring_ui.1.qzir2168b01zroej5kh1gac0x", 
      "status": "running"
    },
    {
      "service": "rt_nginxlb.1.z9uwrh458ttct6mg2ii1cp427",
      "status": "running"
    },
    {
      "service": "rt_rt-janitor.1.1eqrp4vre3eqg213uceye41zm", 
      "status": "running"
    },
    {
      "service": "rt_license.1.jeop3k5hscque3vw9qo24jmtu", 
      "status": "running"
    },
    {
      "service": "rt_autoscaler.1.jbpngc1rokzf7zs7i7r97uxij", 
      "status": "running"
    }
  ]
}
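
On the Real-time Virtual Appliance the required status is not the same for every service, so a simple "everything is running" test would flag the appliance_autoscaler entry incorrectly. The sketch below, which assumes jq is installed and uses the illustrative function name unexpected_status, compares each service against the required status from the table above.

```shell
#!/bin/sh
# Print every service whose status differs from its required status.
# Per the table above, names beginning with "appliance_autoscaler" should be
# "exited"; every other service should be "running".
# Reads the JSON returned by GET /v1/management/services from standard input.
unexpected_status() {
    jq -r '
        .service_status[]
        | (if (.service | startswith("appliance_autoscaler"))
           then "exited" else "running" end) as $want
        | select(.status != $want)
        | "\(.service): \(.status) (expected \($want))"
    '
}

# Example usage (requires network access to the appliance):
#   curl -s -L "http://${APPLIANCE_HOST}:8080/v1/management/services" \
#       -H 'Accept: application/json' | unexpected_status
```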

Service restart

Note: After a service is restarted, it will have a random string identifier appended to its name.

When troubleshooting, you may need to restart all the services. During the restart, all transcription will stop. The following command performs a service restart:

curl -X DELETE "http://${APPLIANCE_HOST}:8080/v1/management/services" \
    -H 'Accept: application/json'
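
Because transcription stops while the services restart, it can be useful to wait until they are all running again before submitting work. The following is a sketch only: wait_until and services_ready are hypothetical helper names, the attempt count and delay are arbitrary, and services_ready assumes jq is installed.

```shell
#!/bin/sh
# Run a check command repeatedly until it succeeds or the attempts run out.
# Usage: wait_until MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
wait_until() {
    max_attempts=$1
    delay=$2
    shift 2
    attempt=1
    while [ "$attempt" -le "$max_attempts" ]; do
        "$@" && return 0
        # Only sleep if another attempt will follow.
        [ "$attempt" -lt "$max_attempts" ] && sleep "$delay"
        attempt=$((attempt + 1))
    done
    return 1
}

# Hypothetical check: succeeds only when the services endpoint answers and
# no listed service reports a status other than "running".
services_ready() {
    curl -s -f -L "http://${APPLIANCE_HOST}:8080/v1/management/services" \
        -H 'Accept: application/json' \
        | jq -e '[.service_status[] | select(.status != "running")] | length == 0' >/dev/null
}

# Example usage: poll every 10 seconds, for up to 30 attempts.
#   wait_until 30 10 services_ready && echo "all services running"
```

On the Real-time Virtual Appliance the appliance_autoscaler entry is expected to be exited, so the check inside services_ready would need to exclude it.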

Access Logs

The individual services on the system provide log files that can be collected to help with troubleshooting. The service name must be provided when retrieving logs. See above for instructions on how to view the names of the running services.

The following parameters are available when accessing logs:

Name	Description	Required or Optional
name	Name of the service to collect the logs for	Required
count	Number of log lines to return. Defaults to 100; set to -1 to return all lines	Optional

Example: GET request to retrieve logs for the batch_monitoring_ui service:

curl -L -X GET "http://${APPLIANCE_HOST}:8080/v1/management/logs/batch_monitoring_ui.1.mtvn0r47qb7durnl0fmuimsc0" \
    -H 'Accept: application/json' \
    | jq -r '.log_lines'
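
Because the identifier appended to a service name changes after a restart, the full name in a command like the one above has to be looked up first. A minimal sketch of that lookup, assuming jq is installed and using the illustrative function name service_name:

```shell
#!/bin/sh
# Print the full name of the first service whose name begins with the given prefix.
# Reads the JSON returned by GET /v1/management/services from standard input.
# Prints nothing if no service matches.
service_name() {
    jq -r --arg prefix "$1" \
        'first(.service_status[].service | select(startswith($prefix)))'
}

# Example usage (requires network access to the appliance):
#   name=$(curl -s -L "http://${APPLIANCE_HOST}:8080/v1/management/services" \
#       -H 'Accept: application/json' | service_name batch_monitoring_ui)
#   curl -s -L "http://${APPLIANCE_HOST}:8080/v1/management/logs/${name}" \
#       -H 'Accept: application/json' | jq -r '.log_lines'
```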

To download all the logs as a ZIP file (for instance, to provide information for a support ticket), use the following command:

curl -L -X GET "http://${APPLIANCE_HOST}:8080/v1/management/logs/zip" \
    -H 'Accept: application/json' \
    -o ./speechmatics.zip

It is also possible to do this directly from the Swagger UI by entering the following URL in your browser: http://${APPLIANCE_HOST}:8080/docs/#/Management/ZipLogs, and then clicking on the download link when the ZIP file is ready.

Download log files (ZIP) from Swagger UI

System restart

If the virtual appliance becomes unresponsive, you may need to restart it. In this case, we recommend restarting the system through the Management API, like this:

curl -L -X DELETE "http://${APPLIANCE_HOST}:8080/v1/management/reboot"

If the Management API is not available, you should reboot the appliance from the hypervisor console. For further information on how to restart the virtual machine via the console, please follow the manufacturer's advice.

System shutdown

If you need to shut down the appliance, we recommend doing so through the Management API, like this:

curl -L -X DELETE "http://${APPLIANCE_HOST}:8080/v1/management/shutdown"

If the Management API is not available, you should shut down the appliance from the hypervisor console. For further information on how to shut down the virtual machine via the console, please follow the manufacturer's advice.

Troubleshooting

If you observe unexpected behavior with the Batch Virtual Appliance, perform the following checks and actions:

Check the license is valid (see licensing)
Check the worker services are running
Check the resources (CPU, memory & disk) to ensure they are not exhausted
Restart all the services
Restart the virtual appliance
Collect logs and contact Speechmatics support: support@speechmatics.com.

Transcription job failure

If your transcription job fails with an error job status, you can find more information in the logs from the jobs container (retrieved using the Management API, as previously described). Search the logs for the job ID corresponding to your failure.

If you see a SoftTimeLimitExceeded exception, the job took longer than anticipated and was terminated. This is typically caused by poor VM performance, in particular slow disk I/O operations (IOPS). If issues persist, it may be necessary to improve the disk I/O performance of the underlying host, or to increase the RAM available to the VM so that memory caches can be used. Please consult the section above on Host requirements, and the optimization advice specific to your hypervisor, to ensure that you are not over-committing your compute resources.
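
The log search described above can be sketched with grep. This is illustrative only: the function names are hypothetical, and it assumes the job ID appears as a whole word on the relevant log lines, which may not match the actual log format. The log text is read from standard input, for example from the output of the log retrieval command shown earlier.

```shell
#!/bin/sh
# Print only the log lines that mention the given job ID as a whole word,
# so that searching for job 50 does not also match job 150.
lines_for_job() {
    grep -F -w -- "$1"
}

# Succeed if any line on standard input mentions SoftTimeLimitExceeded.
has_soft_timeout() {
    grep -q -F 'SoftTimeLimitExceeded'
}

# Example usage, with the logs saved to a local file named jobs.log (hypothetical):
#   lines_for_job 50 < jobs.log
#   lines_for_job 50 < jobs.log | has_soft_timeout && echo "job 50 hit the time limit"
```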

Illegal instruction errors

If jobs fail repeatedly and you see Illegal instruction errors in the log information for these jobs, it is likely that the host hardware you are running on does not support AVX. The host machine for the Batch Virtual Appliance must meet the following minimum specification: Intel® Xeon® CPU E5-2630 v4 (Sandy Bridge) 2.20GHz (or equivalent). This is important because these chipsets (and later ones) support Advanced Vector Extensions (AVX). The machine learning algorithms used by Speechmatics ASR require the performance optimizations that AVX provides.

You can check this by looking in the management log when the appliance starts up. The following message means that your host's CPU does not support AVX, or that your hypervisor does not have AVX support:

2019-03-26 16:53:07,136 sm_management.app   ERROR   Processor not AVX capable. Tensorflow language models cannot run.
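
Independently of the appliance logs, CPU feature flags can be inspected on any Linux system through /proc/cpuinfo. The sketch below assumes you have shell access to a Linux hypervisor host, or to another Linux guest on the same hypervisor; has_cpu_flag is a hypothetical helper name. The same check works for the avx2 flag.

```shell
#!/bin/sh
# Succeed if the given CPU flag (for example avx or avx2) is listed in a
# cpuinfo-style file. The file defaults to /proc/cpuinfo, which exists on Linux.
# Matching is whole-word, so checking for avx does not match avx2.
has_cpu_flag() {
    grep -m 1 '^flags' "${2:-/proc/cpuinfo}" | grep -q -w -- "$1"
}

# Example usage:
#   has_cpu_flag avx  && echo "AVX available"  || echo "AVX missing"
#   has_cpu_flag avx2 && echo "AVX2 available" || echo "AVX2 missing"
```

A flag seen on the host but missing inside a guest suggests that the hypervisor is not passing the feature through to virtual machines.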

A console is available to help with advanced troubleshooting in the event that the Management API is unavailable. It is described in the next section.

AVX2 Warning

The Speechmatics Appliance is optimized for hardware that supports AVX2. If you see the message below, AVX2 is not available on your hardware and jobs may run more slowly:

WARNING ([5.5.675~1-0c22]:SetupMathLibrary():asrengine/asrengine.cc:356) Unable to set CNR mode to 10 (AVX2); falling back to 9. The transcription might be slower and/or use more CPU resource.

Console for Advanced Troubleshooting

In the event that the Management API is unavailable (it is unresponsive, or there is no network connectivity), you can use the console to restore network connectivity, restart the appliance, or view information about services. To reach the console, use your hypervisor's GUI to access the logon screen for the appliance.

From this screen use the CTRL+ALT+F5 key combination to get to the console. Once you are in the console you have the following menu options available:

License
Networking
Reboot
Services
Shutdown
Tools
Workers

The home screen shows high-level information about the appliance: IP addressing, software version and license status.

In the System status panel the API responding indicator shows the state of the Management API. Network status shows the IP address the appliance is currently configured with, and ASR status shows the license state and available storage space on the appliance.

In the event that you need to provide information to Speechmatics support you may be asked to connect to the console and provide this information. This section provides some tips on how to use the console to perform basic troubleshooting yourself.

Note: We recommend that you use the Management API for most troubleshooting tasks as it is easier to use. The console can be used in the event that the Management API is unavailable, but it does not provide all the features of the Management API.

License

The Licensing Troubleshooting section provides detailed instructions on how to use the Management API to resolve common licensing issues. If you cannot use the Management API, you can still use the console to check the license status and perform basic licensing steps.
