The legacy V1 API and related output formats are no longer supported. V1 API examples have been removed from all batch container documentation. We recommend using the V2 API and the config.json object documented in the Speech API. Use of the V2 API is described in the Speech API document for 7.0.0.
/work, but is instead /home/smuser/work
The following issues have been addressed since the previous release:
Issue ID | Summary | Resolution Description |
---|---|---|
REQ-15418 | Custom dictionary with splitting characters gets incorrect pronunciation | When a Custom Dictionary contains words with splitting characters where a number follows a word (for example, covid-19), the correct pronunciations are now created. Splitting characters include ["-", "_", "/", "<", ">", ":", " "]. This applies to all languages. For v7.0.1 only. |
REQ-13442 | Some Unicode characters would cause transcription to fail | This has now been resolved. |
REQ-13990 | The batch container will not run as a non-root user on Docker | This is now supported. Guidance on how to do this is in the Quick Start Guide. |
REQ-14062 | Occasionally a file in Spanish would not be fully transcribed | This has been resolved with the latest release of Spanish. |
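
To illustrate the kind of Custom Dictionary entry affected by REQ-15418, the following minimal sketch builds a V2-style config.json that contains the word covid-19 and flags which entries contain a splitting character. The field names `transcription_config` and `additional_vocab` are assumed from the V2 config format, and the helper names `has_splitting_char` and `build_config` are hypothetical; the Speech API document remains the authoritative reference.

```python
import json

# Splitting characters as listed in the REQ-15418 resolution note.
SPLITTING_CHARS = ["-", "_", "/", "<", ">", ":", " "]


def has_splitting_char(word):
    """Return True if the word contains any of the splitting characters."""
    return any(ch in word for ch in SPLITTING_CHARS)


def build_config(language, words):
    """Build a V2-style config dict.

    Field names are assumptions; check them against the Speech API document.
    """
    return {
        "type": "transcription",
        "transcription_config": {
            "language": language,
            "additional_vocab": [{"content": w} for w in words],
        },
    }


words = ["covid-19", "Speechmatics"]
config = build_config("en", words)
affected = [w for w in words if has_splitting_char(w)]

print(json.dumps(config, indent=2))
print("Entries with splitting characters:", affected)
```

Entries reported as affected are the ones whose pronunciations changed with this fix, so they are the natural candidates to re-test after upgrading.
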
Issue ID | Summary | Detailed Description and Possible Workarounds |
---|---|---|
REQ-1409 | Proteus HCL with <unk> causes out of memory error | A custom dictionary list that contains the word ' |
REQ-10160 | Advanced punctuation for Spanish (es) does not contain inverted marks. | Inverted marks [ ¿ ¡ ] are not currently available for Spanish advanced punctuation. |
REQ-10627 | Double full stops when acronym is at the end of the sentence | If there is an acronym at the end of the sentence, then a double full stop will be output, for example: "team G.B.." |
REQ-11135 | A previous release (6.1.0) introduced unwanted hesitations in transcripts. | Due to changes in the way that training data is now ingested to improve the accuracy of spontaneous speech for English (en), there is a greater likelihood that hesitations will be included in the output transcripts. We plan to support a hesitation filtering capability in a future release for customers who do not want hesitations in their transcripts. |
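
Until REQ-10627 is resolved, one possible client-side workaround is to post-process plain-text transcripts. The sketch below is not part of the product: the function name `collapse_double_full_stop` is hypothetical, and the regular expression assumes that acronyms appear as single letters each followed by a full stop, as in the "G.B.." example above.

```python
import re

# A dotted acronym (two or more letter-plus-full-stop pairs, e.g. "G.B.")
# followed by exactly one extra full stop. The negative lookahead leaves
# longer runs of dots, such as an ellipsis after an acronym, untouched.
_DOUBLE_STOP = re.compile(r"\b((?:[A-Za-z]\.){2,})\.(?!\.)")


def collapse_double_full_stop(text):
    """Remove the duplicated full stop that follows a sentence-final acronym."""
    return _DOUBLE_STOP.sub(r"\1", text)


print(collapse_double_full_stop("We are supporting team G.B.."))
```

Ordinary ellipses and abbreviations followed by a space are left unchanged, because the pattern only fires when a dotted acronym is immediately followed by one surplus full stop.
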
Docker (17.06.0+) running on Ubuntu, Debian, Fedora or CentOS.
Pull the Batch Container Docker image from the Speechmatics Docker repository.
You have a login (URL, username and password) for the Speechmatics Docker repository, and have a Docker environment (version 17.06.0 or above) running.
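
The Docker version requirement can be checked before pulling the image. The following is a minimal sketch, assuming the `docker` CLI is on the PATH and that `docker --version` prints a line containing a three-part version number; the helper names `parse_docker_version` and `meets_minimum` are hypothetical.

```python
import re
import subprocess

MIN_DOCKER_VERSION = (17, 6, 0)  # 17.06.0, as stated in the prerequisites


def parse_docker_version(output):
    """Extract (major, minor, patch) from `docker --version` style output."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", output)
    if match is None:
        raise ValueError("no version number found in: %r" % output)
    return tuple(int(part) for part in match.groups())


def meets_minimum(output, minimum=MIN_DOCKER_VERSION):
    """Return True if the reported Docker version is at least the minimum."""
    return parse_docker_version(output) >= minimum


if __name__ == "__main__":
    try:
        result = subprocess.run(
            ["docker", "--version"], capture_output=True, text=True, check=True
        )
        ok = meets_minimum(result.stdout)
        print(result.stdout.strip(), "->", "OK" if ok else "too old")
    except (OSError, subprocess.CalledProcessError) as exc:
        # Docker is not installed, not on the PATH, or returned an error.
        print("Could not run docker:", exc)
```

Comparing versions as integer tuples avoids the pitfalls of string comparison, where "17.06.0" and "9.0.0" would sort incorrectly.
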
For a complete list of languages that are supported by the Speechmatics Container, including those which have custom dictionary support, please go to the Speechmatics website: https://www.speechmatics.com/language-support/