Large vocabulary continuous speech recognition, also called voice-to-text
conversion, is the key technology for enabling content-based information access
in audio and video documents. Once a document has been processed automatically,
the linguistic information and metadata
in the structured document are available for further downstream processing,
providing direct access to relevant portions of audio documents. Among the most
common applications of our technology are audio and audiovisual data mining
(broadcast and telephone data), speech analytics, media monitoring, media asset
management, speech transcription and subtitling.
We provide solutions and expertise for core speech processing technologies in
many languages. For example, speech-to-text transcription is available for the
Arabic, Cantonese, Czech, Dutch, English, Finnish, French, German, Greek, Hebrew, Hindi,
Hungarian, Italian, Latvian, Lithuanian, Mandarin, Pashto, Persian, Polish,
Portuguese, Romanian, Russian, Spanish, Swahili, Swedish, Turkish and Urdu
languages, with several others under development. Our language identification
module identifies the spoken language from a set of 82 languages, and clients
can create models for their desired language set. We also work with our
clients to adapt, tune or create specific models or systems tailored to their
needs.
The VoxSigma speech recognition software suite is the latest generation of
transcription software offered by Vocapia Research, building upon accurate
statistical modeling techniques for speech production and perception. It is
offered as a stand-alone solution under Linux and as a Web service.
We offer services to adapt, tune or create specific models or systems
tailored to your needs. Tailoring models to your application is the best way
to obtain high accuracy, which is essential to maximizing your ROI.
In addition to our online speech recognition service, we offer services for batch
processing of very large quantities of data such as archives.