Large vocabulary continuous speech recognition, also called speech-to-text or
voice-to-text conversion, is the key technology for enabling content-based
information access in audio and video documents. Once a document has been
automatically processed, the linguistic information and metadata in the
resulting structured document are available for downstream processing,
providing direct access to relevant portions of audio documents. Among the most
common applications of our technology are audio and audiovisual data mining
(broadcast and telephone data), speech analytics, media monitoring, media asset
management, speech transcription and subtitling.
We provide solutions and expertise for core speech processing technologies in
many languages. For example, speech-to-text transcription is available for
the Arabic, Dutch, English, French, Finnish, German, Greek, Italian, Mandarin,
Polish, Portuguese, Romanian, Russian and Spanish languages, with several
others under development. Our language identification module identifies the
spoken language from a set of 50 languages, and clients can create models for
their desired language set. We also work with our clients to adapt, tune or create
specific models or systems tailored to their application needs.
[REQUEST FORM]
The VoxSigma software suite is the latest generation of transcription
software offered by Vocapia Research, building upon accurate statistical
modeling techniques for speech production and perception. The suite is
available both as a stand-alone solution and as a Web service.