Large vocabulary continuous speech recognition, also called
voice-to-text conversion, is a
key technology for enabling content-based information access in audio and video
documents. Once a document has been processed automatically, the linguistic
information and metadata in the resulting structured document are available for
further downstream processing,
providing direct access to relevant portions of audio documents. Among the most
common applications of our technology are audio and audiovisual data mining
(broadcast and telephone data), speech analytics, media monitoring, media asset
management, speech transcription and subtitling.
We provide solutions and expertise for core speech processing technologies in
many languages. For example, speech-to-text transcription is available for the
Arabic, Dutch, English, Finnish, French, German, Greek, Italian, Lithuanian, Mandarin,
Polish, Portuguese, Romanian, Russian, Spanish and Turkish languages, with
several others under development. Our language identification module identifies
the spoken language from a set of 50 languages, and clients can create models
for their desired language set. We also work with our clients to adapt, tune or
create specific models or systems tailored to their application needs.
The VoxSigma speech recognition software suite is the latest generation of
transcription software offered by Vocapia Research, building upon accurate
statistical modeling techniques for speech production and perception. The
VoxSigma software suite is offered as a stand-alone solution under Linux and
as a Web service.
We offer services to adapt, tune or create specific models or systems
tailored to your needs. Tailoring models to your application is the most
reliable way to obtain the best possible results, and high accuracy is
essential to maximizing your ROI.
In addition to our online speech recognition service, we offer services for batch
processing of very large quantities of data such as archives.