Vocapia Research develops technologies for multilingual, large vocabulary
speech recognition (also called speech-to-text conversion), automatic audio
segmentation, language identification and speaker recognition.
Vocapia's VoxSigma™ speech-to-text software suite delivers state-of-the-art
performance in many languages for a variety of audio data types, including
broadcast data, parliamentary hearings and conversational speech.
Large vocabulary continuous speech recognition is the key technology for
enabling content-based information access in audio and video documents.
Most of the linguistic information in audiovisual data is carried by the audio
channel; once transcribed, it can be accessed using text-based tools.
Among the most common applications of our technology are audio and audiovisual
data mining (broadcast data, call center data), media monitoring, media asset
management, and speech transcription.
We provide solutions and expertise for core speech processing
technologies such as speech recognition, language identification and
speaker recognition. Speech-to-text systems are available for many
languages, including Arabic, Dutch, English, French, German, Italian,
Mandarin, Polish, Russian and Spanish, with several others under
development. We work with our clients to adapt, tune or create specific
models or systems tailored to their application needs.
The VoxSigma software suite is the latest generation of
speech processing software offered by Vocapia Research, building upon
accurate statistical modeling techniques for speech production and
speech perception.
It is offered as a stand-alone solution and as a Web service.