Speech-to-text by Vocapia

Leading edge speech processing technology

Vocapia Research develops leading-edge, multilingual speech processing technologies. These technologies include large vocabulary continuous speech recognition, automatic audio segmentation, language identification, speaker recognition and audio-text synchronization. Vocapia's VoxSigma™ speech-to-text software suite delivers state-of-the-art performance in many languages for a variety of audio data types, including broadcast data, parliamentary hearings and conversational data.

VoxSigma Software Suite

The VoxSigma software suite provides large vocabulary speech-to-text capabilities in multiple languages, as well as audio segmentation and partitioning, speaker identification and language recognition. The suite is designed for professional users who need to transcribe large quantities of audio and video documents, such as broadcast data, either in batch mode or in real time. Dedicated versions target the transcription of conversational telephone speech and call-center data.
 

VoxSigma SaaS

VoxSigma is available as a Web service. The VoxSigma SaaS offers full speech-to-text, audio indexing and speech-text alignment capabilities via a REST API over HTTPS, allowing customers to benefit quickly from regular improvements to the technology and to take advantage of additional features offered by the online environment. The VoxSigma SaaS is available 24/7/365 with failover servers and geographic redundancy.

Speech-to-Text Conversion

Large vocabulary continuous speech recognition, also called speech-to-text or voice-to-text conversion, is the key technology for enabling content-based information access in audio and video documents. Once a document has been automatically processed, the linguistic information and metadata in the resulting structured document are available for downstream processing, providing direct access to relevant portions of the audio. Among the most common applications of our technology are audio and audiovisual data mining (broadcast and telephone data), speech analytics, media monitoring, media asset management, speech transcription and subtitling.

We provide solutions and expertise for core speech processing technologies in many languages. For example, speech-to-text transcription is available for the Arabic, Dutch, English, French, Finnish, German, Greek, Italian, Mandarin, Polish, Portuguese, Romanian, Russian and Spanish languages, with several others under development. Our language identification module identifies the spoken language from a set of 50 languages, and clients can create models for their desired language set. We also work with our clients to adapt, tune or create specific models or systems tailored to their application needs.

Some Applications

Broadcast monitoring & audiovisual archive indexing   The VoxSigma software suite offers advanced language technologies, including speech-to-text transcription, language identification and speaker diarization, to transform raw audio data into structured and searchable XML documents, enabling users to access content in video documents.

Debate and lecture transcription and indexing   VoxSigma helps reduce the time and cost of producing transcripts, minutes and/or summaries of public presentations and meetings. VoxSigma also aligns existing transcriptions with audio files, significantly enhancing their usability. The same alignment technology is used for audiobooks.

Telephone Speech Analytics   Vocapia's speech-to-text and language identification software processes telephone data, making recorded calls searchable and analyzable via text-based methods. VoxSigma is used by call management companies and for defense applications. The transcripts are further analyzed and categorized, generating statistics about customer calls.

 

Transcription of business conference calls   Vocapia's speech-to-text technology significantly reduces the cost of transcribing business conference calls. The audio document is converted to a fully annotated XML document that includes speech and non-speech segments, speaker labels, words with time codes, high-quality confidence scores and punctuation. Vocapia offers services to adapt, tune or create specific models or systems tailored to the needs of the application.
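
As a rough illustration of how such an annotated document might be consumed downstream, the sketch below parses a small XML transcript and flags low-confidence words for review. The element and attribute names (`AudioDoc`, `SpeechSegment`, `Word`, `spkid`, `stime`, `dur`, `conf`), the sample data and the 0.8 threshold are assumptions made for this example, not the documented VoxSigma schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample transcript. Element and attribute names are
# assumptions for illustration, not the documented VoxSigma format.
SAMPLE_XML = """\
<AudioDoc>
  <SegmentList>
    <SpeechSegment spkid="S1" stime="0.00" etime="1.50">
      <Word stime="0.10" dur="0.30" conf="0.98">good</Word>
      <Word stime="0.45" dur="0.40" conf="0.95">morning</Word>
    </SpeechSegment>
    <SpeechSegment spkid="S2" stime="1.60" etime="3.00">
      <Word stime="1.70" dur="0.35" conf="0.62">thanks</Word>
      <Word stime="2.10" dur="0.50" conf="0.91">everyone</Word>
    </SpeechSegment>
  </SegmentList>
</AudioDoc>
"""


def read_words(xml_text):
    """Return one dict per word: speaker, text, start, end, confidence."""
    root = ET.fromstring(xml_text)
    words = []
    for segment in root.iter("SpeechSegment"):
        speaker = segment.get("spkid")
        for word in segment.iter("Word"):
            start = float(word.get("stime"))
            words.append({
                "speaker": speaker,
                "text": (word.text or "").strip(),
                "start": start,
                "end": start + float(word.get("dur")),
                "conf": float(word.get("conf")),
            })
    return words


def low_confidence(words, threshold=0.8):
    """Select words whose confidence score falls below the threshold."""
    return [w for w in words if w["conf"] < threshold]


if __name__ == "__main__":
    for w in low_confidence(read_words(SAMPLE_XML)):
        print(f'{w["speaker"]} {w["start"]:.2f}s {w["text"]} ({w["conf"]:.2f})')
```

Because every word carries a time code and a confidence score, a reviewer could jump directly to the flagged positions in the audio rather than proofreading the whole call.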

Video Subtitling   Fully automatic processing generally does not deliver subtitles of sufficiently high quality. However, when closely integrated into the subtitle creation process, Vocapia's speaker diarization, speech-to-text transcription and speech-text alignment technologies significantly reduce the effort involved.
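
To give an idea of how time-coded words can feed a subtitle workflow, here is a minimal sketch that groups words into draft cues and writes them in the widely used SubRip (SRT) layout. The input tuples, the function names and the `max_chars` and `max_gap` limits are assumptions chosen for the example; they do not describe how VoxSigma itself segments subtitles.

```python
def format_timestamp(seconds):
    """Format a time in seconds as an SRT timestamp, HH:MM:SS,mmm."""
    millis = int(round(seconds * 1000))
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def words_to_cues(words, max_chars=37, max_gap=0.8):
    """Group (text, start, end) words into draft subtitle cues.

    A new cue starts when adding a word would exceed max_chars, or when
    the pause before the word is longer than max_gap seconds.
    """
    cues = []
    current = None
    for text, start, end in words:
        if current is not None:
            too_long = len(current["text"]) + 1 + len(text) > max_chars
            long_pause = start - current["end"] > max_gap
            if too_long or long_pause:
                cues.append(current)
                current = None
        if current is None:
            current = {"text": text, "start": start, "end": end}
        else:
            current["text"] += " " + text
            current["end"] = end
    if current is not None:
        cues.append(current)
    return cues


def to_srt(cues):
    """Render cues as numbered SRT blocks separated by blank lines."""
    blocks = []
    for index, cue in enumerate(cues, start=1):
        blocks.append(
            f"{index}\n"
            f"{format_timestamp(cue['start'])} --> {format_timestamp(cue['end'])}\n"
            f"{cue['text']}\n"
        )
    return "\n".join(blocks)


if __name__ == "__main__":
    timed_words = [("hello", 0.0, 0.4), ("world", 0.5, 0.9), ("again", 2.5, 3.0)]
    print(to_srt(words_to_cues(timed_words)))
```

Cues produced this way are only a starting draft: line breaks, reading speed and transcription errors would still be checked by a subtitler, in line with the point that automatic output reduces the effort rather than replacing it.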

Discover More...

The VoxSigma software suite is the latest generation of transcription software offered by Vocapia Research, building upon accurate statistical modeling techniques for speech production and perception. The VoxSigma software suite is offered as a stand-alone solution and as a Web service.

 
Thursday May 23, 2013
 
© Vocapia Research SAS, 2006-2013. All rights reserved.