More about us: Vocapia Research develops leading-edge, multilingual speech processing
technologies exploiting AI methods such as machine learning. These technologies enable
unlimited vocabulary speech recognition, automatic audio segmentation, language
identification, speaker diarization and audio-text synchronization. Vocapia's VoxSigma™
speech-to-text software suite delivers state-of-the-art performance in over 30 languages
and dialects for a variety of audio data types, including broadcast data, parliamentary
hearings, conference calls, or phone conversations.
                    
The VoxSigma™ software suite is available for on-site licensing and as a web service.
It is designed for professional users who need to process large quantities of audio and
video documents, and it supports multichannel and multilingual documents. We offer
customization services to tailor our solutions to the most demanding use cases.
                    
Speech recognition, also called speech-to-text or voice-to-text conversion, is the key
technology for enabling content-based information access in audio and video documents.
Once a document has been automatically processed, the linguistic information and metadata
in the structured document are available for downstream processing, providing direct
access to relevant portions of audio documents. Among the most common applications of our
technology are audio and audiovisual data mining (broadcast and telephone data), speech
analytics, media monitoring, media asset management, speech transcription and subtitling.
                    
We provide solutions and expertise for core speech processing technologies in many
languages. For example, speech-to-text transcription is available for Arabic, Cantonese,
Czech, Dutch, English, Finnish, French, German, Greek, Hebrew, Hindi, Hungarian, Italian,
Latvian, Lithuanian, Mandarin, Pashto, Persian, Polish, Portuguese, Romanian, Russian,
Spanish, Swahili, Swedish, Turkish, Ukrainian and Urdu, with several others under
development. Our language identification module identifies the spoken language from a set
of 100 languages and dialects, and clients can create models for their desired language set.
                    
We offer services to adapt, tune or create specific models or systems tailored to your
needs. Tailoring models to your application is the best way to get the best possible
results, and high accuracy is essential to maximizing your ROI.
In addition to our online speech recognition service, we offer services for batch
processing of very large quantities of data, such as archives.