Vocapia: leading-edge speech processing technology

Online Speech to Text

VoxSigma is available as a web service via our REST speech-to-text API. The VoxSigma SaaS offers full speech transcription, audio indexing and speech-text alignment capabilities via a REST API over HTTPS. Customers quickly benefit from regular improvements to the technology and from additional features offered by the online environment, such as daily updates of language models. The VoxSigma SaaS is available 24/7/365 with failover servers and geographic redundancy.

The VoxSigma SaaS offers three main processing functions: the identification of the language spoken in an audio document, the conversion of recorded speech input to text, and the synchronization of a transcription with the speech signal (also called speech-text alignment). It handles content in many European languages as well as other languages, including Mandarin and Arabic.

You can integrate our speech-to-text technology today [VoxSigma request form].

VoxSigma SaaS Features

  • Protocol: REST API over HTTPS;
    POST, GET and PUT HTTP methods are accepted;
    Both URI-encoded requests and MIME multipart requests are supported;
    Three submission modes: file, streaming, and real-time.
  • Availability: Service available 24/7/365 with failover servers and geographic redundancy
  • Supported functions: speech-to-text transcription, language identification, speech-text synchronization
  • Supported languages: Arabic, Cantonese, Czech, Dutch, English, Finnish, French, German, Greek, Hebrew, Hindi, Hungarian, Italian, Latvian, Lithuanian, Mandarin, Pashto, Persian, Polish, Portuguese, Romanian, Russian, Spanish, Swahili, Swedish, Turkish, Ukrainian and Urdu (more under development)
  • Special features: on-the-fly language model adaptation, daily updates of language models for broadcast data
  • Audio input: AAC, AIFF, ASF, FLAC, MS-Wave, MPEG, Ogg/Vorbis, NIST SPHERE, Sun AU
  • Output: XML data with speaker diarization, language identification tags, word transcription, punctuation, confidence measures, numerical entities and other specific entities
  • Special needs
    • Batch processing offered as an online or offline service to process archives [request form]
    • Model customization is offered on demand to ensure you get the best possible results for your needs [contact form]
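
To illustrate the two request styles listed under Protocol (URI-encoded and MIME multipart), here is a minimal Python sketch that only builds the HTTP requests and does not send them. The endpoint URL, the parameter names (`method`, `model`, `audiofile`) and their values are hypothetical placeholders chosen for illustration, not the documented VoxSigma API; the actual names and credentials come with the service documentation.

```python
import uuid
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical endpoint: replace with the URL supplied with your account.
BASE_URL = "https://stt.example.invalid/api"


def build_uri_encoded_request(base_url, params):
    """Build a GET request whose parameters are URI-encoded in the query string."""
    return Request(f"{base_url}?{urlencode(params)}", method="GET")


def build_multipart_request(base_url, fields, file_field, filename, payload):
    """Build a POST request with a multipart/form-data body carrying an audio file."""
    boundary = uuid.uuid4().hex
    parts = []
    # Plain text form fields (parameter names are assumptions, see lead-in).
    for name, value in fields.items():
        parts.append(
            f'--{boundary}\r\n'
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
            f'{value}\r\n'.encode("utf-8")
        )
    # The audio file itself, sent as raw bytes.
    parts.append(
        f'--{boundary}\r\n'
        f'Content-Disposition: form-data; name="{file_field}"; filename="{filename}"\r\n'
        f'Content-Type: application/octet-stream\r\n\r\n'.encode("utf-8")
        + payload
        + b"\r\n"
    )
    # Closing boundary marks the end of the multipart body.
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))
    headers = {"Content-Type": f"multipart/form-data; boundary={boundary}"}
    return Request(base_url, data=b"".join(parts), headers=headers, method="POST")


# Example: the same hypothetical transcription request in both styles.
uri_request = build_uri_encoded_request(BASE_URL, {"method": "transcribe", "model": "eng"})
multipart_request = build_multipart_request(
    BASE_URL,
    {"method": "transcribe", "model": "eng"},
    file_field="audiofile",
    filename="sample.wav",
    payload=b"RIFF....WAVE",  # placeholder bytes standing in for real audio
)
```

Either request object could then be passed to `urllib.request.urlopen` together with whatever authentication the service requires. The multipart form is the natural choice when the audio is uploaded with the request, while the URI-encoded form suits requests that carry only parameters.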
SaaS Status

Pricing

  • We offer various usage plans, including daily, monthly and batch plans.
  • For our generic systems and large quantities, the price is on the order of 0.01 euro (or $0.01) per minute.
  • Note that our pricing is based on speech duration, i.e. silences are not counted and there is no minimum cost per submission.
  • We offer free trials upon request.
  • We no longer offer a pay-as-you-go usage plan. If your data processing needs are relatively low or irregular, or if you need to process video data or want to manually adapt or correct the automatic transcriptions, please check out our partner's service Yobiyoba. This pay-as-you-go service also offers many export formats such as XML, CSV, SRT, SBV, RTF, VTT, PDF, DOC and DOCX.
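
As a rough illustration of duration-based pricing, the sketch below estimates a cost from the speech duration of each file (silences excluded), using the indicative figure of 0.01 per minute quoted above. The function name, the rounding to two decimals and the sample durations are assumptions made for this example; actual rates depend on the plan and volume.

```python
from decimal import Decimal, ROUND_HALF_UP

# Indicative order of magnitude quoted above, not a contractual rate.
RATE_PER_MINUTE = Decimal("0.01")


def estimate_cost(speech_seconds_per_file, rate_per_minute=RATE_PER_MINUTE):
    """Estimate a cost from speech durations in seconds (silences excluded).

    There is no per-submission minimum, so the durations are simply summed.
    Rounding to two decimals is an assumption made for display purposes.
    """
    total_seconds = sum(Decimal(str(s)) for s in speech_seconds_per_file)
    cost = total_seconds / Decimal(60) * rate_per_minute
    return cost.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


# Three files containing 450 s, 1200 s and 30 s of speech: 1680 s = 28 minutes.
small_batch = estimate_cost([450, 1200, 30])

# 100 hours of speech: 360000 s = 6000 minutes.
large_batch = estimate_cost([360000])
```

Here `small_batch` comes to 0.28 and `large_batch` to 60.00 in the chosen currency. A 10-minute recording that contains only 7.5 minutes of speech would be counted as 450 seconds, not 600.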

Support

We provide hotline support (via email and phone) for our products and services to help users and system integrators solve problems in the shortest possible timeframe [support form].
 
Monday December 02, 2024

© Vocapia Research SAS, 2006-2024. All rights reserved.
