Yes, but the speech recognition accuracy
varies greatly depending upon a large number of
factors, including the type of speech (from prepared to spontaneous
speech and conversational speech) and the noise level. So you can
expect very good results when transcribing the speech of an anchor
speaker in a TV or radio news show, but much poorer results for
the speech of someone engaged in a very casual conversation.

Yes, the output of the VoxSigma software is an XML file that can be
easily converted into plain punctuated text by discarding additional
information such as word time-codes and word confidence scores.
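
As a rough illustration of that conversion, here is a minimal Python sketch. The element name (Word) and the attribute names (stime, dur, conf) in the sample are assumptions made for this example, not a statement of the actual VoxSigma schema, so adjust them to match the files you receive. The function keeps only the word text, drops the timing and confidence attributes, and attaches punctuation marks to the preceding word.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample output: element and attribute names are assumed
# for illustration and may differ from real VoxSigma XML files.
SAMPLE = """<AudioDoc>
  <SegmentList>
    <SpeechSegment stime="0.00" etime="1.15">
      <Word stime="0.00" dur="0.30" conf="0.98"> Good </Word>
      <Word stime="0.30" dur="0.40" conf="0.95"> evening </Word>
      <Word stime="0.70" dur="0.05" conf="1.00"> , </Word>
      <Word stime="0.75" dur="0.35" conf="0.91"> everyone </Word>
      <Word stime="1.10" dur="0.05" conf="1.00"> . </Word>
    </SpeechSegment>
  </SegmentList>
</AudioDoc>"""

# Punctuation tokens that should not be preceded by a space.
NO_SPACE_BEFORE = {",", ".", "?", "!", ";", ":"}


def xml_to_text(xml_string, word_tag="Word"):
    """Return plain punctuated text, discarding time-codes and confidences."""
    root = ET.fromstring(xml_string)
    pieces = []
    for word in root.iter(word_tag):
        token = (word.text or "").strip()
        if not token:
            continue  # skip empty word elements
        if pieces and token not in NO_SPACE_BEFORE:
            pieces.append(" ")
        pieces.append(token)
    return "".join(pieces)


print(xml_to_text(SAMPLE))
```

With the sample above this prints "Good evening, everyone." For a file on disk, read its contents and pass them to xml_to_text; the word_tag parameter is there in case the word element is named differently in your files.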

It depends greatly on the language resources available
for the specific language. It also depends on the type of
speech data you want to process. We support many languages, including
Arabic, Cantonese, Czech, Dutch, English, Finnish, French, German, Greek, Hebrew, Hindi, Hungarian,
Italian, Latvian, Lithuanian, Mandarin, Pashto, Persian, Polish, Portuguese,
Romanian, Russian, Spanish, Swahili, Swedish,
Turkish, Ukrainian and Urdu. Contact us
to get a more precise answer
for the languages you are interested in.

Vocapia Research LVCSR systems come
with fully trained language models, so the only information you have
to provide to the system is the language being spoken.
If the language is not known, it can be identified
automatically (among 20 known languages) using the VoxSigma language recognition
software. A language identification system identifies the language
being spoken from the speech signal.