Vocapia at Microsoft challenge on code-switched spoken language identification

Orsay - November 5, 2020

As an R&D company, Vocapia Research places high importance not only on following, understanding and actively contributing to the latest developments in the speech processing field, but also on networking and communicating with the international community. One of our recent endeavors was our participation in the first challenge and the ensuing First Workshop on Speech Technologies for Code-switching in Multilingual Communities 2020, held (virtually) just after the annual ISCA Interspeech conference (also held virtually) in Shanghai.

Code-switching in speech, the process of switching from one language to another in the same conversational sequence, is common in bilingual and multilingual communities and has been receiving growing attention in the speech and language communities. The language switch can be just the incorporation of words or short phrases from another language, or it can occur at the level of a speaker turn or an even larger unit. The video of our workshop presentation describes our participation in the shared task challenge organized by Microsoft Research on language identification (LID) of code-switched speech in three language pairs: Gujarati-English, Telugu-English and Tamil-English. Vocapia, together with LIMSI-CNRS, has been developing and participating in evaluations of LID technologies for more than a decade. We have extensively explored a variety of approaches, including phonotactic and acoustic ones, e.g. [Odyssey 2016] and [InterSpeech 2016]. LIMSI also addressed Code-Switching in French/Algerian Arabic Speech [Interspeech 2017].

Our submissions to the Microsoft challenge were ranked first in both LID tasks: the first (task A) detecting whether a given utterance was monolingual or contained code-switching; and the second (task B) producing a frame-level language labeling of code-switched speech.

About Vocapia Research

Vocapia Research is a French R&D company and software publisher with over 20 years of experience in providing leading-edge speech technologies for many languages, including most major European languages as well as Arabic, Mandarin, and Russian. The Vocapia Research VoxSigma® software suite uses advanced language technologies such as language identification, speech recognition, and speaker identification to transform raw audio and audiovisual data into structured and searchable XML documents. This technology relies on decades of research at LISN, with which Vocapia Research has a privileged partnership. Joint systems developed with LISN have achieved top ranks in national and international challenges on speech-to-text transcription. Located at the heart of the science innovation cluster of Paris Saclay, France, Vocapia Research is a leader in developing and adapting AI-based solutions for both civil and defence applications. These applications include audio and audiovisual data mining (broadcast and web data, telephone speech), production of subtitles, OSINT and COMINT, and the analysis of aeronautical communications (air traffic control, voice command). Readers who wish to get more information about Vocapia Research are invited to check out the Vocapia Research website or use the contact information page http://www.vocapia.com/contact.