Orsay - October 31, 2018

Vocapia ranked 1st in the Airbus ATC Challenge

Airbus, in collaboration with IRIT and Safety Data-CFH, held a challenge to assess the current state of the art in automatic speech recognition and call sign detection in English Air Traffic Control (ATC) communications. The challenge provided participants with annotated training data and a leaderboard to assess the performance of their systems on a held-out set of development data. ATC communications are challenging for today's technology because the audio is contaminated with various types of noise, and the speakers have a wide range of native and non-native accents. The speech is generally in English, but may be in the language of the country (French, in the case of this challenge) and may contain code-switching with English. ATC communications are generally spoken at a fast rate and make use of domain-specific grammar and vocabulary. There are many potential uses of speech technology in the domain of air traffic communications to improve safety and training.

The joint submission of Vocapia Research and the Spoken Language Processing Group at LIMSI-CNRS to the Airbus ATC Challenge 2018 ranked first in both the speech recognition and call sign detection tasks. The speech-to-text transcription technology used for the challenge builds on work carried out over the last 20 years, and includes deep neural networks for both the acoustic and language models. Compared with more general transcription tasks, Air Traffic Control communications are at once simpler and more complicated: the language is nominally more constrained, with a more or less controlled vocabulary and syntax, but the environmental conditions can be quite challenging, with various noises and transmission dropouts. The call sign detection task requires locating a flight identifier in the automatic transcription. The call sign may be complete, adhering to the full structure (airline code, followed by 3-5 digits and optionally 1 or 2 letters), or partial. The exchanges between the pilot and the controller occur in a known context, which makes it easier for humans to understand partial call signs. However, this contextual information was not available to the automatic systems, which made the call sign detection task harder.
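To make the full call sign structure concrete, here is a minimal Python sketch that searches a transcript for an airline name followed by 3 to 5 spoken digits and up to 2 spoken letters. It is only an illustration and not the method used in the challenge submission. The word lists, the pattern FULL_CALL_SIGN and the function find_full_call_signs are hypothetical names introduced here, and the sketch assumes that the transcript spells out digits and letters as words. Partial call signs, which the task also covers, are not handled.

```python
import re

# Hypothetical, deliberately tiny vocabularies (illustration only).
AIRLINES = ["air france", "lufthansa", "speedbird"]
DIGITS = ["zero", "one", "two", "three", "four", "five",
          "six", "seven", "eight", "nine", "niner"]
LETTERS = ["alpha", "bravo", "charlie", "delta", "echo", "foxtrot"]


def _alt(words):
    """Build a regex alternation, longest entries first (so 'niner' beats 'nine')."""
    return "|".join(re.escape(w) for w in sorted(words, key=len, reverse=True))


# Airline name, then 3-5 digit words, then 0-2 letter words.
FULL_CALL_SIGN = re.compile(
    rf"\b(?P<airline>{_alt(AIRLINES)})"
    rf"(?P<digits>(?:\s+(?:{_alt(DIGITS)})){{3,5}})"
    rf"(?P<letters>(?:\s+(?:{_alt(LETTERS)})){{0,2}})\b"
)


def find_full_call_signs(transcript):
    """Return (start, end, text) for each full call sign found in the transcript."""
    text = transcript.lower()
    return [(m.start(), m.end(), m.group(0)) for m in FULL_CALL_SIGN.finditer(text)]
```

A call such as find_full_call_signs("Air France one two three four alpha bravo contact tower") returns a single span covering the words from "air france" to "bravo". A rule-based matcher of this kind depends entirely on the transcription being correct, which is one reason noisy audio and partial call signs make the task difficult.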

About Vocapia Research

Vocapia Research, founded in July 2000, is an R&D company and software publisher developing and providing leading-edge speech technologies and solutions for many languages, including most major European Union languages as well as Arabic, Mandarin, and Russian. The Vocapia Research VoxSigma® software suite uses advanced language technologies such as language identification, speech recognition, and speaker identification to transform raw audio and audiovisual data into structured and searchable XML documents. This technology relies on over 25 years of research at LIMSI-CNRS, with which Vocapia Research has a privileged partnership. Joint systems developed with LIMSI have achieved top ranks in national and international speech-to-text transcription challenges. The most common applications of the VoxSigma software suite are audio and audiovisual data mining (broadcast data, podcasts, call center data), media monitoring, and media asset management. Vocapia Research is located in the scientific cluster of the Saclay Plateau, France. For more information about Vocapia Research, visit the Vocapia Research website or use the contact information page http://www.vocapia.com/contact.


© Vocapia Research SAS,
2006-2020. All rights reserved.