- Could you please start by telling us a bit about IARPA and the origin of this Challenge?
Building automatic speech recognition (ASR) that performs well on natural conversational speech across a variety of acoustic environments and recording scenarios is one of the biggest challenges in speech recognition research and development. Previous work has shown that ASR performance degrades on microphone recordings, especially when the training data is mismatched with the test data. The ASpIRE (Automatic Speech recognition In Reverberant Environments) Challenge seeks to foster the development of innovative speech recognition systems that can be trained on conversational telephone speech, yet work well on far-field microphone data from noisy, reverberant rooms. Challenge “Solvers” are given access to sample data against which they can test their algorithms; this data differs from the test set but is a good representation of microphone recordings in real rooms. Solvers will also have the opportunity to evaluate their techniques on a common and challenging test set that includes significant room noise and reverberation. With ASpIRE, IARPA is continuing to address its mission to promote high-risk, high-payoff research that has the potential to enhance the performance of Intelligence Community (IC) activities. IARPA’s use of a challenge to stimulate breakthroughs in science and technology also supports the White House’s Strategy for American Innovation, as well as government transparency and efficiency.
- What are you hoping to achieve with this Challenge and can you describe the impact of a successful solution?
The purpose of this challenge is to gauge how far recent advances in speech recognition have come in handling noisy, reverberant recording conditions, and to drive further creative innovation. With broad participation, the challenge has the potential to give IARPA insight into the best next steps for stimulating research on this problem.
- What was your motivation for crowdsourcing this Challenge?
We are crowdsourcing the challenge to invite the broadest possible community of innovators to demonstrate their technical insights and ingenuity in addressing automatic speech recognition in reverberant environments, and in doing so to identify the leading systems and Solvers.
- What are the key attributes you’d like to see (or not see) in a winning solution?
We encourage Solvers to try a wide variety of new methods that have yet to be tested on challenging data, not just standard techniques. This challenge offers the opportunity to test new and emerging ideas and to compare them, fairly and convincingly, against the approaches of a wide variety of other Solvers.
- Any final advice or guidance for our Solvers as they tackle this challenge?
We encourage Solvers to take advantage of the development and development-test data to gauge progress, but not to over-tune to this data, because the evaluation test set recordings differ significantly from the development data. The evaluation data will test the robustness of your solutions.