Boffins at the University of Illinois Urbana-Champaign (UIUC) in the US are working with the usual internet super-corps to, ideally, improve AI voice recognition for people with disabilities.
Speech recognition software often struggles with speech from people with heavy accents, and performs even worse for people with speech disabilities, since their voices are usually poorly represented, or absent altogether, in training datasets.
The Speech Accessibility Project, launched on Monday and supported by Amazon, Apple, Google, Meta, and Microsoft, as well as nonprofit organizations, aims to make speech recognition models more effective for everyone.
“For many of us, speech and communication are effortless,” Clarion Mendes, a clinical professor in speech and hearing science at UIUC working on the project, told The Register.
“However, there are millions of people for whom communication is not effortless. It’s a daily struggle. By unifying our efforts toward a common goal of improving speech accessibility for individuals with speech disabilities or differences, we’re not just improving technology – we’re improving quality of life and promoting independence.”
Researchers will focus on obtaining diverse audio data from English speakers affected by various medical conditions that impact speech, such as amyotrophic lateral sclerosis (ALS, also known as Lou Gehrig’s disease), Parkinson’s, cerebral palsy, and Down syndrome. Volunteers will be paid to record audio samples, which will be used to create a large dataset to train AI models for commercial and research applications.
If there are, or have been, projects similar to this effort, that’s great, though this one stands out for its support from those making today’s AI voice assistants and the like.
Industry partners are funding the Speech Accessibility Project for at least two years, and will work with academics to figure out how current speech recognition models can be improved.
“Through working directly with individuals with speech differences and disabilities, via focus groups and our advocacy partners, we’ll be equipped to determine the strengths and limitations of current automatic speech recognition systems and the need for developing novel systems,” Mendes said.
The team will work with two nonprofits, the Davis Phinney Foundation and Team Gleason, to gather speech data from people with ALS and Parkinson’s disease first, before expanding to support other types of disabilities.
“The option to communicate and operate devices with speech is crucial for anyone interacting with technology or the digital economy today. Speech interfaces should be available to everybody, and that includes people with disabilities,” said Mark Hasegawa-Johnson, the UIUC professor of electrical and computer engineering leading the project.
“This task has been difficult because it requires a lot of infrastructure, ideally the kind that can be supported by leading technology companies, so we’ve created a uniquely interdisciplinary team with expertise in linguistics, speech, AI, security, and privacy to help us meet this important challenge.” ®