In a remarkable application of modern technology, scientists at the University of California have created a device that can decode signals from the brain’s speech centers to produce speech through a synthesizer. The device was featured in a paper entitled “Speech synthesis from neural decoding of spoken sentences”, which was published yesterday in the journal Nature.
For most people, speech is their main form of communication and a way of expressing how they feel. Losing the ability to speak through trauma or disease, such as stroke, Parkinson’s disease, throat cancer, or motor neuron disease, can be devastating for both patients and their families.
Although devices have been developed to help people recreate speech by tracking eye or facial muscle movements, the process is laborious for users and the synthesized speech is extremely slow. Such devices typically produce up to ten words a minute compared to the 150 words a minute possible with natural speech.
Turning brain signals into speech
Normally, our thoughts are converted by the brain into movements of the lips, jaw, tongue, and larynx to enable us to produce the required words. Now, scientists have used these brain signals to create a computer-simulated vocal tract that can generate speech through a synthesizer.
The technology was tested in five volunteers with epilepsy who were able to speak normally. The volunteers were all due to have electrodes temporarily implanted in their brains to map the source of their seizures before undergoing corrective neurosurgery. This meant that the experimental speech synthesis device could be tested without the need for an additional invasive procedure.
Activity in the regions of the brain involved in language production was tracked as the volunteers read hundreds of sentences aloud. The recorded signals were then used to produce a “virtual vocal tract” for each participant that simulated the movements in the mouth and throat needed to form different sounds. The virtual vocal tract was then driven by the volunteer’s brain activity and, in turn, instructed a synthesizer to generate speech.
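The two-step idea described above (brain activity first drives a simulated vocal tract, which then drives a synthesizer) can be pictured with a toy sketch. Everything in it is an assumption made for illustration: the function names, the tiny weight tables, and the use of simple linear maps are hypothetical, and the article does not describe the models or signal dimensions the researchers actually used.

```python
# Toy illustration only: the study's actual models, features, and
# dimensions are not described in this article.

def linear_map(weights, vector):
    """Multiply a matrix (given as a list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in weights]

# Hypothetical stage 1: 3 neural channels -> 2 vocal tract positions
NEURAL_TO_ARTICULATORS = [
    [0.5, 0.0, 0.5],
    [0.0, 1.0, 0.0],
]

# Hypothetical stage 2: 2 vocal tract positions -> 2 acoustic features
ARTICULATORS_TO_ACOUSTICS = [
    [2.0, 0.0],
    [1.0, 1.0],
]

def decode_two_stage(neural_frame):
    """Decode one frame of neural activity via the intermediate vocal tract."""
    articulators = linear_map(NEURAL_TO_ARTICULATORS, neural_frame)
    acoustics = linear_map(ARTICULATORS_TO_ACOUSTICS, articulators)
    return articulators, acoustics

# One made-up frame of recorded activity from three electrodes
frame = [1.0, 2.0, 3.0]
articulators, acoustics = decode_two_stage(frame)
print("vocal tract positions:", articulators)
print("acoustic features for the synthesizer:", acoustics)
```

The point of the sketch is only the structure: the synthesizer never sees the brain signals directly, just the simulated movements of the vocal tract.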
The researchers were ‘impressed’
The resultant speech was mostly intelligible, with listeners being able to discern what was being said up to 70 percent of the time. The speech was slurred in parts and some sounds were not pronounced correctly, but the scientists are confident that they can improve on this by optimizing the algorithms they used.
The technology did not work as well when the researchers tried to decode the brain activity directly into speech, without using a virtual vocal tract.
Co-author of the research Josh Chartier commented: “Clearly, there is more work to get this to be more natural and intelligible but we were very impressed by how much can be decoded from brain activity.”
“We hope that these findings give hope to people with conditions that prevent them from expressing themselves that one day we will be able to restore the ability to communicate, which is such a fundamental part of who we are as humans.”