Scientists turn brain signals into speech
Method synthesizes a person's speech using brain signals related to movements of the vocal tract
People robbed of the ability to talk due to a stroke or another medical condition may soon have real hope of regaining a voice thanks to technology that harnesses brain activity to produce synthesized speech, researchers said on Wednesday.
Scientists at the University of California, San Francisco, implanted electrodes into the brains of volunteers and decoded signals in cerebral speech centres to guide a computer-simulated version of their vocal tract — lips, jaw, tongue and larynx — to generate speech through a synthesizer.
This speech was mostly intelligible, though somewhat slurred in parts, raising hope among the researchers that, with some improvements, a clinically viable device could be developed in the coming years for patients with speech loss.
"We were shocked when we first heard the results — we couldn't believe our ears. It was incredibly exciting that a lot of the aspects of real speech were present in the output from the synthesizer," said study co-author and UCSF doctoral student Josh Chartier. "Clearly, there is more work to get this to be more natural and intelligible but we were very impressed by how much can be decoded from brain activity."
Strokes, ailments such as cerebral palsy, amyotrophic lateral sclerosis (ALS), Parkinson's disease and multiple sclerosis, brain injuries and cancer can all take away a person's ability to speak.
Some people use devices that track eye or residual facial muscle movements to laboriously spell out words letter by letter, but producing text or synthesized speech this way is slow, typically no more than 10 words per minute. Natural speech is usually 100 to 150 words per minute.
Virtual vocal tract created
The five volunteers, all capable of speaking, were given the opportunity to take part because they were epilepsy patients who already were going to have electrodes temporarily implanted in their brains to map the source of their seizures before neurosurgery. Future studies will test the technology on people who are unable to speak.
The volunteers read aloud while activity in brain regions involved in language production was tracked. The researchers discerned the vocal tract movements needed to produce the speech and created a "virtual vocal tract" for each participant that could be controlled by their brain activity and produce synthesized speech.
"Very few of us have any real idea, actually, of what's going on in our mouth when we speak," said neurosurgeon Edward Chang, senior author of the study published in the journal Nature. "The brain translates those thoughts of what you want to say into movements of the vocal tract, and that's what we're trying to decode."
The researchers were more successful in synthesizing slower speech sounds like "sh" and less successful with abrupt sounds like "b" and "p." The technology did not work as well when the researchers tried to decode the brain activity directly into speech without using a virtual vocal tract.
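The two-stage approach described above, in which brain activity is first decoded into vocal-tract movements and those movements are then converted into sound, can be sketched in miniature. The study itself used recurrent neural networks; the version below is a hypothetical illustration with simple linear maps standing in for the trained decoders, and all dimensions (channel counts, feature sizes) are assumptions, not figures from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

N_NEURAL = 64      # electrode channels (assumed for illustration)
N_KINEMATIC = 12   # articulator positions: lips, jaw, tongue, larynx (assumed)
N_ACOUSTIC = 32    # spectral features driving a speech synthesizer (assumed)

# Stage 1: neural activity -> vocal-tract kinematics (placeholder weights;
# in the study this mapping was learned from recordings of actual speech).
W_neural_to_kin = rng.standard_normal((N_KINEMATIC, N_NEURAL)) * 0.1

# Stage 2: kinematics -> acoustic features (placeholder weights).
W_kin_to_acoustic = rng.standard_normal((N_ACOUSTIC, N_KINEMATIC)) * 0.1

def decode(neural_frames: np.ndarray) -> np.ndarray:
    """Map a (time, channels) array of neural features to acoustic features
    via the intermediate vocal-tract representation."""
    kinematics = neural_frames @ W_neural_to_kin.T      # stage 1
    acoustics = kinematics @ W_kin_to_acoustic.T        # stage 2
    return acoustics

# One second of simulated neural feature frames at 100 Hz.
frames = rng.standard_normal((100, N_NEURAL))
acoustic_out = decode(frames)
print(acoustic_out.shape)  # (100, 32)
```

The point of the intermediate kinematic stage is exactly what the article reports: decoding brain activity directly into sound worked less well than routing it through the simulated vocal tract first.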
Many challenges remain
"We are still working on making the synthesized speech crisper and less slurred. This is in part a consequence of the algorithms we are using, and we think we should be able to get better results as we improve the technology," Chartier said.
"We hope that these findings give hope to people with conditions that prevent them from expressing themselves; that one day we will be able to restore the ability to communicate, which is such a fundamental part of who we are as humans," he added.
The results offer early proof for synthesizing speech with a brain-computer interface (BCI), both in terms of the accuracy of the audio reconstructions and how well listeners could classify the words and sentences produced, Chethan Pandarinath and Yahia Ali said in an accompanying journal commentary.
"However, many challenges remain on the path to a clinically viable speech BCI. The intelligibility of the reconstructed speech was still much lower than that of natural speech," Pandarinath and Ali wrote.
"These compelling proof-of-concept demonstrations of speech synthesis in individuals who cannot speak, combined with the rapid progress of BCIs in people with upper-limb paralysis, argue that clinical studies involving people with speech impairments should be strongly considered," they wrote.
They added that, with continued progress, they hope individuals with speech impairments will regain the ability to freely speak their minds and reconnect with the world around them.
Dr. Thomas Oxley, director of innovation strategy with the Mount Sinai Hospital Health System's department of neurosurgery in New York City, has some concerns. The ability of AI to read a person's brain "raises significant ethical issues around privacy and security that research leaders need to be cognizant of," he said.
Still, the breakthrough "is a further, important step along the pathway of decoding the brain patterns that underlie thinking," Oxley stressed.
With files from CBC News and HealthDay News