Brain-to-speech tech allowed paralysed woman to communicate in her own voice

Neuroscientists in the US have created a device capable of decoding brain activity and translating it into speech quickly enough that a person who cannot speak could come close to holding a conversation.

They’ve tested the device in a clinical trial with a 47-year-old woman with quadriplegia who lost the ability to speak after a stroke almost two decades ago.

Brain–computer interfaces that decode brain activity and turn it into voice sounds already exist, but they suffer from a substantial delay, and the lag – typically a few seconds – makes holding conversations difficult.

The new deep learning model instead streamed speech continuously in 80-millisecond increments.

The researchers think their technology could one day help patients with speech paralysis communicate more seamlessly.

“Natural spoken communication happens instantaneously,” they wrote in the study published in the journal Nature Neuroscience.

“Speech delays longer than a few seconds can disrupt the natural flow of conversation. This makes it difficult for individuals with paralysis to participate in meaningful dialogue, potentially leading to feelings of isolation and frustration.

“Hence, a practical speech neuroprosthesis must continuously synthesise speech from neural data in tandem with the user’s attempt to speak.”

The researchers, from the University of California (UC) Berkeley and UC San Francisco, worked with a participant who could not speak or vocalise intelligible speech due to severe paralysis caused by a brainstem stroke 18 years earlier.

They implanted an array of electrodes over the surface of the woman’s speech sensorimotor cortex in the brain, which recorded her neural activity while she silently attempted to say complete sentences constructed from a vocabulary of 1,024 words.

“In other words, the participant attempted to ‘mime’ or ‘mouth’ the target sentence without making any vocal sounds,” the authors write.

Examples of these simple sentences include “Would you like that”, “What does she want”, and “Where did you get this”.

“The participant was presented with a text prompt on a monitor and was asked to begin silently attempting to speak once a visual ‘GO’ cue turned green.”

This neural data and the corresponding intended sentence were then used to train a deep learning neural network to decode new neural data into sentences in real time.
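In outline, the approach decodes neural data window by window rather than waiting for a whole sentence to finish. The sketch below illustrates that streaming idea only; the function names, channel count and the toy stand-in for the trained network are hypothetical, not the study's actual architecture.

```python
import numpy as np

WINDOW_MS = 80      # decoding granularity reported in the study
N_CHANNELS = 64     # hypothetical electrode-array channel count

def decode_window(features, state):
    """Toy stand-in for the trained neural network: maps one 80 ms
    window of neural features to an output fragment, carrying
    running state between windows."""
    state.append(float(features.mean()))
    return "~" if features.mean() > 0 else ""

def stream_decode(neural_stream):
    """Consume neural data one window at a time, emitting output
    incrementally instead of after the full sentence."""
    state, transcript = [], []
    for window in neural_stream:   # one window arrives every 80 ms
        transcript.append(decode_window(window, state))
    return "".join(transcript)

# Simulated 2 s of neural data: 25 windows of 80 ms each
rng = np.random.default_rng(0)
stream = [rng.standard_normal(N_CHANNELS) for _ in range(25)]
output = stream_decode(stream)
```

The key design point is latency: because each window is decoded as it arrives, output begins within one window of the user's attempt to speak rather than seconds later.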

“Offline, we also showed that the decoder could operate continuously for several minutes,” the authors write.

“By applying our model to extended blocks of neural activity, we took initial steps toward enabling long-form speech synthesis suitable for daily needs.”

The researchers also used a short voice clip of the participant (recorded before she lost the ability to speak) to alter the default speech synthesiser to mimic her voice, which they say is a “highly desired feature for this participant and others”.

However, they caution the technology requires further improvement before the approach would be clinically viable.

“A limitation of this study is that, although the architecture was shown to be generalisable offline to other participants and datasets, the online demonstrations were conducted with only a single participant,” they write.

“The streaming speech synthesis performance was also lower than what has been demonstrated via text-decoding methods.

“A major effort for researchers will be to continue refining the approach, which will ultimately inform the development of a speech neuroprosthesis suitable for daily use by individuals who cannot speak.”
