New speed and accuracy records have been set for brain-computer interfaces (BCIs) that decode and translate brain activity into speech.
Such technologies are crucial in attempts to restore communication to people with severe paralysis.
Two papers published in Nature detail efforts by teams led by Dr Francis Willett at Stanford University and Dr Edward Chang from the University of California, San Francisco (UCSF) to produce faster and more accurate brain-computer interfaces.
One project inserted electrodes into the brain; the other placed them on its surface.
Willett and colleagues developed a BCI that collects the neural activity of single cells using fine electrodes inserted into the brain. An artificial neural network was trained to decode those brain signals into the intended vocalisations.
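The decoding step can be pictured, in a highly simplified form, as a classifier mapping spike-count features from the electrodes to intended speech units. The toy sketch below uses entirely synthetic data and a simple softmax classifier; the published system trains a recurrent neural network on real neural recordings, so every name and number here is illustrative only.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for the real task: each "trial" is a vector of spike counts
# from 8 hypothetical electrodes, and the label is one of 3 intended
# phonemes. The data are synthetic; the actual BCI decodes from
# recordings of single-cell activity during attempted speech.
n_electrodes, n_phonemes, n_trials = 8, 3, 300
true_w = rng.normal(size=(n_electrodes, n_phonemes))
X = rng.poisson(5.0, size=(n_trials, n_electrodes)).astype(float)
y = (X @ true_w).argmax(axis=1)  # synthetic "ground truth" labels

# Train a softmax (multinomial logistic) classifier by gradient descent.
W = np.zeros((n_electrodes, n_phonemes))
for _ in range(500):
    logits = X @ W
    p = np.exp(logits - logits.max(axis=1, keepdims=True))
    p /= p.sum(axis=1, keepdims=True)
    grad = X.T @ (p - np.eye(n_phonemes)[y]) / n_trials
    W -= 0.01 * grad

accuracy = (np.argmax(X @ W, axis=1) == y).mean()
print(f"training accuracy: {accuracy:.2f}")
```

The real decoders also model how sounds follow one another over time, which a trial-by-trial classifier like this ignores.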
The BCI enabled a participant, Pat – who can no longer speak clearly due to amyotrophic lateral sclerosis (ALS; a rare neurological disease that affects motor neurons) – to communicate at 62 words per minute, 3.4 times faster than the previous record for a similar device. Natural conversation is about 160 words per minute.
Willett’s team’s BCI achieved a 9.1% word error rate on a 50-word vocabulary – 2.7 times fewer errors than the previous state-of-the-art speech BCI developed in 2021. A 125,000-word vocabulary saw a 23.8% word error rate.
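Word error rate, the metric quoted throughout these results, is conventionally scored as the word-level edit distance between what the system produced and what was intended, divided by the length of the intended sentence. A minimal sketch with invented example sentences:

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Substitutions + insertions + deletions at the word level,
    divided by the number of words in the reference sentence."""
    ref, hyp = reference.split(), hypothesis.split()
    # Dynamic-programming table for Levenshtein distance over words.
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,        # deletion
                          d[i][j - 1] + 1,        # insertion
                          d[i - 1][j - 1] + cost) # substitution
    return d[len(ref)][len(hyp)] / len(ref)

# One substituted word among six reference words → 1/6, about 17%.
print(word_error_rate("i want a drink of water",
                      "i want a glass of water"))
```

By this measure, a 9.1% word error rate means roughly one word in eleven is wrong.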
“People with neurological disorders such as brainstem stroke or ALS frequently face severe speech and motor impairment and, in some cases, complete loss of the ability to speak (locked-in syndrome),” Willett and his co-authors write.
“Our demonstration is a proof of concept that decoding attempted speaking movements with a large vocabulary is possible using neural spiking activity. However, it is important to note that it does not yet constitute a complete, clinically viable system.”
Chang’s team used non-penetrating electrodes that sit on the brain’s surface, detecting activity from many cells across regions covering the entire speech cortex. The BCI decodes these signals into three outputs simultaneously: text, audible speech, and a speaking avatar whose facial movements are also driven by the participant’s brain activity. The synthesised speech is personalised to sound like the participant’s pre-injury voice.
The study’s 47-year-old participant is a stroke survivor with quadriplegia and anarthria (the loss of the ability to speak).
A deep-learning model trained to decipher the neural data generated brain-to-text speeds of 78 words per minute, which is 4.3 times as fast as the previous record.
The BCI achieved a 4.9% word error rate on a 50-phrase set – 5 times fewer errors than the previous record. The word error rate was 25% when decoding sentences in real time with a vocabulary of more than 1,000 words, and offline simulations with a vocabulary of more than 39,000 words showed a 28% word error rate.
“Faster, more accurate, and more natural communication are among the most desired needs of people who have lost the ability to speak after severe paralysis,” Chang’s team write.
“A limitation of the present proof-of-concept study is that the results shown are from only one participant. An important next step is to validate these decoding approaches in other individuals with varying degrees and etiologies of paralysis.”