An artificial intelligence can decode words and sentences from brain activity with surprising, though still limited, accuracy. Using only a few seconds of brain activity data, the AI guesses what a person has heard. It lists the correct answer among its top 10 possibilities up to 73 percent of the time, researchers found in a preliminary study. Giovanni Di Liberto, a computer scientist at Trinity College Dublin who was not involved in the research, says that the AI’s “performance was above what many people felt was conceivable at this time.”
Developed by Meta, the parent company of Facebook, the AI could eventually be used to help thousands of people around the world who are unable to communicate through speech, typing or gestures, researchers report August 25 at arXiv.org. That includes many patients in minimally conscious, locked-in or “vegetative” states (SN: 2/8/19).
Most existing communication aids for these patients require risky brain surgery to implant electrodes. According to neuroscientist Jean-Rémi King, a Meta AI researcher currently at the École Normale Supérieure in Paris, this new approach “may give a viable avenue to aid patients with communication difficulties… without the use of invasive procedures.”
King and his colleagues trained a computational tool to detect words and sentences using 56,000 hours of speech recordings from 53 languages. The tool, known as a language model, learned to recognize specific features of language both at a fine-grained level (think letters or syllables) and at a broader level (think words or sentences).
The team then applied an AI with this language model to databases from four institutions that contained brain activity from 169 participants. In these databases, participants listened to various stories and sentences, including passages from Ernest Hemingway’s The Old Man and the Sea and Lewis Carroll’s Alice’s Adventures in Wonderland, while their brains were scanned using either magnetoencephalography or electroencephalography. Those techniques measure the magnetic or electrical component of brain signals, respectively.
Next, using a computational method that helps account for physical differences among individual brains, the team tried to decode what participants had heard from just three seconds of brain activity data from each person. The team instructed the AI to align the speech sounds from the story recordings with the patterns of brain activity that the AI computed as corresponding to what people were hearing. It then predicted what the person might have been hearing during that short time, choosing from more than 1,000 possibilities.
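To make the matching step and the "top 10" measure concrete, here is a minimal Python sketch. It is not the researchers' code: the article does not say how the AI scores candidates, so cosine similarity, the vector size, the noise level and every function name below are assumptions chosen for illustration, and the data are entirely synthetic.

```python
import math
import random


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def top_k_candidates(brain_vec, candidates, k=10):
    """Return the indices of the k candidates most similar to brain_vec."""
    scores = [cosine(brain_vec, c) for c in candidates]
    ranked = sorted(range(len(candidates)), key=lambda i: scores[i], reverse=True)
    return ranked[:k]


def top_k_accuracy(brain_vecs, candidates, true_indices, k=10):
    """Fraction of trials whose true segment is among the top k guesses."""
    hits = sum(
        true_idx in top_k_candidates(vec, candidates, k)
        for vec, true_idx in zip(brain_vecs, true_indices)
    )
    return hits / len(true_indices)


# Synthetic stand-ins (assumptions, not study data):
# 1,000 candidate speech segments and 50 three-second "trials".
rng = random.Random(0)
DIM, N_CANDIDATES, N_TRIALS = 16, 1000, 50
candidates = [[rng.gauss(0, 1) for _ in range(DIM)] for _ in range(N_CANDIDATES)]
true_indices = [rng.randrange(N_CANDIDATES) for _ in range(N_TRIALS)]

# Simulated "brain" vectors: the true segment's vector plus random noise.
brain_vecs = [
    [x + rng.gauss(0, 1.0) for x in candidates[i]] for i in true_indices
]

accuracy = top_k_accuracy(brain_vecs, candidates, true_indices, k=10)
chance = 10 / N_CANDIDATES  # top-10 hit rate of a purely random guesser
print(f"top-10 accuracy: {accuracy:.2f} (chance: {chance:.2f})")
```

With 1,000 candidates, a guesser picking at random would land the right answer in its top 10 only 1 percent of the time, which is the baseline against which a top-10 score should be read.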
With magnetoencephalography, or MEG, the correct answer was among the AI’s top 10 guesses up to 73 percent of the time, the researchers found. With electroencephalography, that figure dropped to no more than 30 percent. “[That MEG] performance is really good,” Di Liberto says, but he is less optimistic about its practical use. As for what it can be used for today, his answer is blunt: nothing. The problem, he explains, is that MEG requires a bulky and expensive machine. Bringing this technology to clinics will require advances that make the equipment cheaper and simpler to use.
It’s also important to understand what “decoding” really means in this study, says Jonathan Brennan, a linguist at the University of Michigan in Ann Arbor. The word is often used to describe the process of extracting information directly from a source, in this case speech from brain activity. But the AI could do this only because it was given a limited list of possible correct answers from which to make its guesses. Because language is infinite, Brennan says, “that won’t do with language if we wish to scale to practical application.”
What’s more, Di Liberto says, the AI decoded data from participants who were passively listening to audio, which is not directly relevant to nonverbal patients. For the approach to become a useful communication tool, scientists will need to learn how to decode from brain activity what these patients intend to say, including expressions of hunger or discomfort or a simple “yes” or “no.” King agrees that the new study is about “decoding of speech perception, not creation.” Speech production may be the ultimate goal, but for now, “we’re quite a long way away.”