I know that I have always taken it for granted. It’s something I was born with, like most people, except that I seem to have been given an overabundance of it. In fact, it got me in trouble in my early years, but it also provided me with a job that started over 50 years ago. I consider myself a blabbermouth who was born to talk. When my parents adopted me at about two years of age, they said I ran around a lot and wouldn’t stop talking. Many years ago, when I had pneumonia, the doctor ordered me not to utter a sound for a week due to serious inflammation of my vocal cords. He told me I could permanently damage my voice if I kept talking. I was a radio DJ and the 11 pm weatherman for the company’s television station. Not being able to communicate with my voice gave me the longest seven days I have spent in my entire life. That’s why I decided to write about a new technology that has the potential to give people who can’t speak the ability to have others see, and even hear, their words.
What brought this topic to my attention is an article on medicalxpress.com titled “Synthetic speech generated from brain recordings.” The article came from the University of California, San Francisco (UCSF) and describes research performed in the laboratory of Edward Chang, MD, at UCSF. The study was published in the journal Nature. The research shows how the team created a synthesized version of a person’s voice that can be controlled by the activity of the brain’s speech center. According to Dr. Chang, “For the first time, this study demonstrates that we can generate entire spoken sentences based on an individual’s brain activity. This is an exhilarating proof of principle that with technology that is already within reach, we should be able to build a device that is clinically viable in patients with speech loss.”
The missing step is how to translate that brain activity into the movements of the lips, tongue, and jaw that actually form words. According to the article, the research team was headed by speech scientist Gopala Anumanchipalli, Ph.D., and bioengineering graduate student Josh Cartier, both of whom worked in Dr. Chang’s laboratory. The article quoted Dr. Anumanchipalli: “The relationship between the movement of the vocal tract and the speech sounds that are produced is a complicated one. We reasoned that if these speech centers in the brain are encoding movements rather than sounds, we should try to do the same in decoding those signals.”
For this research they chose five volunteer subjects from the UCSF Epilepsy Center, all of whom had the ability to speak.
The volunteers had electrodes temporarily implanted in their brains to map the source of their seizures before their planned neurosurgery. They were asked to read aloud several hundred sentences while the researchers recorded brain activity in the area where language is produced. According to the article, the researchers then used the recordings of the volunteers’ voices to reverse engineer the vocal tract movements needed to make those sounds. They determined the physical movements needed for “the pressing of the lips together here, tightening the vocal cords there, shifting the tip of the tongue to the roof of the mouth, then relaxing it, and so on.”
Quoting bioengineering graduate student Josh Cartier: “We still have a ways to go to perfectly mimic spoken language. We’re quite good at synthesizing slower speech sounds like ‘sh’ and ‘z’ as well as maintaining the rhythms and intonations of speech and the speaker’s gender and identity, but some of the more abrupt sounds like ‘b’s and ‘p’s get a bit fuzzy. Still, the levels of accuracy we produced here would be an amazing improvement in real-time communication compared to what’s currently available.”
The article explains: “The detailed mapping of sound to anatomy allowed the scientists to create a realistic virtual vocal tract for each participant that could be controlled by their brain activity. This comprised two ‘neural network’ machine learning algorithms: a decoder that transforms brain activity produced during speech into movements of the virtual vocal tract, and a synthesizer that converts these vocal tract movements into a synthetic approximation of the participant’s voice.” By using subjects who could speak, the researchers could, as mentioned before, work backward from recorded speech to the brain activity that produced it. The result could be the ability to use these same principles to give those who cannot speak an actual mechanical voice controlled by their own brains. It would have to work without the extra step of recording the person’s own voice to train the algorithm, since such a person wouldn’t be able to speak at all. The research continues.
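The article doesn’t describe the inner workings of those two neural networks, but the two-stage idea — first decode brain signals into vocal tract movements, then synthesize speech from those movements — can be sketched in very simplified form. Everything below (the array sizes, the random linear “models,” the function names) is my own illustration of the data flow, not the actual UCSF implementation, which used trained recurrent neural networks.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative dimensions -- these are assumptions, not from the study.
N_ELECTRODES = 16    # brain-recording channels
N_ARTICULATORS = 6   # e.g., lip, jaw, and tongue positions
N_AUDIO = 32         # spectral features per time step
T = 100              # time steps in one utterance

# Stage 1 stand-in: a "decoder" mapping brain activity to vocal tract
# movements. A random linear map keeps the sketch self-contained; the
# real system learned this mapping from the volunteers' recordings.
decoder_weights = rng.normal(size=(N_ELECTRODES, N_ARTICULATORS))

def decode_movements(brain_activity):
    """Map neural recordings (T x electrodes) to articulator trajectories."""
    return brain_activity @ decoder_weights

# Stage 2 stand-in: a "synthesizer" mapping those movements to audio
# features that can be turned into sound.
synth_weights = rng.normal(size=(N_ARTICULATORS, N_AUDIO))

def synthesize_audio(movements):
    """Map articulator trajectories to synthetic speech features."""
    return movements @ synth_weights

# One utterance flowing through both stages:
brain_activity = rng.normal(size=(T, N_ELECTRODES))
movements = decode_movements(brain_activity)
audio_features = synthesize_audio(movements)
print(movements.shape, audio_features.shape)  # (100, 6) (100, 32)
```

The point of the two-stage split is the one Dr. Anumanchipalli makes above: the intermediate representation is movement, not sound, which is thought to be closer to what the brain’s speech centers actually encode.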
Let me know what you would like me to talk about or explain. You can comment below or email me at: [email protected].