AI Generates Facial Reconstructions Just by Listening to People's Voices

Reconstructing a face from nothing but a person's voice is now possible. Scientists trained a neural network called Speech2Face on millions of educational videos from the internet that showed over 100,000 different people talking. According to the researchers, the AI learned associations between vocal cues and certain physical features of the human face, and then used an audio clip to model a photorealistic face matching the voice.

The details of the study were recently published online on the preprint server arXiv, and although the findings have not been peer-reviewed, they are causing quite a stir online.

The researchers stated that the AI doesn't (yet) know exactly what a specific individual looks like based on their voice alone. Instead, the neural network recognized certain markers in speech that pointed to gender, age and ethnicity, features that are shared by many people.

"As such, the model will only produce average-looking faces," the scientists wrote. "It will not produce images of specific individuals."

Although the faces generated by Speech2Face (as shown in the picture above, all facing front and with neutral expressions) didn't precisely match the people behind the voices, the reconstructions did usually capture the correct age ranges, ethnicities and genders of the individuals, according to the study.

Because the technology is still in its early stages, it is far from perfect. It showed "mixed performance" when given samples with language variations. When the algorithm listened to the same person speaking Chinese in one clip and English in another, it could not tell that both voices belonged to the same man and instead generated two different faces, one Asian and one white. It also showed gender bias, associating low-pitched voices with male faces and high-pitched voices with female faces.

The researchers further stated that because the AI was trained only on YouTube videos, it "does not represent equally the entire world population."

Meanwhile, some people are concerned that their YouTube videos are being used as training data for AI. That is what happened to Nick Sullivan, of Cloudflare in San Francisco, who unexpectedly spotted his own face among the examples used to train Speech2Face. For now, YouTube videos are widely considered available for researchers to use without acquiring additional permissions.
