They had a robot watch hundreds of hours of YouTube videos, and it ended up learning to talk and sing without anyone programming it

Published On: March 10, 2026 at 12:30 PM
Humanoid robot EMO with a silicone face designed to learn speech and lip movements by watching human videos.

What happens if you sit a humanoid robot in front of a mirror, then let it binge-watch hours of YouTube clips of people talking and singing? For researchers at Columbia Engineering, the result is a machine that can move its lips in sync with human speech in a way that feels surprisingly natural.

The work suggests that careful observation can, to a large extent, replace hand-written rules when robots learn complex gestures linked to language.

The robot, called EMO, is a soft-faced head packed with 26 tiny motors hidden under a silicone skin. Instead of being told exactly how to shape every word, EMO learns from trial and error, from watching its own reflection, and from studying people in online videos.

The team describes the approach in the journal Science Robotics and sees it as a step toward robots that communicate with faces as well as voices.

From mirror practice to a new kind of robot learning

The training starts with something close to baby talk. EMO is placed in front of a mirror and runs thousands of random motor commands while a camera records how its blue silicone face moves. Over time, the system builds an internal map that links each combination of motor signals to a specific facial shape.
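
For readers who want a concrete picture, this babbling stage boils down to a short data-collection loop like the one sketched below. Everything here is illustrative: the send_motor_command and capture_face_landmarks functions are hypothetical stand-ins for the robot's real interfaces, which the study does not expose.

```python
import numpy as np

NUM_MOTORS = 26      # EMO's face is driven by 26 actuators
NUM_SAMPLES = 5000   # "thousands of random motor commands", per the article

def send_motor_command(cmd):
    """Hypothetical stand-in: drive the facial motors to these positions."""
    pass

def capture_face_landmarks():
    """Hypothetical stand-in: return 2-D face landmarks seen in the mirror."""
    return np.random.rand(68, 2)  # dummy data so the sketch runs end to end

# Babbling loop: try a random face, record what it looks like.
dataset = []
for _ in range(NUM_SAMPLES):
    cmd = np.random.uniform(0.0, 1.0, size=NUM_MOTORS)  # random motor pattern
    send_motor_command(cmd)
    landmarks = capture_face_landmarks()                # resulting face shape
    dataset.append((cmd, landmarks))                    # (action, observation)
```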

Researchers describe this as a vision-to-action model, which sounds abstract but is easy to picture. The robot learns that a certain pattern of motors lifts the corners of the mouth, while another tightens the lips. In practical terms, that means EMO can decide which internal muscle pattern to use whenever it wants to match a target expression it sees.
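
Below is a minimal sketch of what that inverse mapping could look like, assuming a small neural network trained on the mirror data above. The MLP architecture and the 68-landmark face representation are illustrative guesses, not the paper's actual model.

```python
import torch
import torch.nn as nn

NUM_MOTORS = 26
NUM_LANDMARKS = 68 * 2   # assumed: 68 2-D facial landmarks, flattened

# Target face shape in, motor pattern out.
inverse_model = nn.Sequential(
    nn.Linear(NUM_LANDMARKS, 256),
    nn.ReLU(),
    nn.Linear(256, 256),
    nn.ReLU(),
    nn.Linear(256, NUM_MOTORS),
    nn.Sigmoid(),        # motor commands normalized to [0, 1]
)

optimizer = torch.optim.Adam(inverse_model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

def train_step(landmarks, commands):
    """One gradient step: predict the motors that produced a face shape."""
    pred = inverse_model(landmarks)
    loss = loss_fn(pred, commands)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# Example step on dummy data shaped like the babbling dataset:
landmarks = torch.rand(32, NUM_LANDMARKS)   # batch of flattened face shapes
commands = torch.rand(32, NUM_MOTORS)       # matching motor patterns
train_step(landmarks, commands)
```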

Watching YouTube to match speech and song

Once EMO understands its own face, the team moves on to human examples. The robot watches hours of YouTube videos of people speaking and singing, in English and in many other languages, while its AI lines up sounds with detailed mouth movements frame by frame. The system gradually learns which mouth shapes go with different syllables, vowels, and consonants.

Instead of following a script of “if this sound then that jaw motion,” the model predicts motor commands directly from audio.
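
In code, "predicts motor commands directly from audio" could look roughly like the toy model below, which turns a sequence of audio features into one 26-value motor frame per time step. The mel-spectrogram input and the GRU encoder are assumptions made for illustration; the study's actual architecture may differ.

```python
import torch
import torch.nn as nn

NUM_MOTORS = 26
N_MELS = 80              # assumed mel-spectrogram features per audio frame

class AudioToMotor(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.GRU(N_MELS, 128, batch_first=True)
        self.head = nn.Linear(128, NUM_MOTORS)

    def forward(self, mel):                       # mel: (batch, time, N_MELS)
        hidden, _ = self.encoder(mel)             # temporal context per frame
        return torch.sigmoid(self.head(hidden))   # (batch, time, NUM_MOTORS)

model = AudioToMotor()
mel = torch.randn(1, 200, N_MELS)    # ~2 seconds of dummy audio frames
motor_trajectory = model(mel)        # one 26-D motor command per frame
```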

In tests, this data-driven method beat five existing approaches at matching an ideal reference video of a human mouth. It also generated realistic lip motions across 11 non-English languages, including French, Chinese, and Arabic, even when some of those languages were not part of the training data.
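
How do you score something like that? One common approach, sketched below under the assumption that both mouths are tracked as 2-D landmark sequences, is to average the distance between robot and human lip landmarks frame by frame; the paper's exact metric may differ.

```python
import numpy as np

def mean_lip_landmark_error(robot_lips, human_lips):
    """Average Euclidean distance between aligned lip landmarks.

    robot_lips, human_lips: arrays of shape (frames, landmarks, 2),
    already normalized to the same scale and frame rate.
    """
    diffs = robot_lips - human_lips
    per_point = np.linalg.norm(diffs, axis=-1)   # distance per landmark
    return per_point.mean()                      # lower means tighter sync
```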

Why a soft face and flexible lips matter

Under EMO’s smooth skin sit 26 actuators that can move parts of the face independently, including lips with multiple degrees of freedom rather than a single clacking jaw.

That design lets the robot form subtle shapes that cover 24 consonants and 16 vowels, far beyond the simple open and close motion that makes many robots look like animated puppets. The goal is not only accuracy, but also to soften the “uncanny” feeling people get from stiff mechanical faces.

EMO has already starred in earlier research on human-robot facial co-expression, where it learned to predict a human smile almost a second before it appeared and mirror it in real time. That work showed how important timely, expressive faces can be for building trust in settings such as health care, education, or customer service.

The new study extends that idea from pure emotion into the messy, fast-changing world of spoken language.

Researcher adjusting the silicone face of EMO humanoid robot designed to learn speech and lip movements through AI training.
A researcher at Columbia Engineering works with the EMO robot, which learns realistic lip movements by watching hours of human speech videos.

From lab demo to singing, talking companions

The project is led by PhD researcher Yuhang Hu at the Creative Machines Lab, together with Professor Hod Lipson at Columbia University.

Lipson argues that much of robotics has focused on legs and hands, while faces have been neglected even though humans rely heavily on facial cues. In his words, “something magical happens when a robot learns to smile or speak just by watching and listening to humans,” and that magic could make interactions feel less like talking to a speaker on a stick.

Hu notes that combining realistic lip sync with conversational AI systems such as ChatGPT or Gemini could deepen the sense of connection when a robot talks to you across a table or on a video call.

As a playful test, the team even released an AI-generated album titled Hello World, where EMO sings about its own experience as a new robot. Behind the scenes, the work is supported by the US National Science Foundation and a research gift from Amazon, a sign that both public and private players see expressive robots as more than a lab curiosity.

There are still clear limits and risks. The current system struggles with certain hard sounds, and the researchers stress that robots with convincing faces need careful design so people do not forget they are dealing with machines. At the end of the day, though, many experts see this kind of observational learning as a key ingredient for more natural human-robot communication in everyday life.

The main study has been published in Science Robotics.



Sonia Ramírez

Journalist with more than 13 years of experience in radio and digital media. I have developed and led content on culture, education, international affairs, and trends, with a global perspective and the ability to adapt to diverse audiences. My work has had international reach, bringing complex topics to broad audiences in a clear and engaging way.
