We’ve all seen videos over the past few years demonstrating how agile humanoid robots have become, running and jumping with ease. We’re no longer surprised by this kind of agility—in fact, we’ve grown to expect it.
The problem is, these shiny demos lack real-world applications. When it comes to creating robots that are useful and safe around humans, the fundamentals of movement are more important. As a result, researchers are using the same techniques to train humanoid robots to achieve much more modest goals.
Alan Fern, a professor of computer science at Oregon State University, and a team of researchers have successfully trained a humanoid robot called Digit V3 to stand, walk, pick up a box, and move it from one location to another. Meanwhile, a separate group of researchers from the University of California, Berkeley, has focused on teaching Digit to walk in unfamiliar environments while carrying different loads, without toppling over. Their research was published today in Science Robotics.
Both groups are using an AI technique called sim-to-real reinforcement learning, a burgeoning method of training two-legged robots like Digit. Researchers believe it will lead to more robust, reliable two-legged machines capable of interacting with their surroundings more safely—as well as learning much more quickly.
Sim-to-real reinforcement learning involves training AI models to complete certain tasks in simulated environments billions of times before a robot powered by the model attempts to complete them in the real world. What would take years for a robot to learn in real life can take just days thanks to repeated trial-and-error testing in simulations.
The robot's movements are guided by a neural network trained with a mathematical reward function: every time the robot moves closer to its target location or completes its goal behavior, it is rewarded with a large number. If it does something it's not supposed to do, like falling down, it's "punished" with a negative number, so it learns to avoid those motions over time.
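To make the idea concrete, here is a minimal sketch of what such a reward function can look like in code. It is illustrative only, not code from either team; the state fields, weights, and thresholds (the `torso_height` field, the 0.4-meter fall threshold, the penalty and bonus values) are hypothetical.

```python
import numpy as np

def reward(state, target_position):
    """Score one simulation step: positive for progress, negative for falling.

    `state` is assumed to be a dict with a "position" array and a
    "torso_height" float; all numbers here are made up for illustration.
    """
    # Reward progress: the closer the robot is to its target, the higher the score.
    distance = np.linalg.norm(state["position"] - target_position)
    progress_reward = -distance  # smaller distance -> larger (less negative) reward

    # "Punish" falling: a torso below a hypothetical height threshold counts as a fall.
    fall_penalty = -100.0 if state["torso_height"] < 0.4 else 0.0

    # Large bonus for completing the goal behavior, here defined as reaching the target.
    goal_bonus = 10.0 if distance < 0.1 else 0.0

    return progress_reward + fall_penalty + goal_bonus
```

In practice, the simulator evaluates a scoring function like this billions of times, and the training algorithm nudges the neural network's weights toward actions that earn higher cumulative rewards.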
In previous projects, researchers from Oregon State University had used the same reinforcement learning technique to teach a two-legged robot named Cassie to run. The approach paid off: Cassie became the first robot to run an outdoor 5K, then set a Guinness World Record for the fastest 100 meters by a bipedal robot and mastered the ability to jump from one location to another with ease.
Training robots to behave in athletic ways requires them to develop really complex skills in very narrow environments, says Ilija Radosavovic, a PhD student at Berkeley who trained Digit to carry a wide range of loads and stabilize itself when poked with a stick. "We're sort of the opposite—focusing on fairly simple skills in broad environments."
This new wave of research in humanoid robotics is less concerned with speed and agility, and more focused on making machines robust and adaptable, which is ultimately what's needed to make them useful in the real world. Humanoid robots remain a relative rarity in work environments, partly because they often struggle to balance while carrying heavy objects. That's why most robots designed to lift objects of varying weights in factories and warehouses tend to have four legs or larger, more stable bases. Researchers hope to change that by using AI techniques to make humanoid robots more reliable.
Reinforcement learning will usher in a “new, much more flexible and faster way for training these types of manipulation skills,” Fern says. He and his team are due to present their findings at ICRA, the International Conference on Robotics and Automation, in Japan next month.
The ultimate goal is for a human to be able to show the robot a video of the desired task, like picking up a box from one shelf and pushing it onto another higher shelf, and then have the robot do it without requiring any further instruction, says Fern.
Getting robots to observe, copy, and quickly learn these kinds of behaviors would be really useful, but it remains a challenge, says Lerrel Pinto, an assistant professor of computer science at New York University, who was not involved in the research. "If that could be done, I would be very impressed by that," he says. "These are hard problems."