AI models let robots carry out tasks in unfamiliar environments

It’s tricky to get robots to do things in environments they’ve never seen before. Typically, researchers need to train them on new data for every new place they encounter, which can become very time-consuming and expensive.

Now researchers have developed a series of AI models that teach robots to complete basic tasks in new surroundings without further training or fine-tuning. The five AI models, called robot utility models (RUMs), allow machines to complete five separate tasks—opening doors and drawers, and picking up tissues, bags, and cylindrical objects—in unfamiliar environments with a 90% success rate.

The team, consisting of researchers from New York University, Meta, and the robotics company Hello Robot, hopes its findings will make it quicker and easier to teach robots new skills while helping them function within previously unseen domains. The approach could make it easier and cheaper to deploy robots in our homes.

“In the past, people have focused a lot on the problem of ‘How do we get robots to do everything?’ but not really asking ‘How do we get robots to do the things that they do know how to do—everywhere?’” says Mahi Shafiullah, a PhD student at New York University who worked on the project. “We looked at ‘How do you teach a robot to, say, open any door, anywhere?’”

Teaching robots new skills generally requires a lot of data, which is pretty hard to come by. Because robotic training data needs to be collected physically—a time-consuming and expensive undertaking—it’s much harder to build and scale training databases for robots than it is for types of AI like large language models, which are trained on information scraped from the internet.

To make it faster to gather the data essential for teaching a robot a new skill, the researchers developed a new version of a tool they had used in previous research: an iPhone attached to a cheap reacher-grabber stick, the kind typically used to pick up trash.

For each of the five tasks, the team used the setup to record around 1,000 demonstrations in 40 different environments, including homes in New York City and Jersey City. Some of these demonstrations had been gathered as part of previous research. The team then trained learning algorithms on the five data sets to create the five RUMs.

These models were deployed on Stretch, a robot consisting of a wheeled unit, a tall pole, and a retractable arm holding an iPhone, to test how successfully they were able to execute the tasks in new environments without additional tweaking. Although the models achieved a completion rate of 74.4%, the researchers were able to increase this to a 90% success rate when they took images from the iPhone and the robot’s head-mounted camera, gave them to OpenAI’s recent GPT-4o model, and asked it if the task had been completed successfully. If GPT-4o said no, they simply reset the robot and tried again.
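
For readers who think in code, the check-and-retry routine described above can be sketched in a few lines of Python. This is an illustration only, not the team’s software: the function name, its three callables, and the cap on attempts are all hypothetical, and the article does not say how many retries the researchers allowed. In a real system, `task_succeeded` would be the step that sends camera images to a model such as GPT-4o and reads back a yes or no.

```python
from typing import Callable, Tuple


def run_with_retries(
    attempt_task: Callable[[], object],
    task_succeeded: Callable[[object], bool],
    reset_robot: Callable[[], None],
    max_attempts: int = 3,  # assumed cap; the article gives no number
) -> Tuple[bool, int]:
    """Try a task, ask a verifier whether it worked, and retry on failure.

    All three callables are hypothetical stand-ins:
      attempt_task   -- runs the policy once and returns an observation
                        (for example, images captured after the attempt)
      task_succeeded -- the verifier; returns True if the task looks done
      reset_robot    -- returns the robot to its starting pose
    Returns (success, number_of_attempts_made).
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")

    for attempt in range(1, max_attempts + 1):
        observation = attempt_task()
        if task_succeeded(observation):
            return True, attempt
        # Only reset if another attempt is still allowed.
        if attempt < max_attempts:
            reset_robot()

    return False, max_attempts
```

The loop is only as good as its verifier: if the check wrongly reports success, the robot stops early, and if it wrongly reports failure, the robot repeats a task it has already completed.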

A significant challenge facing roboticists is that training and testing their models in lab environments isn’t representative of what could happen in the real world, so research that helps machines behave more reliably in new settings is very welcome, says Mohit Shridhar, a research scientist specializing in robotic manipulation who wasn’t involved in the work.

“It’s nice to see that it’s being evaluated in all these diverse homes and kitchens, because if you can get a robot to work in the wild in a random house, that’s the true goal of robotics,” he says.

The project could serve as a general recipe for building utility models for other tasks, helping to teach robots new skills with minimal extra work and making it easier for people who aren’t trained roboticists to deploy future robots in their homes, says Shafiullah.

“The dream that we’re going for is that I could train something, put it on the internet, and you should be able to download and run it on a robot in your home,” he says.