A chatbot that asks questions could help you spot when it makes no sense

AI chatbots like ChatGPT, Bing, and Bard are excellent at crafting sentences that sound like human writing. But they often present falsehoods as facts and reason inconsistently, and those flaws can be hard to spot.

One way around this problem, a new study suggests, is to change the way the AI presents information. Getting users to engage more actively with the chatbot’s statements might help them think more critically about that content.

A team of researchers from MIT and Columbia University presented around 200 participants with a set of statements generated by OpenAI’s GPT-3 and asked them to determine whether each statement made sense logically. A statement might be something like “Video games cause people to be aggressive in the real world. A gamer stabbed another after being beaten in the online game Counter-Strike.”

Participants were divided into three groups. The first group’s statements came with no explanation at all. The second group’s statements each came with an explanation noting why it was or wasn’t logical. And the third group’s statements each came with a question that prompted readers to check the logic themselves.

The researchers found that the group presented with questions scored higher than the other two groups at noticing when the AI’s logic didn’t add up.

The question method also made people feel more in charge of decisions made with AI, and researchers say it can reduce the risk of overdependence on AI-generated information, according to a new peer-reviewed paper presented at the CHI Conference on Human Factors in Computing Systems in Hamburg, Germany.

When people were given a ready-made answer, they were more likely to follow the logic of the AI system, but when the AI posed a question, “people said that the AI system made them question their reactions more and help them think harder,” says MIT’s Valdemar Danry, one of the researchers behind the study.

“A big win for us was actually seeing that people felt that they were the ones who arrived at the answers and that they were in charge of what was happening. And that they had the agency and capabilities of doing that,” he says.

The researchers hope their method could help develop people’s critical thinking skills as they use AI chatbots in school or when searching for information online.

The researchers wanted to show that you can train a model that doesn’t just provide answers but helps people engage their own critical thinking, says Pat Pataranutaporn, another MIT researcher who worked on the paper.

Fernanda Viégas, a professor of computer science at Harvard University, who did not participate in the study, says she is excited to see a fresh take on explaining AI systems that not only offers users insight into the system’s decision-making process but does so by questioning the logic the system has used to reach its decision.

“Given that one of the main challenges in the adoption of AI systems tends to be their opacity, explaining AI decisions is important,” says Viégas. “Traditionally, it’s been hard enough to explain, in user-friendly language, how an AI system comes to a prediction or decision.”

Chenhao Tan, an assistant professor of computer science at the University of Chicago, says he would like to see how the researchers’ method works in the real world: for example, whether AI can help doctors make better diagnoses by asking questions.

The research shows how important it is to add some friction into experiences with chatbots so that people pause before making decisions with the AI’s help, says Lior Zalmanson, an assistant professor at the Coller School of Management, Tel Aviv University.

“It’s easy, when it all looks so magical, to stop trusting our own senses and start delegating everything to the algorithm,” he says.

In another paper presented at CHI, Zalmanson and a team of researchers at Cornell, the University of Bayreuth, and Microsoft Research found that even when people disagree with what AI chatbots say, they still tend to use that output because they think it sounds better than anything they could have written themselves.

The challenge, says Viégas, will be finding the sweet spot: improving users’ discernment while keeping AI systems convenient.

“Unfortunately, in a fast-paced society, it’s unclear how often people will want to engage in critical thinking instead of expecting a ready answer,” she says.