Eric Schmidt: This is how AI will transform the way science gets done

It’s yet another summer of extreme weather, with unprecedented heat waves, wildfires, and floods battering countries around the world. In response to the challenge of accurately predicting such extremes, semiconductor giant Nvidia is building an AI-powered “digital twin” for the entire planet.

This digital twin, called Earth-2, will use predictions from FourCastNet, an AI model that uses tens of terabytes of Earth system data and can predict the next two weeks of weather tens of thousands of times faster, and more accurately, than current forecasting methods.

Conventional weather prediction systems can generate around 50 predictions for the week ahead. FourCastNet can instead predict thousands of possibilities, accurately capturing the risk of rare but deadly disasters and thereby giving vulnerable populations valuable time to prepare and evacuate.
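A little arithmetic shows why the number of predictions matters for rare events. The sketch below is illustrative only. It assumes each prediction independently shows a given rare event with some small probability, which is a simplification and not a description of how FourCastNet or any operational forecasting system works. The function name and the example probability of 0.001 are hypothetical.

```python
def chance_ensemble_sees_event(p_event: float, n_members: int) -> float:
    """Probability that at least one of n_members independent predictions
    contains an event that each prediction shows with probability p_event.

    Assumption (not from the article): predictions are independent.
    """
    if not 0.0 <= p_event <= 1.0:
        raise ValueError("p_event must be between 0 and 1")
    if n_members < 0:
        raise ValueError("n_members must not be negative")
    # Complement rule: 1 minus the chance that every member misses the event.
    return 1.0 - (1.0 - p_event) ** n_members


if __name__ == "__main__":
    # Compare an ensemble of 50 predictions with one of 5,000.
    for n in (50, 5000):
        print(n, round(chance_ensemble_sees_event(0.001, n), 3))
```

Under these assumptions, a one-in-a-thousand event would appear in roughly 5% of 50-member ensembles but in more than 99% of 5,000-member ensembles.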

The hoped-for revolution in climate modeling is just the beginning. With the advent of AI, science is about to become much more exciting—and in some ways unrecognizable. The reverberations of this shift will be felt far outside the lab; they will affect us all.

If we play our cards right, with sensible regulation and proper support for innovative uses of AI to address science’s most pressing issues, AI can rewrite the scientific process. We can build a future where AI-powered tools both save us from mindless, time-consuming labor and lead us to creative inventions and discoveries, encouraging breakthroughs that would otherwise take decades.

AI in recent months has become almost synonymous with large language models, or LLMs, but in science there are a multitude of different model architectures that may have even bigger impacts. In the past decade, most progress in science has come through smaller, “classical” models focused on specific questions. These models have already brought about profound advances. More recently, larger deep-learning models that are beginning to incorporate cross-domain knowledge and generative AI have expanded what is possible.

Scientists at McMaster and MIT, for example, used an AI model to identify an antibiotic to combat a pathogen that the World Health Organization labeled one of the world’s most dangerous antibiotic-resistant bacteria for hospital patients. A Google DeepMind model can control plasma in nuclear fusion reactions, bringing us closer to a clean-energy revolution. Within health care, the US Food and Drug Administration has already cleared 523 devices that use AI—75% of them for use in radiology.

Reimagining science

At its core, the scientific process we all learned in elementary school will remain the same: conduct background research, identify a hypothesis, test it through experimentation, analyze the collected data, and reach a conclusion. But AI has the potential to revolutionize how each of these components looks in the future.

Artificial intelligence is already transforming how some scientists conduct literature reviews. Tools like PaperQA and Elicit harness LLMs to scan databases of articles and produce succinct and accurate summaries of the existing literature—citations included.

Once the literature review is complete, scientists form a hypothesis to be tested. LLMs at their core work by predicting the next word in a sentence, building up to entire sentences and paragraphs. This technique makes LLMs uniquely suited to scaled problems intrinsic to science’s hierarchical structure and could enable them to predict the next big discovery in physics or biology.
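To make the idea of next-word prediction concrete, here is a toy sketch. It is not how a real LLM is built: production models use neural networks trained on enormous corpora, whereas this example simply counts which word most often follows another in a tiny sample text. All function names and the sample sentence are hypothetical.

```python
from collections import Counter, defaultdict


def train_bigrams(text):
    """Count, for every word, which words follow it in the text."""
    counts = defaultdict(Counter)
    words = text.lower().split()
    for current, following in zip(words, words[1:]):
        counts[current][following] += 1
    return counts


def predict_next(counts, word):
    """Return the most frequent follower of word, or None if none was seen."""
    followers = counts.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]


def generate(counts, start, max_words=8):
    """Build a phrase one word at a time by repeatedly predicting the next word."""
    words = [start.lower()]
    for _ in range(max_words - 1):
        next_word = predict_next(counts, words[-1])
        if next_word is None:
            break
        words.append(next_word)
    return " ".join(words)


if __name__ == "__main__":
    sample = (
        "scientists form a hypothesis scientists test a hypothesis "
        "scientists form a conclusion"
    )
    model = train_bigrams(sample)
    print(generate(model, "scientists", max_words=4))
```

The same loop of predicting one word, appending it, and predicting again is what "building up to entire sentences and paragraphs" refers to, only at vastly larger scale.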

AI can also spread the search net for hypotheses wider and narrow the net more quickly. As a result, AI tools can help formulate stronger hypotheses, such as models that spit out more promising candidates for new drugs. We’re already seeing simulations running multiple orders of magnitude faster than just a few years ago, allowing scientists to try more design options in simulation before carrying out real-world experiments.

Scientists at Caltech, for example, used an AI fluid simulation model to automatically design a better catheter that prevents bacteria from swimming upstream and causing infections. This kind of ability will fundamentally shift the incremental process of scientific discovery, allowing researchers to design for the optimal solution from the outset rather than progress through a long line of progressively better designs, as we saw in years of innovation on filaments in lightbulb design.

Moving on to the experimentation step, AI will be able to conduct experiments faster, cheaper, and at greater scale. For example, we can build AI-powered machines with hundreds of micropipettes running day and night to create samples at a rate no human could match. Instead of limiting themselves to just six experiments, scientists can use AI tools to run a thousand.

Scientists who are worried about their next grant, publication, or tenure process will no longer be bound to safe experiments with the highest odds of success; they will be free to pursue bolder and more interdisciplinary hypotheses. When evaluating new molecules, for example, researchers tend to stick to candidates similar in structure to those we already know, but AI models need not share the same biases and constraints.

Eventually, much of science will be conducted at “self-driving labs”—automated robotic platforms combined with artificial intelligence. Here, we can bring AI prowess from the digital realm into the physical world. Such self-driving labs are already emerging at companies like Emerald Cloud Lab and Artificial and even at Argonne National Laboratory.

Finally, at the stage of analysis and conclusion, self-driving labs will move beyond automation and, informed by the experimental results they produce, use LLMs to interpret the results and recommend the next experiment to run. Then, as a partner in the research process, the AI lab assistant could order supplies to replace those used in earlier experiments and set up and run the next recommended experiments overnight, with results ready to deliver in the morning, all while the experimenter is home sleeping.
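The run, interpret, and recommend cycle can be sketched in a few lines. This is a minimal illustration under loud assumptions: the "experiment" is a made-up formula standing in for a robotic measurement, and the recommendation step is a plain hill-climbing search swapped in for the LLM-based interpretation described above. It does not represent any real lab's software, and every name and number in it is hypothetical.

```python
def run_experiment(temperature_c: float) -> float:
    """Stand-in for a robotic experiment (hypothetical): returns a simulated
    yield that peaks at 70 degrees C. A real lab would measure this."""
    return 100.0 - (temperature_c - 70.0) ** 2 / 10.0


def closed_loop(start: float, step: float, max_runs: int = 20):
    """Repeatedly run an experiment, keep the best setting so far, and
    choose the next setting to try from that result (hill climbing)."""
    history = [(start, run_experiment(start))]
    best_setting, best_result = history[0]
    while len(history) < max_runs and step >= 1.0:
        improved = False
        tested = {setting for setting, _ in history}
        # Recommend neighbours of the current best setting.
        for candidate in (best_setting + step, best_setting - step):
            if candidate in tested or len(history) >= max_runs:
                continue  # skip repeats and respect the experiment budget
            result = run_experiment(candidate)
            history.append((candidate, result))
            if result > best_result:
                best_setting, best_result = candidate, result
                improved = True
                break
        if not improved:
            step /= 2.0  # nothing better nearby: search more finely
    return best_setting, best_result, history


if __name__ == "__main__":
    setting, result, log = closed_loop(start=50.0, step=8.0)
    print(setting, result, len(log))
```

The design point is the loop itself: each result feeds the choice of the next experiment without a human in between, and the budget (`max_runs`) bounds how much the system does unattended.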

Possibilities and limitations

Young researchers might be shifting nervously in their seats at the prospect. Luckily, the new jobs that emerge from this revolution are likely to be more creative and less mindless than most current lab work.

AI tools can lower the barrier to entry for new scientists and open up opportunities to those traditionally excluded from the field. With LLMs able to assist in building code, STEM students will no longer have to master obscure coding languages, opening the doors of the ivory tower to new, nontraditional talent and making it easier for scientists to engage with fields beyond their own. Soon, specifically trained LLMs might move beyond offering first drafts of written work like grant proposals and might be developed to offer “peer” reviews of new papers alongside human reviewers.

AI tools have incredible potential, but we must recognize where the human touch is still important and avoid running before we can walk. For example, successfully melding AI and robotics through self-driving labs will not be easy. There is a lot of tacit knowledge that scientists learn in labs that is difficult to pass to AI-powered robotics. Similarly, we should be cognizant of the limitations—and even hallucinations—of current LLMs before we offload much of our paperwork, research, and analysis to them.

Companies like OpenAI and DeepMind are still leading the way in new breakthroughs, models, and research papers, but the current dominance of industry won’t last forever. DeepMind has so far excelled by focusing on well-defined problems with clear objectives and metrics. One of its most famous successes came at the Critical Assessment of Structure Prediction, a biennial competition where research teams predict a protein’s exact shape from the order of its amino acids.

From 2006 to 2016, the average score in the hardest category ranged from around 30 to 40 on CASP’s scale of 0 to 100. Suddenly, in 2018, DeepMind’s AlphaFold model scored a whopping 58. An updated version called AlphaFold2 scored 87 two years later, leaving its human competitors even further in the dust.

Thanks to open-source resources, we’re beginning to see a pattern where industry hits certain benchmarks and then academia steps in to refine the model. After DeepMind’s release of AlphaFold, Minkyung Baek and David Baker at the University of Washington released RoseTTAFold, which uses DeepMind’s framework to predict the structures of protein complexes instead of only the single protein structures that AlphaFold could originally handle. More important, academics are more shielded from the competitive pressures of the market, so they can venture beyond the well-defined problems and measurable successes that attract DeepMind.

In addition to reaching new heights, AI can help verify what we already know by addressing science’s replicability crisis. Around 70% of scientists report having been unable to reproduce another scientist’s experiment—a disheartening figure. As AI lowers the cost and effort of running experiments, it will in some cases be easier to replicate results or conclude that they can’t be replicated, contributing to a greater trust in science.

The key to replicability and trust is transparency. In an ideal world, everything in science would be open access, from articles without paywalls to open-source data, code, and models. Sadly, with the dangers that such models are able to unleash, it isn’t always realistic to make all models open source. In many cases, the risks of being completely transparent outweigh the benefits of trust and equity. Nevertheless, to the extent that we can be transparent with models—especially classical AI models with more limited uses—we should be.

The importance of regulation

With all these areas, it’s essential to remember the inherent limitations and risks of artificial intelligence. AI is such a powerful tool because it allows humans to accomplish more with less: less time, less education, less equipment. But these capabilities make it a dangerous weapon in the wrong hands. Andrew White, a professor at the University of Rochester, was contracted by OpenAI to participate in a “red team” that could expose GPT-4’s risks before it was released. Using the language model and giving it access to tools, White found it could propose dangerous compounds and even order them from a chemical supplier. To test the process, he had a (safe) test compound shipped to his house the next week. OpenAI says it used his findings to tweak GPT-4 before it was released.

Even humans with entirely good intentions can still prompt AIs to produce bad outcomes. We should worry less about creating the Terminator and, as computer scientist Stuart Russell has put it, more about becoming King Midas, who wished for everything he touched to turn to gold and thereby accidentally killed his daughter with a hug.

We have no mechanism to prompt an AI to change its goal, even when it reacts to its goal in a way we don’t anticipate. One oft-cited hypothetical asks you to imagine telling an AI to produce as many paper clips as possible. Determined to accomplish its goal, the model hijacks the electrical grid and kills any human who tries to stop it as the paper clips keep piling up. The world is left in shambles. The AI pats itself on the back; it has done its job. (In a wink to this famous thought experiment, many OpenAI employees carry around branded paper clips.)

OpenAI has managed to implement an impressive array of safeguards, but these will only remain in place as long as GPT-4 is housed on OpenAI’s servers. The day will likely soon come when someone manages to copy the model and house it on their own servers. Such frontier models need to be protected to prevent thieves from removing the AI safety guardrails so carefully added by their original developers.

To address both intentional and unintentional bad uses of AI, we need smart, well-informed regulation, covering both tech giants and open-source models, that doesn’t keep us from using AI in ways that can be beneficial to science. Although tech companies have made strides in AI safety, government regulators are currently woefully underprepared to enact proper laws and should take greater steps to educate themselves on the latest developments.

Beyond regulation, governments, along with philanthropy, can support scientific projects with a high social return but little financial return or academic incentive. Several areas are especially urgent, including climate change, biosecurity, and pandemic preparedness. It is in these areas that we most need the speed and scale that AI simulations and self-driving labs offer.

Government can also help develop large, high-quality data sets such as those on which AlphaFold relied—insofar as safety concerns allow. Open data sets are public goods: they benefit many researchers, but researchers have little incentive to create them themselves. Government and philanthropic organizations can work with universities and companies to pinpoint seminal challenges in science that would benefit from access to powerful databases.

Chemistry, for example, has one language that unites the field, which would seem to lend itself to easy analysis by AI models. But no one has properly aggregated data on molecular properties stored across dozens of databases, which keeps us from accessing insights into the field that would be within reach of AI models if we had a single source. Biology, meanwhile, lacks the known and calculable data that underlies physics or chemistry, with subfields like intrinsically disordered proteins that are still mysterious to us. It will therefore require a more concerted effort to understand—and even record—the data for an aggregated database.

The road ahead to broad AI adoption in the sciences is long, with a lot that we must get right, from building the right databases to implementing the right regulations, and from mitigating biases in AI algorithms to ensuring equal access to computing resources across borders.

Nevertheless, this is a profoundly optimistic moment. Previous paradigm shifts in science, like the emergence of the scientific process or big data, have been inwardly focused—making science more precise, accurate, and methodical. AI, meanwhile, is expansive, allowing us to combine information in novel ways and bring creativity and progress in the sciences to new heights.

Eric Schmidt was the CEO of Google from 2001 to 2011. He is currently cofounder of Schmidt Futures, a philanthropic initiative that bets early on exceptional people making the world better, applying science and technology, and bringing people together across fields.