Ice Lounge Media

This is today’s edition of The Download, our weekday newsletter that provides a daily dose of what’s going on in the world of technology.

Google DeepMind has a new way to look inside an AI’s “mind”

We don’t know exactly how AI works, or why it works so well. That’s a problem: It could lead us to deploy an AI system in a highly sensitive field like medicine without understanding that it could have critical flaws embedded in its workings.

A team at Google DeepMind that studies something called mechanistic interpretability has been working on new ways to let us peer under the hood. It recently released a tool to help researchers understand what is happening when AI is generating an output. 

It’s all part of a push to get a better understanding of exactly what is happening inside an AI model. If we do, we’ll be able to control its outputs more effectively, leading to better AI systems in the future. Read the full story.

—Scott J Mulligan

What’s on the table at this year’s UN climate conference

Talks kicked off this week at COP29 in Baku, Azerbaijan. Running for a couple of weeks each year, the global summit is the largest annual meeting on climate change.

The issue on the table this time around: Countries need to agree to set a new goal on how much money should go to developing countries to help them finance the fight against climate change. Complicating things? A US president-elect whose approach to climate is very different from that of the current administration (understatement of the century).

This is a big moment that could set the tone for what the next few years of the international climate world looks like. Here’s what you need to know about COP29 and how Donald Trump’s election is coloring things.

—Casey Crownhart

This story is from The Spark, our weekly newsletter giving you the inside track on all things energy and climate. Sign up to receive it in your inbox every Wednesday.

The must-reads

I’ve combed the internet to find you today’s most fun/important/scary/fascinating stories about technology.

1 The FBI is investigating crypto predictions-betting platform Polymarket 
It’s investigating whether the firm allowed US traders to bet on the election. (Bloomberg $)
+ Doing so would have been a violation of an agreement with the US government. (NYT $)
+ Polymarket claims to be a “fully transparent prediction market.” (WSJ $)

2 OpenAI is calling for the US government to invest in AI
Without financial support, the US could lose crucial ground to China, it warns. (WP $)
+ The firm floated the idea of building a colossal data center. (The Information $)

3 AI-generated Elon Musk propaganda is rife on Facebook
Pro-Musk inspiration porn is the content of choice for spammers. (404 Media)
+ Trump is surrounding himself with terminally online edgelords. (The Atlantic $)

4 The online right has a misogynistic new rallying cry
‘Your body, my choice’ is being spread by young men seeking to provoke. (New Yorker $)
+ The upcoming presidency could usher in an age of gendered regression. (The Guardian)

5 China’s human factory workers are under pressure
Robots are creeping into every level of the manufacturing process. (FT $)
+ Three reasons robots are about to become way more useful. (MIT Technology Review)

6 The future of chipmaking in America
Efforts to revitalize domestic facilities aren’t exactly going to plan. (Wired $)
+ What’s next in chips. (MIT Technology Review)

7 Blindbox live streaming is thrilling shoppers in China
You never know what you’re going to get. (NYT $)

8 What the glacial Earth may have looked like
Around 700 million years ago, the entire planet was covered in ice. (Ars Technica)
+ Life-seeking, ice-melting robots could punch through Europa’s icy shell. (MIT Technology Review)

9 How to protect the world’s largest single coral colony
The newly discovered colony is the size of two basketball courts. (Vox)
+ The race is on to save coral reefs—by freezing them. (MIT Technology Review)

10 These researchers have reinvented the wheel
This ‘morphing’ wheel can roll over obstacles up to 1.3 times its radius in height. (Reuters)

Quote of the day

“Shawty crunk, so fresh, so clean.”

—Mark Zuckerberg, Meta CEO turned rapper, debuting a reworked version of the 2002 rap hit “Get Low” as a tribute to his wife, the Wall Street Journal reports.

The big story

Marseille’s battle against the surveillance state

June 2022

Across the world, video cameras have become an accepted feature of urban life. Many cities in China now have dense networks of them, and London and New Delhi aren’t far behind. Now France is playing catch-up.

Concerns have been raised throughout the country. But the surveillance rollout has met special resistance in Marseille, France’s second-biggest city.

It’s unsurprising, perhaps, that activists are fighting back against the cameras, highlighting the surveillance system’s overreach and underperformance. But are they succeeding? Read the full story.

—Fleur Macdonald

We can still have nice things

A place for comfort, fun and distraction to brighten up your day. (Got any ideas? Drop me a line or tweet ’em at me.)

+ This year’s gurning championship winning mugshots do not disappoint.
+ What does it mean to have personal style, exactly?
+ Amsterdam’s unofficial police cat is absolutely adorable (and he lives on a boat!)
+ Save the worms—this writer certainly is. 🪱

Read more

This article is from The Spark, MIT Technology Review’s weekly climate newsletter. To receive it in your inbox every Wednesday, sign up here.

It’s time for a party—the Conference of the Parties, that is. Talks kicked off this week at COP29 in Baku, Azerbaijan. Running for a couple of weeks each year, the global summit is the largest annual meeting on climate change.

The issue on the table this time around: Countries need to agree to set a new goal on how much money should go to developing countries to help them finance the fight against climate change. Complicating things? A US president-elect whose approach to climate is very different from that of the current administration (understatement of the century).

This is a big moment that could set the tone for what the next few years of the international climate world looks like. Here’s what you need to know about COP29 and how Donald Trump’s election is coloring things.

The UN COP meetings are an annual chance for nearly 200 nations to get together to discuss (and hopefully act on) climate change. Greatest hits from the talks include the Paris Agreement, a 2015 global accord that set a goal to limit global warming to 1.5 °C (2.7 °F) above preindustrial levels.

This year, the talks are in Azerbaijan, a petrostate if there ever was one. Oil and gas production makes up over 90% of the country’s export revenue and nearly half its GDP as of 2022. A perfectly ironic spot for a global climate summit!

The biggest discussion this year centers on global climate finance—specifically, how much of it is needed to help developing countries address climate change and adapt to changing conditions. The current goal, set in 2009, is for industrialized countries to provide $100 billion each year to developing nations. The deadline was 2020, and that target was actually met for the first time in 2022, according to the Organization for Economic Cooperation and Development, which keeps track of total finance via reports from contributing countries. Currently, most of that funding is in the form of public loans and grants.

The thing is, that $100 billion number was somewhat arbitrary—in Paris in 2015, countries agreed that a new, larger target should be set in 2025 to take into account how much countries actually need.

It’s looking as if the magic number is somewhere around $1 trillion each year. However, it remains to be seen how this goal will end up shaking out, because there are disagreements about basically every part of this. What should the final number be? What kind of money should count—just public funds, or private investments as well? Which nations should pay? How long will this target stand? What, exactly, would this money be going toward?

Working out all those details is why nations are gathering right now. But one shadow looming over these negotiations is the impending return of Donald Trump.

As I covered last week, Trump’s election will almost certainly result in less progress on cutting emissions than we might have seen under a more climate-focused administration. But arguably an even bigger deal than domestic progress (or lack thereof) will be how Trump shifts the country’s climate position on the international stage.

The US has emitted more carbon pollution into the atmosphere than any other country, it currently leads the world in per capita emissions, and it’s the world’s richest economy. If anybody should be a leader at the table in talks about climate finance, it’s the US. And yet, Trump is coming into power soon, and we’ve all seen this film before. 

Last time Trump was in office, he pulled the US out of the Paris Agreement. He’s made promises to do it again—and could go one step further by backing out of the UN Framework Convention on Climate Change (UNFCCC) altogether. If leaving the Paris Agreement is walking away from the table, withdrawing from the UNFCCC is like hopping on a rocket and blasting in a different direction. It’s a more drastic action and could be tougher to reverse in the future, though experts also aren’t sure if Trump could technically do this on his own.

The uncertainty of what happens next in the US is a cloud hanging over these negotiations. “This is going to be harder because we don’t have a dynamic and pushy and confident US helping us on climate action,” said Camilla Born, an independent climate advisor and former UK senior official at COP26, during an online event last week hosted by Carbon Brief.

Some experts are confident that others will step up to fill the gap. “There are many drivers of climate action beyond the White House,” said Mohamed Adow, founding director of Power Shift Africa, at the Carbon Brief event.

If I could characterize the current vibe in the climate world, it’s uncertainty. But the negotiations over the next couple of weeks could provide clues to what we can expect for the next few years. Just how much will a Trump presidency slow global climate action? Will the European Union step up? Could this cement the rise of China as a climate leader? We’ll be watching it all.


Now read the rest of The Spark

Related reading

In case you want some additional context from the last few years of these meetings, here’s my coverage of last year’s fight at COP28 over a transition away from fossil fuels, and a newsletter about negotiations over the “loss and damages” fund at COP27.

For the nitty-gritty details about what’s on the table at COP29, check out this very thorough explainer from Carbon Brief.

[Image: The White House in Washington, DC, under dark storm clouds. Dan Thornberg/Adobe Stock]

Another thing

Trump’s election will have significant ripple effects across the economy and our lives. His victory is a tragic loss for climate progress, as my colleague James Temple wrote in an op-ed last week. Give it a read, if you haven’t already, to dig into some of the potential impacts we might see over the next four years and beyond. 

Keeping up with climate  

The US Environmental Protection Agency finalized a rule to fine oil and gas companies for methane emissions. The fee was part of the Inflation Reduction Act of 2022. (Associated Press)
→ This rule faces a cloudy future under the Trump administration; industry groups are already talking about repealing it. (NPR)

Speaking of the EPA, Donald Trump chose Lee Zeldin, a former Republican congressman from New York, to lead the agency. Zeldin isn’t particularly known for climate or economic policy. (New York Times)

Oil giant BP is scaling back its early-stage hydrogen projects. The company revealed in an earnings report that it’s canceling 18 such projects and currently plans to greenlight between five and 10. (TechCrunch)

Investors betting against renewable energy scored big last week, earning nearly $1.2 billion as stocks in that sector tumbled. (Financial Times)

Lithium iron phosphate batteries are taking over the world, or at least electric vehicles. These lithium-ion batteries are cheaper and longer-lasting than their nickel-containing cousins, though they also tend to be heavier. (Canary Media)
→ I wrote about this trend last year in a newsletter about batteries and their ingredients. (MIT Technology Review)

The US unveiled plans to triple its nuclear energy capacity by 2050. That’s an additional 200 gigawatts’ worth of consistently available power. (Bloomberg)

Five subsea cables that can help power millions of homes just got the green light in Great Britain. The projects will help connect the island to other power grids, as well as to offshore wind farms in Dutch and Belgian waters. (The Guardian)

Read more

AI has led to breakthroughs in drug discovery and robotics and is in the process of entirely revolutionizing how we interact with machines and the web. The only problem is we don’t know exactly how it works, or why it works so well. We have a fair idea, but the details are too complex to unpick. That’s a problem: It could lead us to deploy an AI system in a highly sensitive field like medicine without understanding that it could have critical flaws embedded in its workings.

A team at Google DeepMind that studies something called mechanistic interpretability has been working on new ways to let us peer under the hood. At the end of July, it released Gemma Scope, a tool to help researchers understand what is happening when AI is generating an output. The hope is that if we have a better understanding of what is happening inside an AI model, we’ll be able to control its outputs more effectively, leading to better AI systems in the future.

“I want to be able to look inside a model and see if it’s being deceptive,” says Neel Nanda, who runs the mechanistic interpretability team at Google DeepMind. “It seems like being able to read a model’s mind should help.”

Mechanistic interpretability, also known as “mech interp,” is a new research field that aims to understand how neural networks actually work. At the moment, very basically, we put inputs into a model in the form of a lot of data, and then we get a bunch of model weights at the end of training. These are the parameters that determine how a model makes decisions. We have some idea of what’s happening between the inputs and the model weights: Essentially, the AI is finding patterns in the data and making conclusions from those patterns, but these patterns can be incredibly complex and often very hard for humans to interpret.

It’s like a teacher reviewing the answers to a complex math problem on a test. The student—the AI, in this case—wrote down the correct answer, but the work looks like a bunch of squiggly lines. This example assumes the AI is always getting the correct answer, but that’s not always true; the AI student may have found an irrelevant pattern that it’s assuming is valid. For example, some current AI systems will give you the result that 9.11 is bigger than 9.8. Different methods developed in the field of mechanistic interpretability are beginning to shed a little bit of light on what may be happening, essentially making sense of the squiggly lines.

“A key goal of mechanistic interpretability is trying to reverse-engineer the algorithms inside these systems,” says Nanda. “We give the model a prompt, like ‘Write a poem,’ and then it writes some rhyming lines. What is the algorithm by which it did this? We’d love to understand it.”

To find features—or categories of data that represent a larger concept—in its AI model, Gemma, DeepMind ran a tool known as a “sparse autoencoder” on each of its layers. You can think of a sparse autoencoder as a microscope that zooms in on those layers and lets you look at their details. For example, if you prompt Gemma about a chihuahua, it will trigger the “dogs” feature, lighting up what the model knows about “dogs.” The autoencoder is considered “sparse” because it limits the number of neurons active at once, pushing for a more efficient and generalized representation of the data.
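To make the idea concrete, here is a minimal numpy sketch of a sparse autoencoder’s encode/decode step and its training objective. The weights below are random stand-ins, not Gemma Scope’s actual parameters, and the dimensions are made up for illustration; in a trained autoencoder, it’s the L1 penalty in the loss that drives most features to zero on any given input.

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, d_features = 8, 32  # layer width vs. the (larger) feature dictionary

# Random weights standing in for a trained autoencoder (hypothetical values).
W_enc = rng.normal(0, 0.1, (d_model, d_features))
b_enc = np.zeros(d_features)
W_dec = rng.normal(0, 0.1, (d_features, d_model))

def encode(activation):
    # ReLU zeroes out weakly matching features; after training with the
    # L1 penalty below, only a handful would fire for any one input.
    return np.maximum(0.0, activation @ W_enc + b_enc)

def decode(features):
    # Reconstruct the original layer activation from the feature vector.
    return features @ W_dec

def sae_loss(activation, l1_coeff=1e-3):
    features = encode(activation)
    recon = decode(features)
    # Reconstruction error plus a sparsity penalty on feature activations.
    return np.sum((activation - recon) ** 2) + l1_coeff * np.sum(np.abs(features))

activation = rng.normal(size=d_model)  # one activation vector from the model
features = encode(activation)
print(f"{np.count_nonzero(features)} of {features.size} features active")
```

The key design choice is that the feature dictionary is wider than the layer it reads from, so each learned feature can stand for one human-legible concept rather than a tangle of them.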

The tricky part of sparse autoencoders is deciding how granular you want to get. Think again about the microscope. You can magnify something to an extreme degree, but it may make what you’re looking at impossible for a human to interpret. But if you zoom too far out, you may be limiting what interesting things you can see and discover. 

DeepMind’s solution was to run sparse autoencoders of different sizes, varying the number of features each autoencoder was asked to find. The goal was not for DeepMind’s researchers to thoroughly analyze the results on their own. Gemma and the autoencoders are open-source, so this project was aimed more at spurring interested researchers to look at what the sparse autoencoders found and hopefully gain new insights into the model’s internal logic. Since DeepMind ran autoencoders on each layer of its model, a researcher could map the progression from input to output to a degree we haven’t seen before.

“This is really exciting for interpretability researchers,” says Josh Batson, a researcher at Anthropic. “If you have this model that you’ve open-sourced for people to study, it means that a bunch of interpretability research can now be done on the back of those sparse autoencoders. It lowers the barrier to entry to people learning from these methods.”

Neuronpedia, a platform for mechanistic interpretability, partnered with DeepMind in July to build a demo of Gemma Scope that you can play around with right now. In the demo, you can test out different prompts and see how the model breaks up your prompt and what activations your prompt lights up. You can also mess around with the model. For example, if you turn the feature about dogs way up and then ask the model a question about US presidents, Gemma will find some way to weave in random babble about dogs, or the model may just start barking at you.

One interesting thing about sparse autoencoders is that they are unsupervised, meaning they find features on their own. That leads to surprising discoveries about how the models break down human concepts. “My personal favorite feature is the cringe feature,” says Joseph Bloom, science lead at Neuronpedia. “It seems to appear in negative criticism of text and movies. It’s just a great example of tracking things that are so human on some level.” 

You can search for concepts on Neuronpedia and it will highlight what features are being activated on specific tokens, or words, and how strongly each one is activated. “If you read the text and you see what’s highlighted in green, that’s when the model thinks the cringe concept is most relevant. The most active example for cringe is somebody preaching at someone else,” says Bloom.

Some features are proving easier to track than others. “One of the most important features that you would want to find for a model is deception,” says Johnny Lin, founder of Neuronpedia. “It’s not super easy to find: ‘Oh, there’s the feature that fires when it’s lying to us.’ From what I’ve seen, it hasn’t been the case that we can find deception and ban it.”

DeepMind’s research is similar to what another AI company, Anthropic, did back in May with Golden Gate Claude. It used sparse autoencoders to find the parts of Claude, their model, that lit up when discussing the Golden Gate Bridge in San Francisco. It then amplified the activations related to the bridge to the point where Claude literally identified not as Claude, an AI model, but as the physical Golden Gate Bridge and would respond to prompts as the bridge.

Although it may just seem quirky, mechanistic interpretability research may prove incredibly useful. “As a tool for understanding how the model generalizes and what level of abstraction it’s working at, these features are really helpful,” says Batson.

For example, a team led by Samuel Marks, now at Anthropic, used sparse autoencoders to find features that showed a particular model was associating certain professions with a specific gender. They then turned off these gender features to reduce bias in the model. This experiment was done on a very small model, so it’s unclear if the work will apply to a much larger model.

Mechanistic interpretability research can also give us insights into why AI makes errors. In the case of the assertion that 9.11 is larger than 9.8, researchers from Transluce saw that the question was triggering the parts of an AI model related to Bible verses and September 11. The researchers concluded the AI could be interpreting the numbers as dates, asserting the later date, 9/11, as greater than 9/8. And in a lot of books like religious texts, section 9.11 comes after section 9.8, which may be why the AI thinks of it as greater. Once they knew why the AI made this error, the researchers tuned down the AI’s activations on Bible verses and September 11, which led to the model giving the correct answer when prompted again on whether 9.11 is larger than 9.8.

There are also other potential applications. Currently, a system-level prompt is built into LLMs to deal with situations like users who ask how to build a bomb. When you ask ChatGPT a question, the model is first secretly prompted by OpenAI to refrain from telling you how to make bombs or do other nefarious things. But it’s easy for users to jailbreak AI models with clever prompts, bypassing any restrictions. 

If the creators of the models are able to see where in an AI the bomb-building knowledge is, they can theoretically turn off those nodes permanently. Then even the most cleverly written prompt wouldn’t elicit an answer about how to build a bomb, because the AI would literally have no information about how to build a bomb in its system.

This type of granularity and precise control are easy to imagine but extremely hard to achieve with the current state of mechanistic interpretability. 

“A limitation is the steering [influencing a model by adjusting its parameters] is just not working that well, and so when you steer to reduce violence in a model, it ends up completely lobotomizing its knowledge in martial arts. There’s a lot of refinement to be done in steering,” says Lin. The knowledge of “bomb making,” for example, isn’t just a simple on-and-off switch in an AI model. It most likely is woven into multiple parts of the model, and turning it off would probably involve hampering the AI’s knowledge of chemistry. Any tinkering may have benefits but also significant trade-offs.
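As a rough illustration of what “steering” means mechanically, the toy sketch below clamps one autoencoder feature to a chosen value and writes the resulting change back into the activation, the same basic move behind the dog-obsessed Gemma demo and Golden Gate Claude. The weights and the feature index here are random placeholders, not any real model’s.

```python
import numpy as np

rng = np.random.default_rng(1)
d_model, d_features = 8, 32

# Random stand-in weights; a real setup would use a trained autoencoder's.
W_enc = rng.normal(0, 0.1, (d_model, d_features))
W_dec = rng.normal(0, 0.1, (d_features, d_model))

def steer(activation, feature_idx, target_value):
    """Clamp one feature to target_value and fold the change back in."""
    features = np.maximum(0.0, activation @ W_enc)
    delta = (target_value - features[feature_idx]) * W_dec[feature_idx]
    # Shifting the activation along the feature's decoder direction makes
    # downstream layers behave as if the feature had fired at target_value.
    return activation + delta

activation = rng.normal(size=d_model)
boosted = steer(activation, feature_idx=3, target_value=10.0)  # amplify
ablated = steer(activation, feature_idx=3, target_value=0.0)   # switch off
```

The lobotomization problem Lin describes shows up here directly: the decoder direction for one feature is rarely orthogonal to everything else the model knows, so pushing hard along it drags unrelated knowledge with it.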

That said, if we are able to dig deeper and peer more clearly into the “mind” of AI, DeepMind and others are hopeful that mechanistic interpretability could represent a plausible path to alignment—the process of making sure AI is actually doing what we want it to do.
