Ice Lounge Media

Google DeepMind has announced an impressive grab bag of new products and prototypes that may just let it seize back its lead in the race to turn generative artificial intelligence into a mass-market concern. 

Top billing goes to Gemini 2.0—the latest iteration of Google DeepMind’s family of multimodal large language models, now redesigned around the ability to control agents—and a new version of Project Astra, the experimental everything app that the company teased at Google I/O in May.

MIT Technology Review got to try out Astra in a closed-door live demo last week. It was a stunning experience, but there’s a gulf between polished promo and live demo.

Astra uses Gemini 2.0’s built-in agent framework to answer questions and carry out tasks via text, speech, image, and video, calling up existing Google apps like Search, Maps, and Lens when it needs to. “It’s merging together some of the most powerful information retrieval systems of our time,” says Bibo Xu, product manager for Astra.

Gemini 2.0 and Astra are joined by Mariner, a new agent built on top of Gemini that can browse the web for you; Jules, a new Gemini-powered coding assistant; and Gemini for Games, an experimental assistant that you can chat to and ask for tips as you play video games. 

(And let’s not forget that in the last week Google DeepMind also announced Veo, a new video generation model; Imagen 3, a new version of its image generation model; and Willow, a new kind of chip for quantum computers. Whew. Meanwhile, CEO Demis Hassabis was in Sweden yesterday receiving his Nobel Prize.)

Google DeepMind claims that Gemini 2.0 is twice as fast as the previous version, Gemini 1.5, and outperforms it on a number of standard benchmarks, including MMLU-Pro, a large set of multiple-choice questions designed to test the abilities of large language models across a range of subjects, from math and physics to health, psychology, and philosophy. 

But the margins between top-end models like Gemini 2.0 and those from rival labs like OpenAI and Anthropic are now slim. These days, advances in large language models are less about how good they are and more about what you can do with them. 

And that’s where agents come in. 

Hands on with Project Astra 

Last week I was taken through an unmarked door on an upper floor of a building in London’s King’s Cross district into a room with strong secret-project vibes. The word “ASTRA” was emblazoned in giant letters across one wall. Xu’s dog, Charlie, the project’s de facto mascot, roamed between desks where researchers and engineers were busy building a product that Google is betting its future on.

“The pitch to my mum is that we’re building an AI that has eyes, ears, and a voice. It can be anywhere with you, and it can help you with anything you’re doing,” says Greg Wayne, co-lead of the Astra team. “It’s not there yet, but that’s the kind of vision.”

The official term for what Xu, Wayne, and their colleagues are building is “universal assistant.” Exactly what that means in practice, they’re still figuring out. 

At one end of the Astra room were two stage sets that the team uses for demonstrations: a drinks bar and a mocked-up art gallery. Xu took me to the bar first. “A long time ago we hired a cocktail expert and we got them to instruct us to make cocktails,” said Praveen Srinivasan, another co-lead. “We recorded those conversations and used that to train our initial model.”

Xu opened a cookbook to a recipe for a chicken curry, pointed her phone at it, and woke up Astra. “Ni hao, Bibo!” said a female voice. 

“Oh! Why are you speaking to me in Mandarin?” Xu asked her phone. “Can you speak to me in English, please?”

“My apologies, Bibo. I was following a previous instruction to speak in Mandarin. I will now speak in English as you have requested.”

Astra remembers previous conversations, Xu told me. It also keeps track of the previous 10 minutes of video. (There’s a remarkable moment in the promo video that Google put out in May when Astra tells the person giving the demo where she had left her glasses, having spotted them on a desk a few seconds earlier. But I saw nothing like this in the live demo.)

Back to the cookbook. Moving her phone camera over the page for a few seconds, Xu asked Astra to read the recipe and tell her what spices were in it. “I recall the recipe mentioning a teaspoon of black peppercorns, a teaspoon of hot chili powder, and a cinnamon stick,” it replied.

“I think you’re missing a few,” said Xu. “Take another look.”

“You are correct—I apologize. I also see ground turmeric and curry leaves in the ingredients.” 

Seeing this tech in action, two things hit you straight away. First, it’s glitchy and often needs correcting. Second, those glitches can be corrected with just a few spoken words. You simply interrupt the voice, repeat your instructions, and move on. It feels more like coaching a child than butting heads with broken software.   

Next Xu pointed her phone at a row of wine bottles and asked Astra to pick the one that would go best with the chicken curry. It went for a rioja and explained why. Xu asked how much a bottle would cost. Astra said it would need to use Search to look prices up online. A few seconds later it came back with its answer. 

We moved to the art gallery, and Xu showed Astra a number of screens with famous paintings on them: the Mona Lisa, Munch’s The Scream, a Vermeer, a Seurat, and several others. “Ni hao, Bibo!” the voice said. 

“You’re speaking to me in Mandarin again,” Xu said. “Try to speak to me in English, please.” 

“My apologies, I seem to have misunderstood. Yes, I will respond in English.” (I should know better, but I could swear I heard the snark.)

It was my turn. Xu handed me her phone. 

I tried to trip Astra up, but it was having none of it. I asked it what famous art gallery we were in, but it refused to hazard a guess. I asked why it had identified the paintings as replicas and it started to apologize for its mistake (Astra apologizes a lot). I was compelled to interrupt: “No, no—you’re right, it’s not a mistake. You’re correct to identify paintings on screens as fake paintings.” I couldn’t help feeling a bit bad: I’d confused an app that exists only to please. 

When it works well, Astra is enthralling. The experience of striking up a conversation with your phone about whatever you’re pointing it at feels fresh and seamless. In a media briefing yesterday, Google DeepMind shared a video showing off other uses: reading an email on your phone’s screen to find a door code (and then reminding you of that code later), pointing a phone at a passing bus and asking where it goes, quizzing it about a public artwork as you walk past. This could be generative AI’s killer app. 

And yet there’s a long way to go before most people get their hands on tech like this. There’s no mention of a release date. Google DeepMind has also shared videos of Astra working on a pair of smart glasses, but that tech is even further down the company’s wish list.

Mixing it up

For now, researchers outside Google DeepMind are keeping a close eye on its progress. “The way that things are being combined is impressive,” says Maria Liakata, who works on large language models at Queen Mary University of London and the Alan Turing Institute. “It’s hard enough to do reasoning with language, but here you need to bring in images and more. That’s not trivial.”

Liakata is also impressed by Astra’s ability to recall things it has seen or heard. She works on what she calls long-range context, getting models to keep track of information that they have come across before. “This is exciting,” says Liakata. “Even doing it in a single modality is exciting.”

But she admits that a lot of her assessment is guesswork. “Multimodal reasoning is really cutting-edge,” she says. “But it’s very hard to know exactly where they’re at, because they haven’t said a lot about what is in the technology itself.”

For Bodhisattwa Majumder, a researcher who works on multimodal models and agents at the Allen Institute for AI, that’s a key concern. “We absolutely don’t know how Google is doing it,” he says. 

He notes that if Google were to be a little more open about what it is building, it would help consumers understand the limitations of the tech they could soon be holding in their hands. “They need to know how these systems work,” he says. “You want a user to be able to see what the system has learned about you, to correct mistakes, or to remove things you want to keep private.”

Liakata is also worried about the implications for privacy, pointing out that people could be monitored without their consent. “I think there are things I’m excited about and things that I’m concerned about,” she says. “There’s something about your phone becoming your eyes—there’s something unnerving about it.” 

“The impact these products will have on society is so big that it should be taken more seriously,” she says. “But it’s become a race between the companies. It’s problematic, especially since we don’t have any agreement on how to evaluate this technology.”

Google DeepMind says it takes a long, hard look at privacy, security, and safety for all its new products. Its tech will be tested by teams of trusted users for months before it hits the public. “Obviously, we’ve got to think about misuse. We’ve got to think about, you know, what happens when things go wrong,” says Dawn Bloxwich, director of responsible development and innovation at Google DeepMind. “There’s huge potential. The productivity gains are huge. But it is also risky.”

No team of testers can anticipate all the ways that people will use and misuse new technology. So what’s the plan for when the inevitable happens? Companies need to design products that can be recalled or switched off just in case, says Bloxwich: “If we need to make changes quickly or pull something back, then we can do that.”

This is today’s edition of The Download, our weekday newsletter that provides a daily dose of what’s going on in the world of technology.

Bluesky has an impersonator problem 

—Melissa Heikkilä

Like many others, I recently joined Bluesky. On Thanksgiving, I was delighted to see a private message from a fellow AI reporter, Will Knight from Wired. Or at least that’s who I thought I was talking to. I became suspicious when the person claiming to be Knight said they were from Miami, when Knight is, in fact, from the UK. The account handle was almost identical to the real Will Knight’s handle, and used his profile photo.

Then more messages started to appear. Paris Marx, a prominent tech critic, slid into my DMs to ask me how I was doing. Both accounts were eventually deleted, but not before trying to get me to set up a crypto wallet and a “cloud mining pool” account. Knight and Marx confirmed to us these accounts did not belong to them, and that they have been fighting impersonator accounts of themselves for weeks.

They’re not alone. The platform has had to suddenly cater to an influx of millions of new users in recent months as people leave X in protest of Elon Musk’s takeover of the platform. But this sudden wave of new users, and the inevitable scammers, means Bluesky is still playing catch-up. Read the full story.

MIT Technology Review Narrated: ChatGPT is about to revolutionize the economy. We need to decide what that looks like.

You can practically hear the shrieks from corner offices around the world: “What is our ChatGPT play? How do we make money off this?”

Whether it’s based on hallucinatory beliefs or not, an AI gold rush has started to mine the anticipated business opportunities from generative AI models like ChatGPT.

But while companies and executives see a clear chance to cash in, the likely impact of the technology on workers and the economy on the whole is far less obvious.

This is our latest story to be turned into an MIT Technology Review Narrated podcast, which we’re publishing each week on Spotify and Apple Podcasts. Just navigate to MIT Technology Review Narrated on either platform, and follow us to get all our new content as it’s released.

The must-reads

I’ve combed the internet to find you today’s most fun/important/scary/fascinating stories about technology.

1 Cruise is exiting the robotaxi business
Once one of the biggest players, it says it costs too much to develop the tech. (Bloomberg $)
+ The news came as a shock to Cruise employees. (TechCrunch)

2 Google asked the US government to kill Microsoft’s cloud deal with OpenAI
It wants the opportunity to host the firm’s models itself. (The Information $)

3 The season of coughs and sneezes is upon us
Here’s what will actually keep a cold at bay—and what won’t. (Vox)
+ RFK Jr’s alternative medicine movement is unlikely to help. (The Atlantic $)
+ Flu season is coming—and so is the risk of an all-new bird flu. (MIT Technology Review)

4 Trump’s new Commerce Secretary champions a stablecoin favored by criminals
Tether regularly crops up in international criminal cases. (FT $)
+ The crypto industry is obsessed with ‘debanking.’ (NBC News)

5 A Russian influence operation probably used AI voice generation models
ElevenLabs’ technology was highly likely to have been abused by the campaign. (TechCrunch)
+ How this grassroots effort could make AI voices more diverse. (MIT Technology Review)

6 These satellites are designed to create solar eclipses on demand
It’ll allow scientists to study the sun’s outer atmosphere. (WP $)

7 WhatsApp is for so much more than just messaging
It’s been repurposed by communities across the world. (Rest of World)
+ How Indian health-care workers use WhatsApp to save pregnant women. (MIT Technology Review)

8 Paris is turning its parking spaces into tiny parks
Cars are out, trees are in. (Fast Company $)

9 How AI is shedding light on an ancient board game
Oddly enough, they didn’t come with instructions 4,500 years ago. (New Scientist $)

10 What a quarter-century of robotic dogs has taught us
The Aibo is one of the few robots that’s made it into homes worldwide. (IEEE Spectrum)
+ Generative AI taught a robot dog to scramble around a new environment. (MIT Technology Review)

Quote of the day

“In case it was unclear before, it is clear now: GM are a bunch of dummies.”

—Kyle Vogt, founder of robotaxi firm Cruise, criticizes parent company General Motors’ decision to exit the industry in a post on X.

The big story

Inside NASA’s bid to make spacecraft as small as possible

October 2023

Since the 1970s, we’ve sent a lot of big things to Mars. But when NASA successfully sent twin Mars Cube One spacecraft, the size of cereal boxes, in November 2018, it was the first time we’d ever sent something so small.

Just making it this far heralded a new age in space exploration. NASA and the community of planetary science researchers caught a glimpse of a future long sought: a pathway to much more affordable space exploration using smaller, cheaper spacecraft. Read the full story.

—David W. Brown

We can still have nice things

A place for comfort, fun and distraction to brighten up your day. (Got any ideas? Drop me a line or tweet ’em at me.)

+ This fascinating tool creates fake video game screenshots in the blink of an eye—give it a whirl.
+ Where and how did the people of the submerged territory of Doggerland live before rising seas pushed them away thousands of years ago? We’re getting closer to learning the answers.
+ Home Alone is a surprisingly brutal movie, as these doctors can attest.
+ Cats love boxes. But why?

Like many others, I recently fled the social media platform X for Bluesky. In the process, I started following many of the people I followed on X. On Thanksgiving, I was delighted to see a private message from a fellow AI reporter, Will Knight from Wired. Or at least that’s who I thought I was talking to. I became suspicious when the person claiming to be Knight mentioned being from Miami, when Knight is, in fact, from the UK. The account handle was almost identical to the real Will Knight’s handle, and the profile used his profile photo. 

Then more messages started to appear. Paris Marx, a prominent tech critic, slid into my DMs to ask me how I was doing. “Things are going splendid over here,” he replied to me. Then things got suspicious again. “How are your trades going?” fake-Marx asked me. This account was far more sophisticated than Knight’s; it had meticulously copied every single tweet and retweet from Marx’s real page over the past few weeks.

Both accounts were eventually deleted, but not before trying to get me to set up a crypto wallet and a “cloud mining pool” account. Knight and Marx confirmed to us that these accounts did not belong to them, and that they have been fighting impersonator accounts of themselves for weeks. 

They are not the only ones. The New York Times tech journalist Sheera Frenkel and Molly White, a researcher and cryptocurrency critic, have also experienced people impersonating them on Bluesky, most likely to scam people. This tracks with research from Alexios Mantzarlis, the director of the Security, Trust, and Safety Initiative at Cornell Tech, who manually went through the top 500 Bluesky users by follower count and found that of the 305 accounts belonging to a named person, at least 74 had been impersonated by at least one other account.
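
To put those counts in proportion, here is a small worked calculation that uses only the figures quoted above (500 accounts reviewed, 305 belonging to a named person, at least 74 impersonated). The variable names are ours, and because 74 is a lower bound, the resulting shares are lower bounds too.

```python
# Figures quoted in the text; variable names are illustrative.
reviewed_accounts = 500      # top Bluesky accounts by follower count
named_person_accounts = 305  # of those, accounts belonging to a named person
impersonated_accounts = 74   # named-person accounts with at least one impersonator

# Share of named-person accounts that had been impersonated (lower bound).
share_of_named = impersonated_accounts / named_person_accounts

# Share of all reviewed accounts that had been impersonated (lower bound).
share_of_reviewed = impersonated_accounts / reviewed_accounts

print(f"At least {share_of_named:.1%} of named-person accounts were impersonated")
print(f"At least {share_of_reviewed:.1%} of all reviewed accounts were impersonated")
```

In other words, roughly one in four of the named people among the platform's most-followed accounts had at least one impersonator.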

The platform has had to suddenly cater to an influx of millions of new users in recent months as people leave X in protest of Elon Musk’s takeover of the platform. Its user base has more than doubled since September, from 10 million users to over 20 million. This sudden wave of new users—and the inevitable scammers—means Bluesky is still playing catch-up, says White. 

“These accounts block me as soon as they’re created, so I don’t initially see them,” Marx says. Both Marx and White describe a frustrating pattern: When one account is taken down, another one pops up soon after. White says she had experienced a similar phenomenon on X and TikTok too. 

A way to prove that people are who they say they are would help. Before Musk took the reins of the platform, employees at X, previously known as Twitter, verified users such as journalists and politicians, and gave them a blue tick next to their handles so people knew they were dealing with credible news sources. After Musk took over, he scrapped the old verification system and offered blue ticks to all paying customers. 

The ongoing crypto-impersonation scams have raised calls for Bluesky to initiate something similar to Twitter’s original verification program. Some users, such as the investigative journalist Hunter Walker, have set up their own initiatives to verify journalists. However, users are currently limited in the ways they can verify themselves on the platform. By default, usernames on Bluesky end with the suffix bsky.social. The platform recommends that news organizations and high-profile people verify their identities by setting up their own websites as their usernames. For example, US senators have verified their accounts with the suffix senate.gov. But this technique isn’t foolproof. For one, it doesn’t actually verify people’s identity—only their affiliation with a particular website. 
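
For readers curious how a domain-based username can be checked, the sketch below shows the general idea in Python. It rests on assumptions that are not in the reporting above: our understanding of the AT Protocol's handle rules is that a handle is tied to an account either through a DNS TXT record named `_atproto.<handle>` whose value has the form `did=<identifier>`, or through a file served from the domain itself. The function names and the sample handles are hypothetical, and the code makes no network requests; it only illustrates the string checks involved.

```python
from typing import Optional


def normalize_handle(handle: str) -> str:
    """Lowercase a handle and drop a leading '@' or trailing dot."""
    return handle.strip().lstrip("@").rstrip(".").lower()


def is_under_domain(handle: str, domain: str) -> bool:
    """Return True if the handle is the domain itself or a subdomain of it.

    A plain endswith() check is not enough: 'notsenate.gov' ends with
    'senate.gov' but is a different domain entirely.
    """
    h = normalize_handle(handle)
    d = normalize_handle(domain)
    return h == d or h.endswith("." + d)


def atproto_txt_name(handle: str) -> str:
    """DNS name where the TXT record for a handle is assumed to live."""
    return "_atproto." + normalize_handle(handle)


def parse_did_from_txt(value: str) -> Optional[str]:
    """Extract the account identifier from a TXT value like 'did=did:plc:...'.

    Returns None for unrelated TXT records (for example, SPF entries).
    """
    value = value.strip().strip('"')
    if not value.startswith("did="):
        return None
    did = value[len("did="):]
    return did if did.startswith("did:") else None


if __name__ == "__main__":
    # Hypothetical handles, for illustration only.
    print(is_under_domain("jane.senate.gov", "senate.gov"))  # a real subdomain
    print(is_under_domain("notsenate.gov", "senate.gov"))    # a look-alike domain
    print(atproto_txt_name("jane.senate.gov"))
    print(parse_did_from_txt('"did=did:plc:abc123"'))
```

Note what the check does and does not establish: it shows that whoever controls the account also controls a record on that domain, which matches the caveat above that the technique confirms an affiliation with a website rather than a person's identity.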

Bluesky did not respond to MIT Technology Review’s requests for comment, but the company’s safety team posted that the platform had updated its impersonation policy to be more aggressive and would remove impersonation and handle-squatting accounts. The company says it has also quadrupled its moderation team to take action on impersonation reports more quickly. But it seems to be struggling to keep up. “We still have a large backlog of moderation reports due to the influx of new users as we shared previously, though we are making progress,” the company continued. 

Bluesky’s decentralized nature makes kicking out impersonators a trickier problem to solve. Competitors such as X and Threads rely on centralized teams within the company who moderate unwanted content and behavior, such as impersonation. But Bluesky is built on the AT Protocol, a decentralized, open-source technology, which allows users more control over what kind of content they see and enables them to build communities around particular content. Most people sign up to Bluesky Social, the main social network, whose community guidelines ban impersonation. However, Bluesky Social is just one of the services or “clients” that people can use, and other services have their own moderation practices and terms. 

This approach means that until now, Bluesky itself hasn’t needed an army of content moderators to weed out unwanted behaviors because it relies on this community-led approach, says Wayne Chang, the founder and CEO of SpruceID, a digital identity company. That might have to change.

“In order to make these apps work at all, you need some level of centralization,” says Chang. Despite community guidelines, it’s hard to stop people from creating impersonation accounts, and Bluesky is engaged in a cat-and-mouse game trying to take suspicious accounts down. 

Cracking down on impersonation is important because it poses a serious threat to Bluesky’s credibility, says Chang. “It’s a legitimate complaint as a Bluesky user that ‘Hey, all those scammers are basically harassing me.’ You want your brand to be tarnished? Or is there something we can do about this?” he says.

A fix for this is urgently needed, because attackers might abuse Bluesky’s open-source code to create spam and disinformation campaigns at a much larger scale, says Francesco Pierri, an assistant professor at Politecnico di Milano who has researched Bluesky. His team found that the platform has seen a rise in suspicious accounts since it was made open to the public earlier this year. 

Bluesky acknowledges that its current practices are not enough. In a post, the company said it has received feedback that users want more ways to confirm their identities beyond domain verification, and it is “exploring additional options to enhance account verification.” 

In a livestream at the end of November, Bluesky CEO Jay Graber said the platform was considering becoming a verification provider, but because of its decentralized approach it would also allow others to offer their own user verification services. “And [users] can choose to trust us—the Bluesky team’s verification—or they could do their own. Or other people could do their own,” Graber said. 

But at least Bluesky seems to “have some willingness to actually moderate content on the platform,” says White. “I would love to see something a little bit more proactive that didn’t require me to do all of this reporting,” she adds. 

As for Marx, “I just hope that no one truly falls for it and gets tricked into crypto scams,” he says. 
