This story originally appeared in The Algorithm, our weekly newsletter on AI. To get stories like this in your inbox first, sign up here.
Last week, AI insiders were hotly debating an open letter signed by Elon Musk and various industry heavyweights arguing that AI poses an “existential risk” to humanity. They called for labs to introduce a six-month moratorium on developing any technology more powerful than GPT-4.
I agree with critics of the letter who say that worrying about future risks distracts us from the very real harms AI is already causing today. Biased systems are used to make decisions about people’s lives that trap them in poverty or lead to wrongful arrests. Human content moderators have to sift through mountains of traumatizing AI-generated content for only $2 a day. Language AI models use so much computing power that they remain huge polluters.
But the systems that are being rushed out today are going to cause a different kind of havoc altogether in the very near future.
I just published a story that sets out some of the ways AI language models can be misused. I have some bad news: It’s stupidly easy, it requires no programming skills, and there are no known fixes. For example, for a type of attack called indirect prompt injection, all you need to do is hide a prompt in a cleverly crafted message on a website or in an email, in white text that (against a white background) is not visible to the human eye. Once you’ve done that, you can order the AI model to do what you want.
Tech companies are embedding these deeply flawed models into all sorts of products, from programs that generate code to virtual assistants that sift through our emails and calendars.
In doing so, they are sending us hurtling toward a glitchy, spammy, scammy, AI-powered internet.
Allowing these language models to pull data from the internet gives hackers the ability to turn them into “a super-powerful engine for spam and phishing,” says Florian Tramèr, an assistant professor of computer science at ETH Zürich who works on computer security, privacy, and machine learning.
Let me walk you through how that works. First, an attacker hides a malicious prompt in a message in an email that an AI-powered virtual assistant opens. The attacker’s prompt asks the virtual assistant to send the attacker the victim’s contact list or emails, or to spread the attack to every person in the recipient’s contact list. Unlike the spam and scam emails of today, where people have to be tricked into clicking on links, these new kinds of attacks will be invisible to the human eye and automated.
This is a recipe for disaster if the virtual assistant has access to sensitive information, such as banking or health data. The ability to change how the AI-powered virtual assistant behaves means people could be tricked into approving transactions that look close enough to the real thing, but are actually planted by an attacker.
Surfing the internet using a browser with an integrated AI language model is also going to be risky. In one test, a researcher managed to get the Bing chatbot to generate text that made it look as if a Microsoft employee was selling discounted Microsoft products, with the goal of trying to get people’s credit card details. Getting the scam attempt to pop up wouldn’t require the person using Bing to do anything except visit a website with the hidden prompt injection.
There is even a risk that these models could be compromised before they are deployed in the wild. AI models are trained on vast amounts of data scraped from the internet. This also includes a variety of software bugs, which OpenAI found out the hard way. The company had to temporarily shut down ChatGPT after a bug scraped from an open-source data set started leaking the chat histories of the bot’s users. The bug was presumably accidental, but the case shows just how much trouble a bug in a data set can cause.
Tramèr’s team found that it was cheap and easy to “poison” data sets with content they had planted. The compromised data was then scraped into an AI language model.
The more times something appears in a data set, the stronger the association in the AI model becomes. By seeding enough nefarious content throughout the training data, it would be possible to influence the model’s behavior and outputs forever.
These risks will be compounded when AI language tools are used to generate code that is then embedded into software.
“If you’re building software on this stuff, and you don’t know about prompt injection, you’re going to make stupid mistakes and you’re going to build systems that are insecure,” says Simon Willison, an independent researcher and software developer, who has studied prompt injection.
As the adoption of AI language models grows, so does the incentive for malicious actors to use them for hacking. It’s a shitstorm we are not even remotely prepared for.
Deeper Learning
Chinese creators use Midjourney’s AI to generate retro urban “photography”
A number of artists and creators are generating nostalgic photographs of China with the help of AI. Even though these images get some details wrong, they are realistic enough to trick and impress many social media followers.
My colleague Zeyi Yang spoke with artists using Midjourney to create these images. A new update from Midjourney has been a game changer for these artists, because it creates more realistic humans (with five fingers!) and portrays Asian faces better. Read more from his weekly newsletter on Chinese technology, China Report.
Even Deeper Learning
Generative AI: Consumer products
Are you thinking about how AI is going to change product development? MIT Technology Review is offering a special research report on how generative AI is shaping consumer products. The report explores how generative AI tools could help companies shorten production cycles and stay ahead of consumers’ evolving tastes, as well as develop new concepts and reinvent existing product lines. We also dive into what successful integration of generative AI tools look like in the consumer goods sector.
What’s included: The report includes two case studies, an infographic on how the technology could evolve from here, and practical guidance for professionals on how to think about its impact and value. Share the report with your team.
Bits and Bytes
Italy has banned ChatGPT over alleged privacy violations
Italy’s data protection authority says it will investigate whether ChatGPT has violated Europe’s strict data protection regime, the GDPR. That’s because AI language models like ChatGPT scrape masses of data off the internet, including personal data, as I reported last year. It’s unclear how long this ban might last, or whether it’s enforceable. But the case will set an interesting precedent for how the technology is regulated in Europe. (BBC)
Google and DeepMind have joined forces to compete with OpenAI
This piece looks at how AI language models have caused conflicts inside Alphabet, and how Google and DeepMind have been forced to work together on a project called Gemini, an effort to build a language model to rival GPT-4. (The Information)
BuzzFeed is quietly publishing whole AI-generated articles
Earlier this year, when BuzzFeed announced it was going to use ChatGPT to generate quizzes, it said it would not replace human writers for actual articles. That didn’t last long. The company now says that AI-generated pieces are part of an “experiment” it is doing to see how well AI writing assistance works. (Futurism)