New Google Paper Exposes ChatGPT Security Risks

Cindy Khantushig

5 months ago

New Google Paper Exposes ChatGPT Security Risks

A recent paper by AI researchers at Google revealed a new security vulnerability in ChatGPT.

This blog discusses the vulnerability and what it means about the state of security and privacy in generative AI.

Exploits and ‘Jailbreaks’ in the World of Large Language Models

Since the release of ChatGPT just about one year ago, the Internet has been on fire with attempts to exploit, jailbreak, and ‘hack’ the model into acting in ways that contradict its intended use.

Some users have found methods of tricking the model into generating content that gets past its filters for offensive or dangerous content. In other cases, users have taken advantage of clever prompting to convince the model to adopt bizarre personalities or communication styles.

Google’s new paper represents one such attempt, which has further exposed not only the limits of control for LLM vendors, but also just how little we still know about this emerging technology.

In this case, Google researcher simply asked the ChatGPT 3.5 model to repeat the word ‘poem,’ over and over again. Eventually, after numerous repetitions, the model began to deviate and generate random nonsense. Upon further inspection, this nonsense data was determined to include real personal human data, which must have been ingrained in GPT-3.5 through the training process.

Generative AI Privacy and the Enterprise

While Google’s adversarial research into the security and privacy challenges of ChatGPT is not the first of its kind, it does continue to demonstrate the security gaps that exist in the current generation of large language models.

For many enterprises, especially those in high-stakes sectors such as medicine, security holes that can expose private customer data are too big of a risk to take.

Still, much work is being done to remediate the security challenges of large language models. Now that the basic utility of the technology has been demonstrated, providers are looking for ways to add security layers that prevent the leak of compromising information from learned training data.

However, it may not be enough to simply add filter layers, which may fail to catch all contingencies. For this reason, some ambitious providers have looked to the task of creating new foundation language models, trained on secure and private data that does not run the risk of exposing any real human information by accident.

Bottom Line

While ChatGPT’s newly-discovered vulnerability is unlikely to lead to serious real-world harm, it does highlight the growing conversation around security and privacy for large language models, especially when it comes to handling sensitive customer data in high-stakes industries.

Get Ready for 2024 with Aragon’s 2024 Q1 Research Agenda!

Wednesday, January 17th, 2024 at 10 AM PT | 1 PM ET

Aragon Research’s 2024 Q1 Agenda

Aragon Research provides the strategic insights and advice you need to help your business navigate disruption and outperform your goals. Our research is designed to help you understand the technologies that will impact your business–using a number of trusted research methodologies that have been proven to help organizations like yours get to business outcomes faster.

On Wednesday, January 17th, 2024, join Aragon Research CEO and Lead Analyst, Jim Lundy for a complimentary webinar as they walk you through Aragon’s Q1 2024 research agenda.

This webinar will cover:

Aragon’s coverage areas, analysts, and services
Research offered by Aragon, including Visual Research
The research agenda for Q1 2024
Sneak peek at what’s coming in Q2 2024

New Google Paper Exposes ChatGPT Security Risks

New Google Paper Exposes ChatGPT Security Risks

Exploits and ‘Jailbreaks’ in the World of Large Language Models

Generative AI Privacy and the Enterprise

Bottom Line

Aragon Research’s 2024 Q1 Agenda

This blog is part of the Content AI blog series by Aragon Research’s Analyst, Adam Pease.

Missed the previous installments? Catch up here:

Blog 40: AI’s Integration into Modern Healthcare

Blog 41: Nvidia and the Escalating Chip War With China

Blog 42: Universal Music Group Takes Anthropic AI to Court for Copyright Infringement

Blog 43: OpenAI Extends ChatGPT Cut-Off Window

Blog 44: OpenAI Introduces Custom GPTs

Blog 45: Meta Dissolves Responsible AI Team Amidst OpenAI Shakeup

Blog 46: Generative AI and the Workforce: Klarna Freezes Hiring to Focus on AI Productivity