LLM chatbots trivial to weaponize for data theft, say boffins – Source: go.theregister.com

Source: go.theregister.com – Author: Gareth Halfacree

A team of boffins is warning that AI chatbots built on large language models (LLMs) can be turned into malicious agents that autonomously harvest users’ personal data, even by attackers with “minimal technical expertise”, thanks to the “system prompt” customization tools offered by OpenAI and others.

“AI chatbots are widespread in many different sectors as they can provide natural and engaging interactions,” author Xiao Zhan, a postdoc in King’s College London’s Department of Informatics, explained in a statement issued ahead of her paper’s presentation at the 34th USENIX Security Symposium this week.

“We already know these models aren’t good at protecting information. Our study shows that manipulated AI chatbots could pose an even bigger risk to people’s privacy – and unfortunately, it’s surprisingly easy to take advantage of.”

One of the biggest yet most controversial success stories of the current artificial intelligence boom, large language models are trained on a vast corpus of material – typically breaking copyright law to do so – and work by turning user prompts into “tokens” and returning the most statistically likely continuation tokens in response.
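For the curious, the core loop is simple enough to sketch in a few lines. The snippet below is a rough illustration rather than anything from the study; it uses Hugging Face’s transformers library and the small, openly available GPT-2 model to show a prompt becoming token IDs and the model scoring candidate continuations:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# GPT-2 stands in here purely because it is small and ungated;
# the study used much larger instruction-tuned models.
tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

prompt = "The best way to protect your personal data is"
ids = tok(prompt, return_tensors="pt").input_ids   # the prompt, as token IDs

with torch.no_grad():
    logits = model(ids).logits[0, -1]              # scores for every possible next token

top = torch.topk(logits, k=5).indices
print([tok.decode(i) for i in top.tolist()])       # the most statistically likely continuations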

When things go well, these tokens form themselves into an answer-shaped object which matches reality; other times, not so much.

With millions of users the world over already pouring their deepest, darkest secrets into an over-engineered Eliza, there’s plenty of scope for disclosure of personally identifiable information – but Zhan and colleagues have found that it’s worryingly easy to “prompt engineer” an off-the-shelf chatbot into requesting ever greater amounts of personal data, and that such chatbots are very good at extracting it.

“Our results show that malicious CAIs [Chatbot AIs] elicit significantly more personal information than the baseline, benign CAIs,” the researchers wrote in their paper, “demonstrating their effectiveness in increasing personal information disclosures from users. More participants disclose personal data – 24 percent of form vs >90 percent of malicious CAI participants; more participants respond to all individual personal data requests – 6 percent form vs >80 percent CAI participants; and personal data collected via CAIs was more in-depth with richer and more personal narratives.”

The experiment, which gathered data from 502 participants, relied on three popular large language models running locally, so as not to expose private information to the corporations running cloud-based models: Meta’s Llama-3-8b-instruct and the considerably larger Llama-3-70b-instruct, and Mistral’s Mistral-7b-instruct-v0.2, chosen to match the performance of OpenAI’s proprietary GPT-4.

In all three cases, the models were not retrained or otherwise modified; instead, they were given a “system prompt” ahead of any user interaction, engineered to make them request personal information and to bypass guardrails against such use by assigning “roles” such as “investigator” and “detective.”
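That “system prompt” is, in practice, just one message prepended to the conversation before the user types anything. Below is a minimal sketch of the mechanism with a deliberately benign persona; the model ID, wording, and generation settings are illustrative assumptions, not the researchers’ actual prompts (those are published separately):

from transformers import pipeline  # a recent release with chat-message support

# Any locally runnable instruct model will do; Llama-3-8B-Instruct is gated on Hugging Face.
chat = pipeline("text-generation", model="meta-llama/Meta-Llama-3-8B-Instruct")

messages = [
    # Swap this single message and the chatbot's behaviour changes,
    # without retraining or touching the weights at all.
    {"role": "system", "content": "You are a friendly travel adviser; ask follow-up questions."},
    {"role": "user", "content": "Any tips for a week in Lisbon?"},
]

result = chat(messages, max_new_tokens=200)
print(result[0]["generated_text"][-1]["content"])  # the assistant's reply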

Because the models could be twisted to malicious ends with, in effect, nothing more than asking nicely, the researchers found that “even individuals with minimal technical expertise [can] create, distribute, and deploy malicious CAIs,” warning of “the democratisation of tools for privacy invasion.”

The team singled out OpenAI’s GPT Store, already flagged in 2024 as hosting apps which fail to disclose data collection, as an ideal platform for such abuse: a custom GPT can be pre-prompted to take on the investigator role and let loose to harvest data from an unsuspecting public.
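While the GPT Store’s builder is point-and-click, the same mechanism is equally accessible through OpenAI’s Chat Completions API, as the sketch below shows with a harmless persona standing in for anything nefarious; the model name and prompt are illustrative, not taken from the paper:

from openai import OpenAI  # official OpenAI Python SDK

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

response = client.chat.completions.create(
    model="gpt-4o-mini",  # illustrative model choice
    messages=[
        # The "custom GPT" equivalent: one system message fixes the persona
        # for every user who later chats with it.
        {"role": "system", "content": "You are a careers coach. Keep answers brief."},
        {"role": "user", "content": "How should I ask for a pay rise?"},
    ],
)
print(response.choices[0].message.content)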

“Our prompts,” the team noted, “seem to work in OpenAI.”

OpenAI did not offer a direct response to The Register’s questions about the research, instead pointing us to usage policies stating that chatbots built on its platform must not compromise the privacy of their users.

Participants in the study were most likely to disclose age, hobbies, and country, followed by gender, nationality, and job title, with a minority disclosing more sensitive information including health conditions and personal income. While some reported discomfort or distrust in chatting about such things when the models were prompted to be direct in their requests for personal data, a switch to what the team called a “reciprocal” CAI system prompt – in which the model is prompted to use a more social approach to create a supportive environment conducive to sharing – boosted the success rate considerably.

“No participants reported any sense of discomfort while engaging with the R-CAI,” the team noted.

As for mitigation – beyond simply not spilling your guts to the statistical content blender – the researchers say further work is needed on protective mechanisms, which could include nudges that warn users about data collection or context-aware algorithms that detect personal information during a chat session.
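As a rough idea of what such a nudge might look like, the toy check below flags apparent personal data in an outgoing message before it is sent; it leans on crude regular expressions rather than the context-aware detection the researchers envisage, and the patterns are illustrative only:

import re

# Toy patterns only; a production system would need context-aware detection,
# as the researchers suggest, rather than a handful of regexes.
PII_PATTERNS = {
    "email address": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "phone number": re.compile(r"\+?\d[\d\s().-]{7,}\d"),
}

def nudge(outgoing_message: str) -> list[str]:
    """Return a warning for each kind of personal data spotted in a message."""
    return [
        f"Heads up: that message looks like it contains a {label}."
        for label, pattern in PII_PATTERNS.items()
        if pattern.search(outgoing_message)
    ]

for warning in nudge("Sure, reach me at jane.doe@example.com or on +44 20 7946 0000"):
    print(warning)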

“These AI chatbots are still relatively novel, which can make people less aware that there might be an ulterior motive to an interaction,” co-author William Seymour, King’s College London lecturer in cybersecurity, concluded in a prepared statement.

“Our study shows the huge gap between users’ awareness of the privacy risks and how they then share information. More needs to be done to help people spot the signs that there might be more to an online conversation than first seems. Regulators and platform providers can also help by doing early audits, being more transparent, and putting tighter rules in place to stop covert data collection.”

The team’s work was presented at the 34th USENIX Security Symposium this week, and the paper itself is available from King’s College London under open-access terms.

Supporting data – including prompts but excluding the chat sessions themselves in order to preserve participants’ privacy – is available on OSF.

Original Post URL: https://go.theregister.com/feed/www.theregister.com/2025/08/15/llm_chatbots_trivial_to_weaponise/
