The National Cybersecurity Centre, an arm of the UK's intelligence agency GCHQ, has recently issued warnings about the prevalence of such attacks, stating that prompt injection may be an inherent issue with LLM technology, for which no fool proof mitigations currently exist. These attacks entail the manipulation of user inputs or prompts to induce unintended responses from the underlying language models powering chatbots.
Chatbots, operating on artificial intelligence (AI), have become integral components of various online services, including online banking and shopping platforms. These digital assistants simulate human-like interactions by generating responses to user inputs, leveraging extensive datasets for training purposes.
The NCSC emphasises that the vulnerability of chatbots lies in their capacity to transmit data to third-party applications and services. Malicious bad actors using prompt injection techniques can lead to unintended actions by chatbots, ranging from generating offensive content to disclosing confidential information.
In February of this year, security researchers conducted an experiment that exposed the vulnerability of LLMs to manipulation. One chatbot was manipulated to impersonate a scammer, with the chatbot unknowingly soliciting sensitive bank account details from users. This experiment highlighted a new type of threat: indirect prompt injection attacks.
Prompt injection attacks have the potential to result in severe real-world consequences if not addressed adequately. These consequences encompass attacks, scams, and data theft – all of which can adversely affect individuals, organisations, and society at large. As the adoption of LLMs continues to grow, the risks associated with malicious prompt injection are poised to escalate.
Illustrative examples underscore the gravity of prompt injection attacks resulting in unauthorised access to prompts resulting in serious privacy and security concerns. Another example resulted in an AI powered chatbot being manipulated to respond to new prompts not initially requested. By injecting prompts the researchers uncovered vulnerabilities that could lead to indirect prompt injection attacks, potentially compromising sensitive data.
From a legal standpoint, the implications of prompt injection attacks are multifaceted. Affected parties may seek recourse under various legal frameworks, including data protection regulations, consumer protection laws, fraud, intellectual property infringement and contract law. Potential legal consequences for entities responsible for chatbot deployment include regulatory fines, liability for data breaches, and reputational damage amongst some of the other legal risks.
The NCSC advocates for a proactive approach to cybersecurity, emphasising that prompt injection and data poisoning attacks can be challenging to detect and mitigate. To safeguard against catastrophic failures resulting from malicious prompt injection, include for example a cybersecurity strategy to implement rules-based systems to prevent chatbots from executing damaging actions, even when prompted to do so.
Leading technology companies, have already embarked on initiatives to tackle the indirect prompt injection threat, such as specially trained models to identify malicious inputs and outputs that violate established social media policies, and another is introducing guardrails aimed at adding restrictions to models.
Prompt injection attacks on chatbots pose significant cybersecurity risks with potential real-world consequences. As the use of LLMs and chatbots continues to expand, understanding and addressing these vulnerabilities is imperative. By taking a proactive stance, designing for security, and remaining vigilant in the face of emerging threats, entities can mitigate the legal and operational risks associated with prompt injection attacks and ensure the integrity and reliability of their chatbot systems.
If you wish to seek further advice on any of the issues discussed please contact Olivia O'Kane.