Outlining the risks associated with running AI systems has become a high priority for the tech industry as many companies race to integrate AI into their stacks and product offerings. The Open Worldwide Application Security Project (OWASP) released its first Top 10 for Large Language Model Applications last year, highlighting the following security vulnerabilities for LLMs.
This is part of a wider project aiming to educate developers, designers, architects, managers, and organizations about potential security risks.
OWASP Top 10 for Large Language Model Applications
Prompt injection: Prompt injection occurs when a threat actor inserts malicious instructions into the prompt of a gen AI model. The model processes the instructions within the prompt, potentially executing the malicious instruction, performing unintended or unauthorized actions. Direct prompt injections are attacks in which the malicious instructions are included in the prompt text, indirect prompt injections are those in which the AI accepts input from other sources, such as a file, image, website, or third-party plugin. In the latter case, a threat actor can include the malicious instructions in the external sources, leading the AI to become a “confused deputy,” granting the attacker the ability to influence the user or other systems accessible to the LLM.
Insecure output handling: Insecure output handling, just like general web security, is the practice of mishandling or failing to filter the output generated by web applications before presenting it to users. Especially if the output of the AI is fed to another system, failing to securely handle output can lead to security vulnerabilities and risks, such as XSS, SSRF, privilege escalation, and remote-code execution.
Model denial of service: Model denial of service (DoS) is when malicious actors disrupt an AI program by overwhelming it with a high volume of requests or exploiting vulnerabilities in its design. In the case of generative AI, for example, a malicious actor can overwhelm it with too many queries or send queries that use a lot of compute power. Like a DoS on traditional web services, this attack can slow down an AI model and its supporting infrastructure or bring it to a complete halt for legitimate users.
Insecure plugin design: Plugins for LLMs can suffer from weak input validation and access control mechanisms, creating risks for the AI. Attackers can exploit faulty plugins and cause severe damage by, for example, executing malicious code.
Excessive agency: Excessive agency is a vulnerability that arises when a developer grants an LLM is granted too much autonomy or decision-making power. In this case, the AI may have the power to interface with other systems and take actions without sufficient safeguards or constraints in place. Excessive agency enables potentially damaging actions to be performed in response to unexpected, ambiguous, or malicious outputs from the LLM.
Sensitive information disclosure: When prompted with certain inputs, an LLM may inadvertently reveal confidential data, such user data, company data, or training data, in their responses. These disclosures are serious privacy violations and security breaches.
Training data poisoning: LLMs are trained and fine-tuned on large amounts of training data. A threat actor can modify a dataset to affect the AI’s responses and ability to produce correct output before the model training is complete. This risk is higher when a third-party dataset is used to train a model, which is common given the difficulty in creating and obtaining large amounts of good data.
Overreliance: There are risks associated with excessively relying on an LLM, which can produce false information. An example of overreliance is when a developer uses insecure or faulty code suggested by an LLM without additional oversight.
Model theft: Model theft can occur when someone gains unauthorized access to proprietary LLMs. They might copy or steal the models, leading to financial losses for the company, weakening their competitive position, and potentially exposing sensitive data.
Supply chain vulnerabilities: Adding third-party datasets, plugins, and pretrained models to an AI application come with certain risks. These components are susceptible to many of the aforementioned risks, and you have significantly less control over how they are built and assessed.
Prioritizing AI Vulnerabilities
To help organizations prioritize these vulnerabilities, the Bugcrowd VRT is an open-source, industry-standard taxonomy that aligns customers and hackers on a common set of risk priority ratings for commonly seen vulnerabilities and edge cases.
In December, we updated the VRT to include AI security-related vulnerabilities. This is timely, considering the recent government regulation. These updates overlap with the OWASP Top 10 for Large Language Model Applications.
To learn more on security vulnerabilities in AI, see our Ultimate Guide to AI security.