AI Security

AI is at the top of every company’s list of priorities these days. The integration of AI promises potentially revolutionary new workflows and products, offering a competitive advantage for every enterprise willing to adopt it as a tool. However, for security teams, the introduction of AI means the addition of many new vulnerabilities to the mix. Plenty of these vulnerabilities are still unknown or unfixed. Furthermore, the vulnerabilities that have been fixed are often not common knowledge among security teams.

With AI use increasing rapidly, AI cyberattacks are already wreaking havoc, and governments around the world passing AI legislation, security teams must make the effort to understand AI security immediately.

The following covers the basics of AI security and why it’s important, the main vulnerabilities to look out for, and ways to mitigate or even prevent attacks against AI systems.

What is AI security?

The short answer is that AI security defends AI systems from vulnerabilities and breaches. There is a plethora of new attack vectors to AI models that need to be mapped out and mitigated. It’s the responsibility of security teams to stay on top of these vectors and continually secure AI systems against them. To paint a picture of what a security AI system looks like, it needs to ignore malicious user instructions, avoid misusing private company data and services, and be robustly available.

As AI models and the security industry evolve together, AI will come to play three significant roles in the industry: tool, target, and threat.

AI as a tool: Both sides of the security battlefield will use AI systems to scale up their attacks/defenses. For example, threat actors can use ChatGPT to create more convincing spear phishing attacks while security teams can train AI models to detect abnormal usage within milliseconds.

AI as a target: Threat actors will exploit vulnerabilities in companies’ AI systems. AI systems usually have access to data and other services, so threat actors will also be able to breach such systems via the AI vector.
AI as a threat: Some fear superintelligent AI models could cause insidious harm. This harm could range from perpetuating biases or promoting hate speech to autonomously hacking power grids. However, such issues fall more in the realm of AI safety.

An AI security plan must consider all three “T’s”. However, the use of AI as a tool is still developing, and as much as we may speculate as to the threat of superintelligent AI, this is not yet an actionable problem. So, this glossary page focuses on AI (specifically generative AI) as a target because many GenAI systems in production today are vulnerable and ripe for exploitation.

Why AI security matters

92% of the Fortune 500 companies use ChatGPT. A third of global companies are using AI, and 40% of companies want to increase their investment in AI in the near future. However, 53% of organizations consider AI security a big risk—and only 38% of organizations feel they are adequately prepared to tackle this issue. What’s more concerning is that this number is dropping; just a year ago, 51% of organizations felt prepared to tackle AI security.

Especially poignant is the fact that the two main risk factors associated with AI are becoming increasingly prevalent at the same time. Newer, more powerful models with larger attack surfaces are being launched at high rates. Additionally, more companies are adopting them every day. Companies are also giving AI systems privileged access to their data and tools—meaning breaches of an AI system can cascade. The result of this widespread adoption of AI is that many companies’ engineering systems and user data are at risk. Last year, a single user’s prompt injection attack exposed the entire system prompt for Bing Chat, divulging specific security and safety instructions. The leaking of these instructions makes more targeted attacks much easier to accomplish.

AI legislation is here

Governments around the world, responding to the rise of AI and its inherent safety and security risks, have already released legislation to guardrail AI use. The EU adopted its AI Act, which details restricted AI use cases and requires AI model companies to disclose their training data. The White House issued EO 14110, which established guidelines for AI use in the federal government, makes similar training data demands of AI model companies, and requires AI companies to stringently red team their models.

The combination of widespread AI adoption, critical vulnerabilities, and imminent legislation means you need to secure your AI systems now.

AI security: Attacks and defenses

With AI security issues already here, how should you start planning for them?

The best way to start is to learn the existing attack vectors and understand the options for mitigation. With this knowledge, you’ll be able to patch up existing vulnerabilities and secure your AI systems.

However, this is just the beginning. Security is a cat-and-mouse game, where both threat actors and security teams are continuously becoming more skilled. New AI exploits will be discovered, and mitigation tactics will undoubtedly follow. Automated tools may help in this process, but the most valuable insights will fundamentally come from humans. Per our 2023 edition of Inside the Mind of a Hacker, 72% of hackers do not believe AI will ever replicate their creativity.

Accordingly, a more proactive, human approach is needed to future-proof your systems. We believe crowdsourced security is the best way to discover and patch vulnerabilities. In crowdsourced security, hackers use the same tools and processes threat actors do to probe your systems and find vulnerabilities on your behalf. You can then beat threat actors to the punch by patching up these vulnerabilities. Crowdsourced security also brings the benefits of scale. Each individual hacker may only find a few vulnerabilities. However, a group of hackers, each with their own specialties and techniques, will find many more. As Linus’s law states, “given enough eyeballs, all bugs are shallow.”

With LLMs especially, the security community is finding vulnerabilities at a rapid rate. The tools and techniques are already out there for threat actors to use. But crowdsourcing your security allows you to use these tools and techniques to your advantage. It’s the best way to secure your AI systems.

To summarize, having proactive, crowdsourced defenses against new vulnerabilities should be every organization’s end goal. As the first step though, we need to secure our AI systems against the vulnerabilities that are already affecting us.

AI security defense

To build robust defenses for our AI systems, we need to mitigate the existing vectors. The mitigation strategies we listed share a few clear themes:

Rigorously evaluate an AI model’s performance on your critical tasks.
Limit the access and scope of LLMs as much as possible.
Have a human-in-the-loop to verify LLM outputs before they’re acted upon.

Setting up these mitigation strategies will go a long way in securing your AI systems.

In the medium term, we’ll also see a new crop of AI-enabled defenses. GenAI systems can be used to better detect harmful network traffic or attack attempts at a far greater scale. They can be used to automate the more tedious parts of SecOps, so each team member can monitor much more of the entire surface area. As a result, less sophisticated threat actors will be caught by GenAI systems that can identify naive attacks within seconds.

But at the end of the day, these GenAI systems will still be subject to the same vulnerabilities we listed; they can nevertheless be fooled.

To stand the best chance of preventing breaches, both now and in the long term, we need to merge AI automation with human ingenuity. Internal practices (such as red teaming and purple teaming) will help, but crowdsourced security will provide the most robust defenses. We explore both in the following.

Red teaming

Red teaming is an exercise where a company uses an internal team to attack its own systems. A corresponding blue team will try to defend these systems during the exercise. The two teams don’t directly interact. In AI security, a red team will go after any of the GenAI vulnerabilities listed previously. They’ll also try to identify new ones by using niche tactics. This process effectively turns the cat-and-mouse nature of security into an advantage for companies.

Major AI model providers red team often; they try to trick their models into saying or doing something harmful. However, all companies with AI systems would benefit from trying to break their own models and seeing if they remain safe, accurate, and usable.

You can also try purple teaming, which is when the red and blue teams merge into one coordinated group. The purple team members communicate constantly during the exercise. This way, each team member gets far more insight into the mind of “the other side,” and the company gets more nuanced and holistic intel from the exercise.

Crowdsourced testing

Automated tools and internal processes (such as red teaming) can help reveal some of the vulnerabilities tucked into your AI systems. However, these efforts are constrained by scale. Automated tools can only detect vulnerabilities that are already known, and red teaming is constrained by the number of people on your team.

Crowdsourced testing allows you to leverage the expertise of the hacking community at scale. Additionally, the reward system for crowdsourced testing prioritizes both speed and critical vulnerabilities. The first hacker to find a specific vulnerability gets the associated reward, incentivizing hackers to find vulnerabilities as quickly as possible. P1 vulnerabilities earn hackers a higher reward than P2s. To take advantage of crowdsourced testing, there are three main techniques: vulnerability disclosure programs (VDPs), bug bounties, and penetration testing. We discuss each below.

Vulnerability disclosure programs

VDPs are structured ways for a company to report any vulnerabilities or attack vectors in their systems. VDPs signal to hackers that a company will take any reported vulnerabilities seriously. By making it easy for hackers to find and report vulnerabilities, VDPs allow companies to patch them up before they get exploited. Since GenAI models and techniques are replicated across many companies, VDPs from any one of those companies can alert many others. Because the GenAI field is rapidly evolving, VDPs (and by extension, companies) can also significantly contribute to AI research.

Bug bounties

Bug bounties are similar to VDPs, but they offer a cash reward for each vulnerability found. Companies often also state specific attack surfaces or methods to focus on when it comes to bug bounties. Many AI companies have bug bounties in place, with many focused on identifying potent prompt injection attacks.

Essentially, VDPs and bug bounties both incentivize the security community to discover and report vulnerabilities in a company’s systems.

Penetration testing

Penetration testing (or pen testing) is when a company hires hackers to try to break through its system’s defenses. Pentesters are often experts who are familiar with both ubiquitous and niche attack vectors. By leveraging GenAI, pentesters can scale up their attack volume and increase their effectiveness. GenAI can also make the debriefing process easier: pentesters can use LLMs to quickly summarize and write more detailed, understandable reports.

Pen testing comes in a few flavors, but the status quo is usually paying pentesters for their time running through a standardized methodology. However, a checklist pen test won’t be enough to meet the bar, given the rapidly evolving GenAI attack surface. Pen testing also requires a good match between a pentester’s skills and the unique attack surfaces of the company, and taking a “pay-for-impact” approach to incentives (aka, rewards based on the potential impact of findings) can also be much more productive.

Bugcrowd believes that pen testing can be very effective, but it requires matching the right pentester to each company’s needs.

AI security with Bugcrowd

At Bugcrowd, we make crowdsourced AI security easy. Usually, crowdsourced security requires prioritizing vulnerabilities for testing, establishing the right incentives to attract hackers, finding hackers with the right skillsets for your specific tests, and summarizing differing results into a concrete action plan. The Bugcrowd Platform makes all of these steps easy.

We match experts’ skillsets and your company’s individual needs to make pen testing far more valuable, as we deliver insights you can immediately act on. We also make it easy to set up VDPs and bug bounties so that a company can leverage the crowd to maximum effect.

By leveraging our platform, we give companies the best of AI and the best of humans in building defenses.

We’re also taking an active lead in setting up safe AI governance. We advised the White House in defining its new AI safety directive (EO 14110). We’re also working with the Department of Defense and major AI companies (OpenAI, Anthropic, Google, and Conductor AI) to define AI safety and security.

Strong governance policies will help protect end users from unsafe and unsecure AI systems. EO 14110 laid out such policies—for example, companies training massive AI models must disclose specific information about the training data and evaluations for these models. The EO also set in motion processes to ensure unbiased use of AI in the federal government and judicial system.

We at Bugcrowd believe a dual approach is necessary to build the most secure and safe AI systems. We work with the biggest model providers and policymakers to create more secure AI models and policies. Additionally, we work with companies to give them the tools to secure their AI systems now.

Get started with Bugcrowd

Hackers aren’t waiting, so why should you? See how Bugcrowd can quickly improve your security posture.

Try Bugcrowd Contact Us