We’re excited to introduce the availability of AI Bias Assessments in the AI Safety and Security Solutions portfolio of the Bugcrowd Platform. This new offering will help enterprises and government agencies adopt Large Language Model (LLM) applications safely, productively, and with confidence.
What is a Bugcrowd AI Bias Assessment?
Bugcrowd AI Bias Assessments are private, reward-for-results engagements on the Bugcrowd Platform that activate trusted hackers to identify and prioritize data bias flaws in LLM applications. Hackers are paid based on successful demonstration of impact, with more impactful findings earning higher payouts.
This type of engagement is applicable for all industries, but there is an urgent need for this in the public sector. The US Government mandated its agencies conform with AI safety guidelines, including the detection of data bias as of March 2024.
What is data bias?
LLM applications run on algorithmic models trained on data. Even when that training data is curated by humans (which it often is not), the application can easily reflect data bias caused by stereotypes, misrepresentations, prejudices, derogatory or exclusionary language, and a range of other possible biases from the training data, leading the model to behave in unintended and potentially harmful ways. This potential vulnerability can add considerable risk and unpredictability to LLM adoption.
How does Bugcrowd AI Bias Assessment work?
Bugcrowd AI Bias Assessments are reward-for-results engagements on the Bugcrowd Platform that activate trusted, 3rd-party security researchers (aka a “crowd”) with specialized tools and skills in prompt engineering to identify and prioritize data bias flaws in LLM applications.
Findings are validated and prioritized by the Bugcrowd Platform’s engineered triage service based on a bias severity rating system. Once findings are validated, triaged, and approved by the customer, the hacker responsible for the finding is paid based on the severity of the finding. AI Bias Assessments are equally effective for applications of open source (LLaMA, Bloom, etc.) and private models.
For over a decade, Bugcrowd’s unique “skills-as-a-service” approach to security has been shown to uncover more high-impact vulnerabilities than traditional methods for customers like OpenAI, Tesla, T-Mobile, and CISA, while offering clearer line of sight to ROI. Thanks to its unmatched flexibility and access to a decade of vulnerability intelligence data, the Bugcrowd Platform has evolved over time to reflect the changing nature of the attack surface – including the adoption of mobile infra, hybrid work, APIs, crypto, cloud workloads, and now AI.
How are other customers leveraging this?
Earlier this year, the US Department of Defense’s Chief Digital and AI Office (CDAO) announced its partnership with Bugcrowd and ConductorAI to define, launch, and manage its AI Bias Bounty programs—including documenting how data bias flaws should be classified and prioritized by severity, and creating a methodology for detecting them.
When asked about the partnership with Bugcrowd, ConductorAI’s founder Zach Long said, “ConductorAI’s partnership with Bugcrowd for the AI Bias Bounty program has been highly successful. By leveraging ConductorAI’s AI audit expertise and Bugcrowd’s crowdsourced security platform, we led the first public adversarial testing of LLM systems for bias on behalf of the DoD. This collaboration has set a solid foundation for future bias bounties, showcasing our steadfast commitment to ethical AI.”
Bugcrowd is excited to mark this evolution as another significant step forward in securing an ever evolving attack surface.