Microsoft's AI Red Team Adopts Hacker Mindset to Enhance Security

Darius Baruo
Jul 25, 2024 00:47

Microsoft’s AI Red Team employs a hacker’s mindset to identify and mitigate potential generative AI risks, combining cybersecurity and societal-harm assessments.

Generative AI’s new capabilities come with new risks, spurring a novel approach to how Microsoft’s AI Red Team works to identify and reduce potential harm, according to news.microsoft.com.

Origins of Red Teaming

The term “red teaming” was coined during the Cold War, when the U.S. Defense Department conducted simulation exercises with red teams acting as the Soviets and blue teams acting as the U.S. and its allies. The cybersecurity community adopted the language a few decades ago, creating red teams to act as adversaries trying to break, corrupt, or misuse technology — with the goal of finding and fixing potential harms before any problems emerged.

Formation of Microsoft’s AI Red Team

In 2018, Siva Kumar formed Microsoft’s AI Red Team, following the traditional model of pulling together cybersecurity experts to proactively probe for weaknesses, just as the company does with all its products and services. Meanwhile, Forough Poursabzi led researchers from around the company in studies from a responsible AI lens, examining whether the generative technology could be harmful — either intentionally or due to systemic issues in models that were overlooked during training and evaluation.

Collaboration for Comprehensive Risk Assessment

The different groups quickly realized they’d be stronger together and joined forces to create a broader red team that assesses both security and societal-harm risks alongside each other. This new team includes a neuroscientist, a linguist, a national security specialist, and numerous other experts with diverse backgrounds.

Adapting to New Challenges

This collaboration marks a significant shift in how red teams operate, integrating a multidisciplinary approach to tackle the unique challenges posed by generative AI. By thinking like hackers, the team aims to identify vulnerabilities and mitigate risks before they can be exploited in real-world scenarios.

This initiative is part of Microsoft’s broader effort to deploy AI responsibly, ensuring that new capabilities do not come at the expense of safety and societal well-being.

Image source: Shutterstock

Share it on social networks