Talks at the Network Science Institute


Red-Teaming for Generative AI: Silver Bullet or Security Theater?

On-campus talk

Hoda Heidari

K&L Gates Career Development Assistant Professor, Carnegie Mellon University

Past Talk

Hybrid



Wednesday

Apr 10, 2024




4:00 pm

EST



Virtual

177 Huntington Ave.
11th floor

Devon House
58 St Katharine's Way
London E1W 1LP, UK



Online




In response to rising concerns surrounding the safety, security, and trustworthiness of Generative AI (GenAI) models, practitioners and regulators alike have pointed to AI red-teaming as a key component of their strategies for identifying and mitigating these risks. However, despite AI red-teaming's central role in policy discussions and corporate messaging, significant questions remain about what precisely it means, what role it can play in regulation, and how it relates to conventional red-teaming practices as originally conceived in the field of cybersecurity. We identify recent cases of red-teaming activities in the AI industry and conduct an extensive survey of the relevant research literature to characterize the scope, structure, and criteria for AI red-teaming practices. I will situate these findings in the broader discussions surrounding the evaluation of GenAI and AI governance.

About the speaker

Hoda Heidari is the K&L Gates Career Development Assistant Professor in Ethics and Computational Technologies at Carnegie Mellon University, with joint appointments in Machine Learning and Societal Computing. She is affiliated with the Human-Computer Interaction Institute and Heinz College of Information Systems and Public Policy. Her research is broadly concerned with the social, ethical, and economic implications of Artificial Intelligence, particularly issues of fairness and accountability in the use of Machine Learning in socially consequential domains. Her work in this area has won a best-paper award at the ACM Conference on Fairness, Accountability, and Transparency (FAccT), an exemplary track award at the ACM Conference on Economics and Computation (EC), and a best-paper award at the IEEE Conference on Secure and Trustworthy Machine Learning (SaTML).

More talks

2.25.2025

Jennifer Allen

Quantifying the Impact of Misinformation and Vaccine-Skeptical Content on Facebook

3.7.2025

Dean Eckles

Long ties and tendencies toward triadic closure

3.18.2025

Alina Strovolsky-Shitrit

Extracting Values from Youth-Targeted TikTok: A Large Language Model Comparison

3.20.2025

Ryan Wang