Understanding and Reducing Safety Risks of Learning with Large Pre-Trained Models
PI: Sharon Li
Assistant Professor in Computer Sciences
University of Wisconsin-Madison
Framework component: Algorithm
AI is undergoing a paradigm shift with the rise of models (e.g., BERT, GPT-3, CLIP) that are trained on massive data and are adaptable to a wide range of downstream tasks. It is now common practice in the AI community to adopt pre-trained models for transfer learning rather than training from scratch. Without an appropriate understanding of the safety risks, this development can exacerbate and propagate safety concerns writ large, with profound impacts on society.
In response to these urgent challenges, the overall objective of this project is to understand and reduce the risks of learning with large pre-trained models (Safe L2M). This project seeks to address the research questions that arise in building responsible and ethical AI models: What is the extent of inequity and out-of-distribution risks when performing transfer learning from large pre-trained models? And how can we design learning algorithms to mitigate the potential negative impacts? This project will make the following contributions to operationalizing safe machine intelligence: (1) We propose a novel evaluation framework to comprehensively understand how the two types of risks are propagated through the transfer learning process; (2) Building on this understanding, we then propose new learning algorithms that enhance safety when transferring knowledge from pre-trained models.
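To make the out-of-distribution risk concrete, the sketch below shows one common distance-based scoring approach: features from a frozen pre-trained encoder are summarized by per-class Gaussians, and a test input far from every class is flagged as out-of-distribution. This is a minimal illustration of the general idea, not the project's actual evaluation framework; the function names and the random stand-in features are hypothetical.

```python
# Minimal sketch of distance-based OOD scoring on features from a frozen
# pre-trained encoder (illustrative only; not the project's method).
import numpy as np

def fit_gaussian_stats(features: np.ndarray, labels: np.ndarray):
    """Estimate per-class means and a shared inverse covariance from in-distribution features."""
    classes = np.unique(labels)
    means = {c: features[labels == c].mean(axis=0) for c in classes}
    centered = np.vstack([features[labels == c] - means[c] for c in classes])
    cov = np.cov(centered, rowvar=False) + 1e-6 * np.eye(features.shape[1])
    return means, np.linalg.inv(cov)

def mahalanobis_ood_score(x: np.ndarray, means, cov_inv) -> float:
    """Distance to the nearest class; larger values suggest the input is out-of-distribution."""
    dists = [float((x - mu) @ cov_inv @ (x - mu)) for mu in means.values()]
    return min(dists)

# Example with random stand-in features; in practice these would come from a
# frozen encoder such as BERT or CLIP applied to the downstream task's data.
rng = np.random.default_rng(0)
train_feats = rng.normal(size=(200, 16))
train_labels = rng.integers(0, 4, size=200)
means, cov_inv = fit_gaussian_stats(train_feats, train_labels)
print(mahalanobis_ood_score(rng.normal(size=16), means, cov_inv))
```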
Key Personnel
Rheeya Uppaal
Graduate Student, Computer Science
University of Wisconsin-Madison
Outcomes and Updates
- Is Fine-tuning Needed? Pre-trained Language Models Are Near Perfect for Out-of-Domain Detection
- Study Finds Simple Method Can Improve Safety for Language Models
- Are Vision Transformers Robust to Spurious Correlations?