Diagnosing, Understanding, and Fixing Data Biases for Trusted Data Science
PI: Romila Pradhan
Assistant Professor in Computer and Information Technology, Purdue University
Framework component: Data
The thesis of this proposal is that data preparation tasks tailored to downstream machine learning (ML) applications can serve as a basis for detecting and mitigating algorithmic bias. The overarching goal of this project is to demonstrate the importance of data quality in establishing public trust in data-driven decision making, in part by tracing bias in ML models and pipelines. The project will investigate how to diagnose bias, how to evaluate the impact of data quality on bias in downstream ML tasks, and how to integrate data quality management into those tasks. By decoupling data-driven applications from the mechanics of managing data quality, the project will make it easier for practitioners to detect and mitigate biases stemming from data throughout their workflows.
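To make the thesis concrete, below is a minimal, hypothetical sketch (Python, scikit-learn and NumPy on synthetic data) of the kind of effect the project studies: a single data-quality issue, here noisier labels for one group, can shift a simple downstream fairness metric, and repairing the data shifts it again. The synthetic data, the `demographic_parity_gap` helper, and the "repair" step are illustrative assumptions, not the project's actual methods or results.

```python
# Illustrative sketch only: how a data-quality issue in training data can
# change a downstream fairness metric. All names and data are assumptions.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 5000

# Synthetic tabular data with a binary sensitive attribute `group`.
group = rng.integers(0, 2, size=n)           # 0 = group A, 1 = group B
x = rng.normal(loc=0.3 * group, scale=1.0)   # feature mildly correlated with group
y = (x + rng.normal(scale=0.5, size=n) > 0).astype(int)

# Inject a data-quality issue: labels in group B are flipped 25% of the time.
flip = (group == 1) & (rng.random(n) < 0.25)
y_noisy = np.where(flip, 1 - y, y)

features = np.column_stack([x, group])

def demographic_parity_gap(model, features, group):
    """Absolute difference in positive-prediction rates between the two groups."""
    pred = model.predict(features)
    return abs(pred[group == 0].mean() - pred[group == 1].mean())

# Downstream model trained on the corrupted labels vs. the repaired labels.
m_noisy = LogisticRegression().fit(features, y_noisy)
m_clean = LogisticRegression().fit(features, y)

print("parity gap, noisy labels :", demographic_parity_gap(m_noisy, features, group))
print("parity gap, repaired data:", demographic_parity_gap(m_clean, features, group))
```

In this sketch the "repair" is simply access to the uncorrupted labels; the project itself targets realistic data preparation steps, but the comparison pattern, measuring downstream bias before and after a data intervention, is the same.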
Outcomes and Updates
- Example-based Explanations for Random Forests using Machine Unlearning
- Researchers Investigating Popular Predictor Tool, Working to Mitigate Bias in Data