The New Dawn of AI Evaluation: NIST's ARIA
This week marks a significant milestone for artificial intelligence evaluation: the National Institute of Standards and Technology (NIST) launched ARIA (Assessing Risks and Impacts of AI). This program promises to revolutionize how we understand and deploy AI technologies by focusing on sociotechnical testing, an approach designed to evaluate AI systems in real-world scenarios.
Historically, AI systems have been tested within the controlled confines of laboratories, where accuracy and bias are measured against known ground truths. This method, while valuable, provides a limited perspective. It tells us how well a model performs under ideal conditions but fails to capture the complexities of real-world applications. ARIA changes this by allowing external researchers to test models as they are used in the wild, offering a more comprehensive view of their impact.
What makes ARIA truly groundbreaking is its interdisciplinary nature. This initiative transcends computer science, delving into sociology, psychology, and anthropology. The focus shifts from the formal characteristics of a system to its human impact. This shift is crucial because the deployment context can drastically alter an AI's effects. An unbiased model might cause harm if misapplied, while a flawed model might yield benefits under the right conditions.
ARIA’s emphasis on how people actually use AI systems aligns closely with the mission of the Center for Advancing Safety of Machine Intelligence (CASMI). At CASMI, we understand the importance of data integrity and robust algorithmic training. However, we also recognize that successful AI deployment hinges on user interaction and situational context. We’ve supported NIST by providing insights from our workshops and by helping ensure that the right researchers are involved — those who look beyond the silicon and consider human impact.
Our principal investigators are unwavering in their dedication to the human aspects of AI. They understand that a sociotechnical approach is essential to grasp how AI models will be utilized and their subsequent effects on society. This involves addressing potential harms, ensuring safety, and promoting ethical and responsible behavior.
The driving force behind this transformative approach is NIST Research Scientist Reva Schwartz. Her vision underscores the importance of considering the human condition alongside machine performance. It's a testament to her leadership that ARIA is not just about evaluating machines but about understanding their real-world implications for people.
ARIA represents a new era of AI evaluation — one that acknowledges the intricacies of human-machine interaction and the profound impacts these technologies can have. By adopting a sociotechnical perspective, NIST is paving the way for more responsible and beneficial AI deployments. This initiative will help shape the future of AI, ensuring it serves humanity in the most positive and ethical ways possible.
Kristian Hammond
Bill and Cathy Osborn Professor of Computer Science
Director of the Center for Advancing Safety of Machine Intelligence (CASMI)
Director of the Master of Science in Artificial Intelligence (MSAI) Program