How Microsoft Is Addressing AI Hallucinations
Microsoft has announced a significant new feature as part of its Azure AI Content Safety API: an AI tool aimed at tackling hallucinations. For those unfamiliar with the term in this context, a hallucination in AI refers to false information generated by a language model. The new tool, aptly named Correction, has a straightforward purpose: to identify and reduce these AI-generated inaccuracies, in particular the inaccuracies that crop up when a generative system builds its output on top of retrieved documents.
Here’s how it works: when a system generates new text from a set of grounding documents, those documents guide the generation, but the output can still include hallucinations, statements that sound plausible but are false. To tease out these errors, Correction reviews the generated document, flags potentially ungrounded statements, and checks them against the original grounding documents so the output can be revised for accuracy.
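To make that flag-and-verify workflow concrete, here is a minimal sketch of how a client might call a groundedness-check-with-correction endpoint of this kind. The endpoint path, API version, request fields, and environment variable names are assumptions modeled on Microsoft's preview documentation and may differ from the shipping API; treat it as an illustration of the pattern, not a definitive integration.

```python
# Sketch: check generated text against grounding documents and request a correction.
# Endpoint path, api-version, and field names are assumptions based on preview docs.
import os
import requests

ENDPOINT = os.environ["CONTENT_SAFETY_ENDPOINT"]  # e.g. https://<resource>.cognitiveservices.azure.com
API_KEY = os.environ["CONTENT_SAFETY_KEY"]

def check_and_correct(generated_text: str, grounding_docs: list[str]) -> dict:
    """Flag ungrounded statements in generated_text and ask the service
    to rewrite them so they agree with the grounding documents."""
    url = f"{ENDPOINT}/contentsafety/text:detectGroundedness"
    payload = {
        "domain": "Generic",                 # assumed: general-purpose domain
        "task": "Summarization",             # assumed: content generated from source documents
        "text": generated_text,              # the model output to audit
        "groundingSources": grounding_docs,  # the retrieved documents that should support it
        "correction": True,                  # assumed flag: return a corrected version, not just labels
    }
    resp = requests.post(
        url,
        params={"api-version": "2024-09-15-preview"},  # assumed preview version
        headers={"Ocp-Apim-Subscription-Key": API_KEY},
        json=payload,
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()  # expected to include ungrounded spans and a corrected text

# Example: audit a generated claim against the source passage it was built from.
result = check_and_correct(
    "The report says revenue grew 40% in 2023.",
    ["The annual report states that revenue grew 4% in 2023."],
)
print(result)
```

The point of a flow like this is that the grounding documents, not the raw model output, remain the arbiter of what the generated text is allowed to claim, and the corrected version is what ultimately gets surfaced to the user.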
While it does not guarantee absolute truth, Correction is a step toward more accurate AI-generated content, and it is encouraging to see Microsoft acknowledge and address the problem. But the feature also points to a deeper realization: organizations are increasingly wary of fully integrating AI tools like Microsoft Copilot into their workflows because of these accuracy concerns.
Microsoft's move also signals a growing recognition that hallucinations are inherent to how language models function. These systems predict likely words rather than guarantee true ones, so falsehoods inevitably get mixed into the generated content.
Even though Microsoft’s Correction is the headline, the real story is the growing awareness that language models may not deliver everything we initially thought they could. That realization forces us to reconsider how to integrate these tools into our workflows properly: the generated content may not always be trustworthy, but knowing its limitations lets us find ways to use it effectively.
Ultimately, the challenge isn't just about fixing hallucinations but about understanding the fundamental nature of AI language models. As we continue to develop and improve these technologies, the focus should be on creating systems that not only generate plausible text but also ensure the information is reliable. This balanced approach will enable us to harness AI's power while mitigating its drawbacks, leading to more effective and trustworthy applications in our everyday lives.
Kristian Hammond
Bill and Cathy Osborn Professor of Computer Science
Director of the Center for Advancing Safety of Machine Intelligence (CASMI)
Director of the Master of Science in Artificial Intelligence (MSAI) Program