In today's world, AI is everywhere you look. From enhancing our shopping experiences and travel planning to revolutionizing customer service and transforming sectors like education and healthcare, the influence of Generative AI (GenAI) and Large Language Models (LLMs) is undeniable. However, as we enjoy the wonders of AI, we also occasionally run up against their unexpected quirks. One notable quirk is AI hallucinations.
An AI hallucination occurs when a GenAI system generates information that appears credible but is actually incorrect or fabricated. This happens when an AI system faces a query that’s outside the scope of its training. As a result, it tries to fill in the gaps so it can generate a plausible response. However, the response contains untrue information, misleading users despite its confident presentation.
GenAI application builders and innovators should familiarize themselves with the concept of AI hallucinations. By understanding their underlying causes and how their occasional appearance should impact the way we interact with AI, developers can build in adequate safeguards to protect their business and their users.
AI hallucinations are a byproduct of GenAI. GenAI applications are built on advanced machine learning models called large language models.
Machine learning (ML) is a subset of AI in which computer algorithms automatically improve their accuracy and performance through experience. By analyzing large datasets and adjusting as new data is presented, these algorithms learn patterns and can make intelligent predictions and decisions with minimal human intervention.
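As a minimal illustration of that learning loop, the Python sketch below (assuming scikit-learn is installed, and using made-up example data) fits a small model to labeled examples and then asks it to predict labels for inputs it has never seen. The quality of those predictions depends entirely on the data the model learned from.

```python
# A minimal machine learning sketch, assuming Python with scikit-learn installed.
# The model learns a pattern from labeled examples, then predicts labels for new inputs.
from sklearn.linear_model import LogisticRegression

# Made-up training data: hours studied -> passed the exam (1) or not (0)
X_train = [[1], [2], [3], [8], [9], [10]]
y_train = [0, 0, 0, 1, 1, 1]

model = LogisticRegression()
model.fit(X_train, y_train)

# Predictions for unseen inputs are only as good as the patterns in the training data.
print(model.predict([[2.5], [7.5]]))  # expected: [0 1]
```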
An LLM is an advanced ML model designed to understand, generate, and interact with human language. LLMs are trained on vast amounts of text data, allowing them to perform tasks like translation, summarization, and question-answering.
The data used to train an LLM significantly impacts its performance and capabilities. Ideally, training data is large in quantity, high in quality, and diverse. High-quality data is accurate, relevant, and comprehensive. Diverse training data includes a wide range of sources, perspectives, and types. Together, these attributes influence how well an LLM can understand and respond to new inputs.
GenAI refers to AI systems, built on LLMs, that can generate new content ranging from written text and programming code to images and audio. Presently, these systems are most commonly used in question-answering and information-finding applications.
LLMs are meant to operate within the confines of their training data. Even though the training data for an LLM is extensive, it is not unlimited. If an LLM is asked a question that extends beyond its training data or is ambiguous, the model may generate a plausible but incorrect or misleading answer.
Many GenAI applications built on top of LLMs lack the mechanisms to determine or admit when the user is asking them to venture outside their knowledge base. This leads to confidently presented but potentially incorrect responses.
Imagine GenAI as an over-eager intern named Alex. Alex is highly skilled at processing vast amounts of information quickly and is always eager to help. How might Alex behave when faced with unfamiliar questions or incomplete data? Wanting to impress, Alex may make an educated guess. Although Alex’s guesses are well-intentioned and may sound plausible, they are actually inaccurate, not grounded in evidence, or completely fabricated.
Similarly, an LLM application may make an educated—but inaccurate—guess based on the data that the model was trained on, and we call this guess an AI hallucination.
Just as the intern’s understanding is limited to available information and previous experiences, the LLM’s understanding is limited to its training data. And much like an over-eager intern, an LLM may make a well-intentioned guess that is wholly incorrect.
An LLM is programmed to always return a response. Whether its best candidate answer is far more likely or only marginally more likely than the alternatives, the system will present it with total confidence. That self-assured delivery, even when the response lies outside of its training data, is the reason this phenomenon has been termed a "hallucination."
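To see why, consider a toy version of the scoring step that selects an answer. Even when no candidate is clearly supported, that step still produces a "best" choice, and the choice is delivered without any expression of uncertainty. The Python sketch below is a simplified illustration with made-up scores, not how any particular model is implemented.

```python
import math

# Toy illustration: raw model scores (logits) for three candidate answers.
# None of them is strongly supported, yet one will still be selected.
logits = {"Paris": 1.2, "Lyon": 1.1, "Marseille": 1.0}

def softmax(scores: dict[str, float]) -> dict[str, float]:
    """Convert raw scores into a probability distribution."""
    exp = {k: math.exp(v) for k, v in scores.items()}
    total = sum(exp.values())
    return {k: v / total for k, v in exp.items()}

probs = softmax(logits)
best = max(probs, key=probs.get)

# "Paris" wins with only ~37% probability, barely ahead of the alternatives,
# but the answer is returned without any hint of that uncertainty.
print(probs)
print(best)
```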
LLMs are not programmed to say, “I don’t know.” This can be mitigated with the use of Retrieval Augmented Generation (RAG), which can provide trusted contextual information to the LLM for help in answering the question.
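As a rough sketch of that idea, the Python example below retrieves a relevant snippet from a small set of trusted documents and instructs the model to answer only from that context, or to admit when the context doesn't contain the answer. The retriever here is a toy keyword-overlap search, and ask_llm is a hypothetical placeholder for whatever LLM client an application actually uses.

```python
# A rough RAG sketch: ground the LLM's answer in trusted context.
# The retriever is a toy keyword-overlap search, and ask_llm is a hypothetical
# placeholder for whatever LLM client an application actually uses.

TRUSTED_DOCS = [
    "Our refund policy allows returns within 30 days of purchase.",
    "Support hours are 9 a.m. to 5 p.m. Eastern, Monday through Friday.",
]

def retrieve(question: str, docs: list[str], top_k: int = 1) -> list[str]:
    """Rank documents by how many words they share with the question (toy retriever)."""
    q_words = set(question.lower().split())
    ranked = sorted(docs, key=lambda d: len(q_words & set(d.lower().split())), reverse=True)
    return ranked[:top_k]

def ask_llm(prompt: str) -> str:
    """Hypothetical placeholder: swap in a real LLM client call here."""
    raise NotImplementedError

def answer_with_rag(question: str) -> str:
    context = "\n".join(retrieve(question, TRUSTED_DOCS))
    prompt = (
        "Answer using only the context below. If the context does not "
        "contain the answer, say 'I don't know.'\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )
    return ask_llm(prompt)
```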
Not all AI hallucinations take on the form of a wild answer in a chatbot response. For example, when GenAI is used to summarize data for analytics, an AI hallucination might emerge if you ask it to forecast outcomes for which the available data was insufficient or didn’t apply.
Because LLMs are complex, LLM-based applications are often quite opaque. This means it is often not obvious when an AI hallucination has taken place. The high level of confidence displayed by an LLM as it responds further compounds the problem. For this reason, developers and end users should equip themselves to recognize and investigate AI hallucinations effectively.
AI hallucinations show up in popular, publicly available LLMs, and hallucination rates vary from model to model. Here's how an AI hallucination might show up in a question-answering chatbot interaction:
The above AI hallucination example is from a fairly innocuous setting. However, AI hallucinations can potentially lead to serious, real-world consequences. Consider the following examples:
The use of GenAI in our everyday lives will only increase. Until the phenomenon is eliminated, a strong understanding of AI hallucinations is crucial.
People may rely on AI for decisions that can significantly impact their lives. Therefore, the trust and reliability of these AI systems are of utmost importance.
When AI hallucinations show up in sensitive fields, the outcomes can be dangerous. When it comes to safety, ethics, and responsibility, an understanding of AI hallucinations helps us navigate where AI technologies should (or should not) be deployed and how to safeguard those deployments.
In the context of business decisions, wariness of AI hallucinations is also vital. For example, inaccurate AI-generated data might be added to a spreadsheet. The presence of the hallucinated data would be more subtle and difficult to detect than a blatantly wrong answer from a chatbot. However, the implications could be far-reaching. Enterprises that use AI to inform financial decisions can suffer severe consequences if those decisions are based on inaccurate data. Therefore, the potential economic impact of AI hallucinations cannot be ignored.
As AI becomes a common source of information, society faces a growing potential for the spread of misinformation. This underscores the need to promote AI literacy. AI literacy is the ability to critically assess and effectively use AI, while recognizing its capabilities and limitations. With AI usage on the rise and AI hallucinations still a common issue, AI literacy will be essential to help users identify reliable AI-generated content.
How might users of GenAI applications recognize AI hallucinations? The following tips will help:
Tip #1: Question the plausibility
Always consider whether the response from a GenAI application seems reasonable or if it might be an extrapolation beyond available data. Never accept answers without verifying their accuracy first.
Tip #2: Verify with trusted sources
Cross-check AI-generated information with reputable sources, especially in sensitive or high-risk scenarios.
Tip #3: Look for inconsistencies
AI-generated responses that contradict known facts or previous outputs may indicate hallucinations. For example, consider the ChatGPT 3.5 interaction above. The GenAI system repeatedly contradicted its previous outputs, indicating a likely AI hallucination.
Tip #4: Understand the limitations of an AI system
Knowing the scope of your AI system—what kinds of data it has been trained on and the domain it was designed to work in—can help you gauge when it might be venturing into speculative territory.
AI hallucinations occur when LLMs and the GenAI applications built on them produce convincing but incorrect or fabricated responses. They often occur when the questions asked of these AI models venture outside the scope of their training data and experience.
The impact of AI hallucinations is real, potentially harming trust, safety, and decision-making across various sectors. It’s crucial for builders of AI systems to address these hallucinations and build in safeguards. Likewise, it’s equally vital for end users to develop AI literacy so that they can navigate and utilize AI technologies responsibly.
Navigating the challenges of GenAI innovation and AI hallucinations requires informed expertise. Outshift by Cisco is leading the way in promoting trustworthy and responsible AI, helping modern organizations understand and mitigate the risks associated with AI inaccuracies.
Outshift empowers enterprises to harness the full potential of AI technologies while ensuring they are used in a beneficial and reliable manner. For more insight into our commitment and approach, explore how Outshift is shaping the future of responsible AI deployment.