How AI Hallucinations Happen: Causes, Detection, and Mitigation

The Model That Passed the Bar Exam Also Cited Cases That Do Not Exist

In 2023, attorneys Steven Schwartz and Peter LoDuca, representing plaintiff Roberto Mata in Mata v. Avianca, filed a brief in a U.S. federal court citing six cases that Schwartz had found using ChatGPT and had asked ChatGPT itself to confirm. The cases turned out to be entirely fabricated by the AI. The attorneys were sanctioned, and the case became a global cautionary tale about AI hallucinations. The same family of technology that reportedly scored around the 90th percentile on the bar exam invented authoritative-sounding case citations with complete confidence and no disclaimer. This is not a bug that will be fully fixed in the next version: hallucination is an emergent property of how large language models are built, and understanding why it happens is a prerequisite to using AI responsibly.

The Technical Root Cause: Prediction Without Verification

Large language models are trained to predict the next token given the preceding context. On their own, they do not look up information in a database during inference, and they do not verify claims against ground truth. They have no reliable internal signal separating what they know from what they do not know; they generate statistically plausible continuations of text based on patterns learned from training data.

When a model is asked about a topic on which it has limited training data, it does not respond with uncertainty proportional to its actual information deficit. Instead, it generates text that looks like accurate information in that domain, and it does so confidently, fluently, and incorrectly. The model cannot distinguish between retrieving a fact it was actually trained on and generating a plausible-sounding fabrication, because both involve the same token prediction mechanism.
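The point that recall and fabrication share one mechanism can be illustrated with a toy decoder. This is a minimal sketch, not how any particular model is implemented: the candidate tokens and logits are invented for illustration, and `greedy_pick` is a hypothetical helper. It shows that greedy decoding returns a token whether the distribution is sharply peaked or nearly flat; the entropy differs, but the output text carries no trace of that difference.

```python
import math

def softmax(logits):
    """Convert raw scores into a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def entropy_bits(probs):
    """Shannon entropy in bits; higher means a flatter distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def greedy_pick(tokens, logits):
    """Return the most probable token, its probability, and the entropy."""
    probs = softmax(logits)
    best = max(range(len(tokens)), key=lambda i: probs[i])
    return tokens[best], probs[best], entropy_bits(probs)

# Hypothetical candidate tokens and logits, invented for illustration.
years = ["1987", "1991", "1994", "2001"]
well_known = [9.0, 1.0, 0.5, 0.2]  # one continuation dominates
obscure = [1.1, 1.0, 0.9, 1.0]     # nearly flat: little evidence either way

for label, logits in [("well-known", well_known), ("obscure", obscure)]:
    token, p, h = greedy_pick(years, logits)
    print(f"{label}: picks {token} (p={p:.2f}, entropy={h:.2f} bits)")
```

In both cases the decoder emits a year. Only the probabilities, which the reader of the generated text never sees, reveal that the second answer is close to a guess.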

Types of Hallucinations

Type	Description	Example
Factual confabulation	Model states false facts with confidence	Citing a court case with a real name but fabricated ruling
Entity confusion	Mixing up real entities with similar names	Attributing quotes from one scientist to another with a similar name
Temporal confusion	Wrong dates, conflating events from different periods	Incorrectly stating when a law was passed or when a person was born
Logical hallucination	Correct facts but invalid reasoning leading to false conclusions	Correct medical facts assembled into incorrect treatment recommendation
Context neglect	Ignoring information provided in the prompt and generating from training data instead	Summarizing a document but reporting information not in the document
Instruction hallucination	Claiming to have followed an instruction it did not actually follow	Claiming to have searched the internet when the model has no internet access

Why Hallucination Rates Vary by Task

Hallucination is not uniformly distributed across task types. Models hallucinate much more frequently on some tasks than others, for systematic reasons.

Low hallucination risk: Tasks that are well represented in training data and have verifiable correct answers and clear patterns, such as coding in common languages, math at routine difficulty levels, summarization of documents provided in context, and translation between major languages.
High hallucination risk: Specific factual claims (names, dates, citations, statistics), niche topics with sparse training data, information that has changed since the training cutoff, tasks requiring precise numerical reasoning, and any task where the model must retrieve a specific piece of information rather than generate plausible text.

Benchmarking from Stanford's HELM project in 2023 reportedly found hallucination rates ranging from under 5% to over 50% across question-answering tasks and model sizes. Recent reasoning models (trained with chain-of-thought reinforcement learning) show substantially lower hallucination rates on verifiable tasks, at the price of higher verbosity and cost.
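Per-task rates like these are typically computed by labeling each model response as hallucinated or not and dividing by the number of responses for that task. The sketch below assumes such labels already exist; the task names and labels are hypothetical and are not drawn from any benchmark.

```python
from collections import defaultdict

def hallucination_rates(records):
    """Return {task: fraction of responses labeled as hallucinated}.

    records is an iterable of (task_name, is_hallucinated) pairs.
    """
    totals = defaultdict(int)
    flagged = defaultdict(int)
    for task, is_hallucinated in records:
        totals[task] += 1
        if is_hallucinated:
            flagged[task] += 1
    return {task: flagged[task] / totals[task] for task in totals}

# Hypothetical human labels, invented for illustration only.
records = [
    ("summarization", False), ("summarization", False),
    ("summarization", False), ("summarization", True),
    ("citation_lookup", True), ("citation_lookup", True),
    ("citation_lookup", False), ("citation_lookup", True),
]

rates = hallucination_rates(records)
for task, rate in sorted(rates.items(), key=lambda kv: kv[1]):
    print(f"{task}: {rate:.0%}")
```

Reporting the rate per task rather than as a single aggregate keeps a well-behaved task from masking a risky one.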

Retrieval-Augmented Generation: The Primary Mitigation

Retrieval-Augmented Generation (RAG) addresses the fundamental limitation of generation from parametric memory by grounding the model in retrieved external documents. In a RAG system:

1. The user's query is converted to an embedding vector
2. A vector database is searched for documents semantically similar to the query
3. The retrieved documents are included in the model's context (the prompt)
4. The model is instructed to base its answer on the provided documents and cite them

RAG significantly reduces hallucination on factual questions by giving the model accurate source material to work from rather than generating from parametric memory. However, RAG does not eliminate hallucination — models can still misread or misrepresent retrieved documents, selectively ignore relevant passages, or hallucinate details not present in any retrieved document.
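The retrieval and prompt-assembly steps can be sketched in a few lines. This is a simplified illustration under stated assumptions: `embed` is a toy bag-of-words stand-in for a learned embedding model, an in-memory list stands in for the vector database, the documents are invented, and the final model call is omitted entirely. All function names here are hypothetical.

```python
import math
import re
from collections import Counter

def embed(text):
    """Toy embedding: word counts (a stand-in for a learned embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(query, documents, k=2):
    """Return the k documents most similar to the query."""
    q = embed(query)
    ranked = sorted(documents, key=lambda d: cosine(q, embed(d["text"])),
                    reverse=True)
    return ranked[:k]

def build_prompt(query, retrieved):
    """Place retrieved text in the context and ask for cited answers."""
    sources = "\n".join(f"[{d['id']}] {d['text']}" for d in retrieved)
    return (
        "Answer using only the sources below and cite source ids in brackets. "
        "If the sources do not contain the answer, say so.\n\n"
        f"Sources:\n{sources}\n\nQuestion: {query}\nAnswer:"
    )

# Hypothetical document store, invented for illustration.
documents = [
    {"id": "doc1", "text": "The warranty covers battery replacement for two years."},
    {"id": "doc2", "text": "Shipping takes five business days within the country."},
    {"id": "doc3", "text": "Returns are accepted within thirty days of purchase."},
]

query = "How long does the warranty cover the battery?"
top = retrieve(query, documents, k=1)
prompt = build_prompt(query, top)
print(prompt)
```

The instruction to admit when the sources lack an answer matters as much as the retrieval itself: without it, the model may fall back on parametric memory when retrieval misses.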

Other Mitigation Approaches

Technique	How It Helps	Limitation
Retrieval-Augmented Generation (RAG)	Grounds answers in retrieved source documents	Quality depends on retrieval quality; can still misrepresent sources
Self-consistency sampling	Generate multiple responses; take the majority answer	Increases cost; cannot catch a systematic error that appears in every sample
Chain-of-thought prompting	Forces step-by-step reasoning; errors more visible	Longer, costlier; reasoning steps can themselves hallucinate
Uncertainty quantification	Models estimate their own confidence (calibration)	LLMs are poorly calibrated; expressed confidence often unreliable
Fact-checking pipelines	Automated systems verify claims against knowledge bases	Knowledge bases incomplete; verification itself can be unreliable
RLHF and constitutional AI	Training to prefer accurate, honest, qualified responses	Reduces but does not eliminate hallucination
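As an illustration of one row in the table, self-consistency sampling can be sketched as a majority vote over several independent samples. The sampler below is a hypothetical stand-in for a model call with nonzero temperature, and its canned outputs are invented; a real system would also need a more careful way of normalizing answers before comparing them.

```python
from collections import Counter

def self_consistent_answer(sample_fn, prompt, n=5):
    """Sample n answers and return the most common one with its vote share."""
    answers = [sample_fn(prompt).strip().lower() for _ in range(n)]
    winner, count = Counter(answers).most_common(1)[0]
    return winner, count / n

def make_fake_sampler(outputs):
    """Hypothetical stand-in for a model call: replays canned outputs."""
    it = iter(outputs)
    return lambda prompt: next(it)

# Canned outputs invented for illustration.
sampler = make_fake_sampler(["42", "42 ", "41", "42", "17"])
answer, agreement = self_consistent_answer(sampler, "What is 6 * 7?", n=5)
print(answer, agreement)
```

A low vote share is a useful flag for human review. A high vote share is weaker evidence than it looks, since every sample can share the same systematic error.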

Detecting Hallucinations in Practice

Request citations and verify them independently — do not trust that a cited source says what the model claims
For quantitative claims, check the exact figures against primary sources
Ask the model to express its confidence and identify what it is uncertain about; this is not perfectly reliable, but it is a useful signal
Use the model to generate a first draft, then verify critical factual claims before publication or legal/medical use
For high-stakes applications (legal, medical, financial), implement RAG with auditable source citations and human review workflows
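Part of the citation check can be automated when the model is asked to quote its sources verbatim. The sketch below assumes the answer has already been parsed into (source id, quoted text) pairs and that the source texts are available locally; the function names and sample data are hypothetical. It only detects quotes that are absent from the cited source, so paraphrased claims still need human review.

```python
import re

def normalize(text):
    """Collapse whitespace and lowercase so trivial differences do not matter."""
    return re.sub(r"\s+", " ", text).strip().lower()

def check_citations(claims, sources):
    """Label each (source_id, quote) pair.

    Returns a list of (source_id, quote, status), where status is
    'supported', 'quote_not_found', or 'missing_source'.
    """
    results = []
    for source_id, quote in claims:
        if source_id not in sources:
            status = "missing_source"
        elif normalize(quote) in normalize(sources[source_id]):
            status = "supported"
        else:
            status = "quote_not_found"
        results.append((source_id, quote, status))
    return results

# Hypothetical sources and model-extracted claims, invented for illustration.
sources = {
    "policy.txt": "Refunds are issued within 14 days.\nStore credit never expires.",
}
claims = [
    ("policy.txt", "refunds are issued   within 14 days"),
    ("policy.txt", "refunds are issued within 30 days"),
    ("handbook.txt", "employees receive 20 vacation days"),
]

for source_id, quote, status in check_citations(claims, sources):
    print(f"{status}: {source_id}: {quote!r}")
```

Anything other than "supported" should be routed to a person, and a "missing_source" result is the automated analogue of a citation to a case that does not exist.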

The Trajectory: Improving but Not Solved

Hallucination rates on standardized benchmarks have generally declined with successive model generations. GPT-4 hallucinated less than GPT-3.5; Claude 3.5 and Gemini 1.5 improved further; reasoning models (o1, o3, Claude 3.7 Sonnet) show substantial additional improvements on verifiable reasoning tasks. But because the underlying architecture is next-token prediction without external verification, hallucination cannot be reduced to zero within the current paradigm. The research community's response involves hybrid systems that combine language model generation with formal verification, retrieval augmentation, and tool use: a convergence toward architectures that verify rather than merely generate.
