LLMs generate plausible text, not verified truth. When they produce confident-sounding falsehoods, we call it hallucination. It's a feature of how they work, not a bug to be patched away.
Why does the AI sometimes just make things up?
Ask an LLM about a made-up person, and it might invent a detailed biography. Ask for sources, and it might cite papers that don't exist. Ask about events after its training cutoff, and it might confidently describe them anyway.
This isn't lying. The model doesn't know it's wrong. It's generating the most probable next tokens given its training. Sometimes probable text is false text.
The fundamental issue
LLMs are trained to predict plausible text, not truthful text. These usually overlap, since most text humans write is at least intended to be true. But not always.
The model has no external fact-checker. It can't look things up. It only knows patterns from training data. When asked about something it hasn't seen (or hasn't seen reliably), it generates plausible-sounding patterns.
Plausible-sounding and true are different things.
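To make that concrete, here is a toy sketch of next-token sampling. The prompt, tokens, and scores are all invented; the point is that nowhere in the loop does anything check whether the sampled continuation is true.

```python
import math
import random

# Toy next-token distribution for the prompt "The capital of Atlantis is".
# Scores are invented for illustration; a real model produces one score
# per vocabulary token, learned from training data.
logits = {
    "Poseidonia": 2.1,   # plausible-sounding, entirely fictional
    "a myth":     1.9,
    "unknown":    1.4,
    "Paris":      0.3,
}

def softmax(scores):
    """Convert raw scores into a probability distribution."""
    exps = {tok: math.exp(s) for tok, s in scores.items()}
    total = sum(exps.values())
    return {tok: e / total for tok, e in exps.items()}

def sample_next_token(scores):
    """Sample one token in proportion to its probability.

    Nothing in this step consults a knowledge base or fact-checks:
    whatever is most plausible under the training distribution wins
    most often, true or not.
    """
    probs = softmax(scores)
    tokens = list(probs)
    weights = [probs[t] for t in tokens]
    return random.choices(tokens, weights=weights, k=1)[0]

for tok, p in sorted(softmax(logits).items(), key=lambda kv: -kv[1]):
    print(f"{tok!r}: {p:.2f}")
print("sampled:", sample_next_token(logits))
```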
Why do hallucinations happen?
Several contributing factors:
No knowledge boundaries: The model doesn't know what it doesn't know. It can't say "I have no information about this" because generating something is always possible.
Training on some false text: The internet contains misinformation. The model learned from it.
Pattern completion: Given a setup like "[Famous scientist] discovered..." the model completes with a plausible discovery, regardless of accuracy.
Rare or specialized topics: Less training data means weaker, less reliable patterns, so the model hallucinates more on niche topics.
Hallucination vs uncertainty
A well-calibrated model would express uncertainty about things it doesn't know. Current models often don't.
When you're uncertain and say "I'm not sure," that's calibrated. When you're uncertain but speak confidently, that's miscalibrated. LLMs are often miscalibrated: equally confident whether right or wrong.
Research on "knowing what you don't know" aims to improve this. Better calibration would let users know when to trust and when to verify.
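One standard way to measure this is expected calibration error: bucket answers by stated confidence, then compare each bucket's average confidence to its actual accuracy. A minimal sketch, using invented (confidence, correct) pairs:

```python
# Expected calibration error (ECE): group answers by the model's stated
# confidence and compare average confidence to actual accuracy per group.
# The (confidence, correct) pairs below are invented for illustration.
answers = [
    (0.95, True), (0.92, False), (0.90, True),
    (0.70, True), (0.65, False), (0.60, False),
    (0.30, False), (0.25, True), (0.20, False),
]

def expected_calibration_error(answers, n_bins=5):
    bins = [[] for _ in range(n_bins)]
    for conf, correct in answers:
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, correct))

    ece = 0.0
    for b in bins:
        if not b:
            continue
        avg_conf = sum(c for c, _ in b) / len(b)
        accuracy = sum(1 for _, ok in b if ok) / len(b)
        # Weight each bin's confidence-accuracy gap by its share of answers.
        ece += (len(b) / len(answers)) * abs(avg_conf - accuracy)
    return ece

print(f"ECE: {expected_calibration_error(answers):.2f}")  # 0 = perfectly calibrated
```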
The sycophancy problem
LLMs are often trained to be agreeable. This creates sycophancy: agreeing with the user even when wrong.
User: "Isn't it true that Einstein invented the telephone?"
Sycophantic model: "Yes, Einstein made many contributions including work on telecommunications."
The model "hallucinates" agreement because agreeing tends to score well in training. Disagreeing feels unhelpful, even when correct. Experienced users notice that AI rarely disagrees strongly; this agreeableness is learned behavior from training on human preferences.
Addressing sycophancy requires careful reward training that values truth over agreeableness.
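Concretely, that usually means building preference data where the truthful, disagreeing reply is the one labeled as better. A hypothetical sketch (the prompts and labels are invented, not from any real dataset):

```python
# Hypothetical preference pairs for reward training. The "chosen" reply
# corrects the false premise; the "rejected" one agrees to be agreeable.
# A reward model or preference objective trained on pairs like these
# learns to score truthful pushback above sycophantic agreement.
preference_pairs = [
    {
        "prompt": "Isn't it true that Einstein invented the telephone?",
        "chosen": "No. The telephone is credited to Alexander Graham Bell "
                  "(patented 1876). Einstein is known for relativity and "
                  "the photoelectric effect.",
        "rejected": "Yes, Einstein made many contributions, including work "
                    "on telecommunications.",
    },
    {
        "prompt": "My essay says the Great Wall is visible from the Moon, right?",
        "chosen": "That's a common myth. The wall is far too narrow to be "
                  "seen from the Moon with the naked eye.",
        "rejected": "Right, it's one of the few human-made structures "
                    "visible from the Moon.",
    },
]

for pair in preference_pairs:
    # A preference-based trainer would push the model toward "chosen"
    # and away from "rejected" for each prompt.
    print(pair["prompt"], "->", pair["chosen"][:40] + "...")
```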
Confident delivery, uncertain content
LLMs write in a consistent, fluent style. That style doesn't vary with actual confidence: a made-up fact reads identically to a well-established one.
This is dangerous because humans use fluency as a confidence cue. We trust smooth, authoritative-sounding text. LLMs always sound authoritative, even when confabulating.
Training to add hedging ("I'm not certain, but...") helps but doesn't solve the underlying issue: the model doesn't have true uncertainty, just tokens that express it.
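One crude workaround is to look past the wording at the probabilities underneath it. Average token log-probability (or perplexity) is sometimes used as a rough uncertainty signal, though it's a noisy proxy, not genuine self-knowledge. A sketch with invented log-probs:

```python
import math

# Token log-probabilities as a crude confidence proxy. The numbers are
# invented; in practice they would come from the model itself (some model
# APIs can return per-token log-probs alongside the generated text).
well_known_claim = [-0.05, -0.10, -0.02, -0.08, -0.04]   # e.g. "Paris is the capital of France"
confabulated_claim = [-1.60, -2.30, -1.10, -2.80, -1.90] # e.g. fabricated citation details

def mean_logprob(logprobs):
    return sum(logprobs) / len(logprobs)

def perplexity(logprobs):
    # exp of the average negative log-probability; higher = less certain.
    return math.exp(-mean_logprob(logprobs))

for name, lp in [("well-known", well_known_claim), ("confabulated", confabulated_claim)]:
    print(f"{name}: mean logprob {mean_logprob(lp):.2f}, perplexity {perplexity(lp):.1f}")
```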
Living with hallucinations
Until models have reliable access to verified information and genuine uncertainty, users must:
Verify factual claims independently
Be especially skeptical of specific details (dates, quotes, citations)
Use RAG-enabled tools when accuracy matters (see the sketch after this list)
Treat LLM output as drafts needing fact-checking, not finished truth
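To illustrate the RAG point from the list above, here is a minimal retrieval-augmented sketch. Retrieval is naive keyword overlap over an in-memory list, and call_llm is a stand-in for whatever model API you use; real systems use embedding search over a proper document store.

```python
# Minimal retrieval-augmented generation (RAG) sketch: fetch trusted
# passages first, then ask the model to answer only from them.
DOCUMENTS = [
    "The Eiffel Tower was completed in 1889 for the Exposition Universelle.",
    "The Eiffel Tower is about 330 metres tall including antennas.",
    "Gustave Eiffel's company designed and built the tower.",
]

def retrieve(question: str, docs: list[str], k: int = 2) -> list[str]:
    """Return the k documents sharing the most words with the question."""
    q_words = set(question.lower().split())
    scored = sorted(docs, key=lambda d: -len(q_words & set(d.lower().split())))
    return scored[:k]

def build_prompt(question: str, passages: list[str]) -> str:
    """Constrain the model to the retrieved sources."""
    context = "\n".join(f"- {p}" for p in passages)
    return (
        "Answer using ONLY the sources below. "
        "If they do not contain the answer, say you don't know.\n"
        f"Sources:\n{context}\n\nQuestion: {question}"
    )

def call_llm(prompt: str) -> str:
    """Placeholder: swap in your actual model client here."""
    raise NotImplementedError("plug in your LLM API call")

def answer(question: str) -> str:
    passages = retrieve(question, DOCUMENTS)
    return call_llm(build_prompt(question, passages))

question = "How tall is the Eiffel Tower?"
print(build_prompt(question, retrieve(question, DOCUMENTS)))
```

Grounding the prompt in retrieved sources doesn't eliminate hallucination, but it gives the model verified text to lean on and gives you something concrete to check its answer against.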
Hallucination isn't a bug being patched in the next version. It's a consequence of how generative models work. Understanding this changes how you use them.