AI hallucinations: why AI gets things wrong, and how to catch it before it costs you

In 2023, a New York lawyer used ChatGPT to help write a legal brief. The brief cited six prior cases that neither opposing counsel nor the judge could find. They didn't exist. The AI had made them up, complete with case names, citations, and quoted language. The lawyer was sanctioned.

That's a hallucination. Here's what hallucinations actually are, why they happen, and how to catch them before they reach a client.

What's actually happening when AI hallucinates

Recall from earlier in this series: AI predicts what should come next based on patterns. It doesn't know things. It predicts things.

When you ask AI a question it doesn't have a reliable answer for, it usually doesn't say "I don't know." It predicts what an answer should look like (same grammar, same structure, same authoritative tone) and outputs that. The output looks correct because the SHAPE of the answer is correct. The content is wrong because the AI never actually had the information.

This is the core failure mode of AI in 2026. It hasn't been solved. It has been reduced: modern models hallucinate less than 2023 models did. But they still do it, especially for niche, recent, or highly specific facts.

The five most common types

1. Made-up citations. Court cases that don't exist, papers that were never written, books with fake ISBNs. Common when AI is asked to back up a claim and doesn't have a source.

2. Confabulated specifics. Names, dates, numbers, addresses. The AI invents a CEO's name, a launch date, a revenue figure. Each one looks plausible. None are verified.

3. Wrong but consistent. AI commits to a wrong answer and elaborates on it. You ask "what year did X happen," it says "1998," and then writes three paragraphs assuming 1998 even though the real answer is 2003.

4. Fabricated quotes. AI quotes someone saying something they never said. The style matches their actual writing. The words are invented.

5. False capabilities. AI says "I can do X" and then can't. Common when you ask if it can connect to your CRM, send an email, or take some action it isn't actually wired up to take.

Why it happens

Three structural reasons:

1. The model was trained on the internet, which includes a lot of confidently wrong content. The AI learned to reproduce a confident tone, not to check whether the content behind it is true. It applies the same confidence to its own output regardless of whether it actually knows.

2. The model has no built-in concept of "I don't know." It always produces a likely next word, even when that word is wrong. There's no internal "wait, am I sure" check unless one is explicitly built in.

3. Some questions don't have one right answer. When asked something genuinely uncertain, AI picks the answer that sounds most plausible, but plausible isn't the same as true.
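
The second reason is easier to see in a toy sketch. The scores below are invented for illustration, and `pick_next_word` and `pick_with_abstain` are hypothetical helpers, not how any real model is implemented. The only point is that plain prediction always returns something, and an "I'm not sure" outcome exists only if someone adds it.

```python
import math

def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(scores.values())
    exps = {word: math.exp(s - peak) for word, s in scores.items()}
    total = sum(exps.values())
    return {word: e / total for word, e in exps.items()}

def pick_next_word(scores):
    """Plain prediction: always returns the highest-probability word."""
    probs = softmax(scores)
    return max(probs, key=probs.get)

def pick_with_abstain(scores, min_prob=0.6):
    """Same prediction, plus an explicit confidence check that had to be built in."""
    probs = softmax(scores)
    best = max(probs, key=probs.get)
    return best if probs[best] >= min_prob else "I'm not sure"

# Invented scores for "The company was founded in ...": close to a three-way tie.
scores = {"1998": 1.2, "2003": 1.1, "2011": 0.9}

print(pick_next_word(scores))     # a year comes out even though no option is a clear winner
print(pick_with_abstain(scores))  # the added check declines to answer
```

The 0.6 threshold is an arbitrary assumption. Real systems that abstain do so through training or surrounding tooling, not a single cutoff like this one.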

How to catch it (the four-layer defense)

Layer 1: Verify any fact that goes external.

If the AI's output is going to a client, getting published, or driving a decision, verify the specific claims: names, numbers, dates, citations, quotes. As a rule of thumb, this takes a minute or two per document and catches most hallucinations.
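
One way to make this step routine is to pull the checkable claims out of a draft so a person can tick them off. The sketch below is a rough illustration, assuming English text and simple regular expressions. The patterns and the `checklist` helper are hypothetical, will miss some claims, and do not verify anything themselves.

```python
import re

# Rough patterns for the kinds of claims Layer 1 says to verify.
# These are illustrative assumptions, not an exhaustive fact detector.
PATTERNS = {
    "quote": r'"[^"]+"',                                # text inside double quotes
    "number": r"\$?\b\d+(?:,\d{3})*(?:\.\d+)?%?",       # years, amounts, percentages
    "name": r"\b[A-Z][a-z]+(?: [A-Z][a-z]+)+\b",        # runs of capitalized words
}

def checklist(text):
    """Return (kind, matched text) pairs for a human to verify."""
    items = []
    for kind, pattern in PATTERNS.items():
        for match in re.finditer(pattern, text):
            items.append((kind, match.group()))
    return items

# Made-up draft text, used only to exercise the patterns.
draft = (
    'Acme was founded in 2003 by Jane Doe. Revenue grew 12% to $4.2 million. '
    'She said "we never miss a deadline."'
)

for kind, claim in checklist(draft):
    print(f"[ ] {kind}: {claim}")
```

Each line of output is one thing to look up in a primary source before the draft goes out.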

Layer 2: Use AI for tasks where verification is built in.

Some tasks have natural fact-checking. Drafting an email you'll review before sending. Summarizing a document you have in front of you. Categorizing inbound messages where the categories are auditable. These are low-risk because the verification step is part of the workflow.

Layer 3: Give AI source material when accuracy matters.

Instead of asking AI to recall a fact, paste in the source and ask AI to summarize. AI is much more reliable at summarizing text in front of it than at recalling text from training. "Summarize the attached contract" is safer than "what does standard NDA language look like."
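
In practice this can be as simple as wrapping the pasted source in a fixed prompt. The sketch below is one possible shape. The prompt wording, the `build_grounded_prompt` helper, and the `quote_appears_in_source` check are all assumptions for illustration, and no model is guaranteed to follow the instructions, which is why the second function checks the quote it gives back.

```python
def build_grounded_prompt(source_text, question):
    """Wrap pasted source so the model is asked to answer from it, not from memory."""
    return (
        "Answer using only the source between the markers below.\n"
        "If the source does not contain the answer, reply exactly: NOT IN SOURCE.\n"
        "Quote the sentence you relied on.\n\n"
        "=== SOURCE START ===\n"
        f"{source_text.strip()}\n"
        "=== SOURCE END ===\n\n"
        f"Question: {question.strip()}"
    )

def _normalize(text):
    """Collapse whitespace and case so line breaks don't cause false mismatches."""
    return " ".join(text.split()).lower()

def quote_appears_in_source(quote, source_text):
    """True if the quoted passage really occurs in the source."""
    return _normalize(quote) in _normalize(source_text)

# Made-up contract text, used only as an example source.
contract = (
    "Either party may terminate with 30 days written notice.\n"
    "Fees are due within 15 days."
)

prompt = build_grounded_prompt(contract, "What is the notice period?")
print(prompt)
print(quote_appears_in_source("terminate with 30 days written notice", contract))
print(quote_appears_in_source("terminate with 60 days written notice", contract))
```

The quote check is a plain substring match, so it catches invented wording but not a real sentence quoted out of context.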

Layer 4: Build skepticism into your workflow.

When you're using AI on anything customer-facing, design the workflow so a human reviews the output before it is sent. Most agent builds keep a "human in the loop" gate for exactly this reason. The agent does roughly 90% of the work, and the human does the remaining 10%: verification.
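
A minimal sketch of such a gate, assuming drafts are held in memory and "sending" is just moving a draft to a sent list. The `Draft` and `ReviewGate` names are hypothetical, and a real build would plug in its own queue and delivery step.

```python
from dataclasses import dataclass, field

@dataclass
class Draft:
    recipient: str
    body: str
    approved: bool = False  # nothing is approved until a human says so

@dataclass
class ReviewGate:
    """Holds AI-written drafts until a human approves them."""
    pending: list = field(default_factory=list)
    sent: list = field(default_factory=list)

    def submit(self, draft):
        """Called by the agent: queue a draft, never send it directly."""
        self.pending.append(draft)

    def approve(self, draft):
        """Called by the human reviewer after checking the draft."""
        draft.approved = True

    def send_approved(self):
        """Send only approved drafts; return how many went out in this call."""
        ready = [d for d in self.pending if d.approved]
        self.pending = [d for d in self.pending if not d.approved]
        self.sent.extend(ready)  # stand-in for real delivery
        return len(ready)

gate = ReviewGate()
reply = Draft("client@example.com", "Your order ships Tuesday.")
gate.submit(reply)

print(gate.send_approved())  # nothing has been reviewed yet, so nothing is sent
gate.approve(reply)
print(gate.send_approved())  # the reviewed draft goes out
```

The design choice that matters is that the agent can only call `submit`. Delivery is reachable only through the approval path.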

When NOT to worry about hallucinations

Three categories where the risk is low:

Drafting. AI drafts a first-pass version. You edit. You'd catch most errors in editing anyway, so hallucinations get filtered out at the edit step.

Brainstorming. "Give me 10 angles for this blog post." The output isn't supposed to be facts. It's idea-fuel. You pick what's good and ignore what's not.

Summarizing source material. AI is far more reliable at summarizing text you've given it than at recalling facts from training. The hallucination rate drops sharply when the source is provided, though not to zero, so still spot-check key figures against the original.

When TO worry

Three categories where you should treat AI output as suspect by default:

Anything with citations. Legal, academic, journalism. AI hallucinates citations more than any other content type. Verify every cited source individually.

Anything with specific names, numbers, or dates. "The CEO of X is Y" or "the company was founded in Z." Verify against a primary source.

Anything customer-facing without a review step. A direct AI-to-customer message with no human gate is hallucination roulette. Build the gate.

What this means for you

Hallucinations are real but manageable. They don't make AI useless. They mean AI's output is a draft, not a final product, on anything fact-sensitive. Build review steps for anything that matters. Skip review for anything that doesn't.

The next post covers how to spot AI snake oil: vendors selling tools that hallucinate badly while claiming to have "solved" the problem.

