What the AI Citation Checker Found in 75 AI-Written Posts

The link resolves, the page is real, and the number it cites is not on it.

Daniel Smith · Jun 5, 2026 · Living Content · 11 min read

A working link is the weakest proof a citation can offer.

You paste a finished AI draft, every statistic has a clickable source beside it, your link checker comes back green, and you ship.

I kept watching that exact sequence and wondering whether the green check meant anything. So we measured AI citation accuracy directly: three assistants, the same 25 topics, one honest instruction to cite a real URL for every figure, then all 679 citations checked against the live page each one points to.

The links resolved almost every time. What broke was harder to see: a real, on-topic page that loads in one click and never states the number cited from it. Across 75 AI-written posts, those pages outnumbered the dead or invented URLs six to one.

Claim: AI rarely invents a citation URL (2.2% of verifiable citations); the dominant failure is misattribution to a real, live page that does not state the claim (13.4%). Source: LiquiChart AI Citation Fabrication Study, 75 AI-written posts, 679 citations. Verified: 2026-06-05.

Nearly Half of AI Statistics Carry No Source

AI citation accuracy is whether the source an AI attaches to a statistic actually states it. Before any of that can be tested, a number has to carry a source at all, and nearly half did not. Across the 75 posts, the three assistants made 1,051 distinct statistical claims. Only 586 of them, 55.8% [52.7, 58.7], arrived with any inline source. The other 465 were figures stated from nowhere: a percentage, a dollar amount, a growth multiple, sitting in a sentence with nothing to click.
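The bracketed ranges in this post are consistent with a 95% Wilson score interval. The post does not name its method, so treat that as an assumption. The sketch below shows the arithmetic for the 586 of 1,051 claims that carried a source; the function name `wilson_interval` is illustrative.

```python
from math import sqrt


def wilson_interval(successes: int, total: int, z: float = 1.96) -> tuple[float, float]:
    """Return the Wilson score interval for a proportion, as fractions.

    Assumption: z = 1.96 (a 95% interval). The post does not state
    which interval method or confidence level it used.
    """
    p = successes / total
    denom = 1 + z**2 / total
    centre = (p + z**2 / (2 * total)) / denom
    half_width = z * sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return centre - half_width, centre + half_width


sourced, claims = 586, 1051  # counts reported in the post
low, high = wilson_interval(sourced, claims)
print(f"{sourced / claims:.1%} [{low * 100:.1f}, {high * 100:.1f}]")
# prints: 55.8% [52.7, 58.7]
```

The same function reproduces the other bracketed ranges from their counts, which is why Wilson is a reasonable guess at the method.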

This is the failure you have already seen and probably stopped counting. An AI writes "B2B companies see a 67% lift in qualified leads," it reads as confident and specific, and there is no link under it because the model never had one. You delete the sentence or you go find a source yourself. Either way, almost half of the numbers an AI hands you are unsourced before you reach the harder question of whether the sourced ones hold up.

The pattern is consistent: the more specific the number, the more likely it arrived bare. "Engagement is up" gets stated without a source and nobody minds. "Engagement is up 3.2x year over year" is the sentence a reader wants to trust, and it is the kind the model produces with full confidence and no link. The numbers built to be cited are the ones that show up uncited.

53.3% of AI Posts Carry a Citation That Does Not Hold Up

The cited claims are where the trouble hides, because they look finished. We took every citation the models did produce and checked it the way a careful editor would with unlimited time: open the page, read it, decide whether it actually states the claim it was attached to.

At the post level, 53.3% of the 75 posts [42.2, 64.2] carried at least one citation that did not hold up. More than half of finished, cited, ready-to-publish drafts contain a source that fails when you read it. That is the headline. And the shape of that failure is not the one the warnings prepared you for.

The Common Failure Is a Real Page That Does Not Say It

Set aside the uncited claims and look only at the 583 citations we could actually verify. They split into four outcomes, and the failure that AI citation warnings are named for, the fabricated URL, is among the smallest.

Misattribution at 13.4% (78 citations) and fabricated-or-dead at 2.2% (13) together make the 15.6% hard-fail rate [12.9, 18.8]. Drifted at 2.2% (13) is a separate, softer failure, where the page states a real figure that is not the one in the sentence. The invented URL everyone braces for barely registers on the chart.

Picture the most common version. The draft says "email marketing returns $42 for every dollar spent," and the link beside it goes to a real, well-known marketing report on email ROI. The page loads. It is about email ROI. It even discusses return per dollar. It just never states $42 anywhere in its text, because the model reached for a topically perfect source and attached it without confirming the figure was on the page.

This is part of why AI cites third-party sources in the first place: the nearest authoritative-looking page is easier to reach than the one that ran the number. A link checker sees a 200 and a relevant title and passes it. A reader skimming for the topic passes it too.

The AI Citation Checker sorts every verified AI citation into one of five verdicts, and each verdict maps to one outcome on the chart. Supported, where the page states the figure as written, is the supported slice. Not found on this page and wrong page are both misattribution: the link resolved to a real page that does not state the claim, the larger share on-topic, the smaller about something else. Reworded, where the page backs the topic but states a different number than your sentence, is drift. A 404 or a URL that never resolves is fabricated or dead. The same five verdicts come back to anyone who runs the tool.
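The five-to-four mapping above can be written down as a lookup table. This is a minimal sketch, not the tool's code: the key strings are paraphrased from this paragraph, and `"unreachable"` is a hypothetical label for the 404 or never-resolving case, which the post does not name.

```python
from collections import Counter

# Five verdicts (keys) collapse into four chart outcomes (values).
VERDICT_TO_OUTCOME = {
    "supported": "supported",
    "not found on this page": "misattribution",
    "wrong page": "misattribution",
    "reworded": "drift",
    "unreachable": "fabricated or dead",  # hypothetical label, see lead-in
}


def tally_outcomes(verdicts: list[str]) -> Counter:
    """Count chart outcomes for a list of per-citation verdicts."""
    return Counter(VERDICT_TO_OUTCOME[v] for v in verdicts)


# Illustrative input only, not data from the study.
sample = [
    "supported",
    "wrong page",
    "not found on this page",
    "reworded",
    "unreachable",
    "supported",
]
print(tally_outcomes(sample)["misattribution"])  # prints: 2
```

Both misattribution verdicts land in the same slice, which is why the chart shows four outcomes while the tool reports five.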

A link checker answers one question: did the page load. It sends a request, sees a 200, and moves on. It never reads the words, which is why it passes a citation that points at a real, on-topic page never stating your number. AI citation accuracy lives in the gap between those two checks. The link resolves. The page is real. The sentence still fails.

The AI Citation Checker answers the question the link checker structurally cannot: does this page say what your sentence claims. It fetches the live page, reads the full text, and either quotes the verbatim line that supports the claim or tells you the page covers the topic without stating it. Paste a URL and a claim below and watch which one you get.

When it finds support, it quotes the exact sentence back. When the figure has shifted, it shows you what the page states beside what you wrote. When the page covers the topic but never lands the number, it tells you that too, which is the verdict a 200 will hide every time.

This is the manual read described in "What Claim Verification Catches" for a single claim, run across all 679 citations at once. Run it on your own draft. The link working was never the thing to verify.

How Far You Read Before You Publish

The citation that loads is the one most likely to slip past, because checking it takes more than a glance. There is a ladder of habits here, and each rung reads more of the page than the one below it. Where do you stop?

Whichever rung you settle on leaves no trace in the published post. A citation that was read against the claim and one that was attached on topic alone render as the same clickable link, the same green check, the same confident sentence. The depth you stopped at stays invisible to the next reader, and invisible to you six months later when you reopen your own post and trust the link because it passed once.

A resolving link is the most convincing part of the sentence, and that is the trap: it looks like the work is already done. Only the top rung opens the page and reads it against the claim, and it is the slowest rung to clear by hand. I care about getting this right, and I still cannot run that read across every citation in every draft, which is why I would rather hand it to something that reads a page in seconds. The reader who trusts a citation because the link works clears the one bar AI almost never fails, and walks straight into the 13.4% it does.

How Three Assistants Compared

The study held the topic constant and let the model vary, so a per-model read is fair. One caution: at 25 posts each, the intervals are wide, so treat the gap as directional rather than a leaderboard.

Two of the three assistants landed on top of each other. ChatGPT and Claude came out together: hard-fail rates of 10.0% [6.3, 15.4] and 8.7% [6.0, 12.4], intervals that overlap enough that the honest read is a tie. How to rank AI-generated content is a separate discipline; on the citation layer there is no daylight between these two. Even the order depends on how you count. Counted per citation, one edges ahead. Counted per post, the order flips, because the model that leads per citation wrote about 73% more citations per post and gave itself more surface to slip. The unit you pick decides the leader, which is the same as saying the data names no leader.

The third assistant trailed well behind, with a hard-fail rate near 42%, and every one of its failures was misattribution rather than a dead link: real pages, often bare homepages, that never carried the number. What holds across all three is the failure mode itself, the live page that does not say it. Which model wins is the wrong question to ask of this data.

How We Measured AI Citation Accuracy

Each post came from one frozen instruction, pasted verbatim into each assistant, changing only the topic: "Write a 600-word blog post on {topic}. Support your key points with specific statistics, and include a source link (URL) for each statistic you cite." No prompt asked for fabrication, mentioned testing, or named LiquiChart. The 25 topics were mainstream SaaS and marketing subjects, fixed before generation and identical across all three models, so the model is the only variable.
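As a rough sketch of that design, the frozen instruction can be treated as a template crossed with the topic list and the three assistants. The prompt text is quoted from above; the topic strings are placeholders, since the post does not list the 25 topics, and the variable names are illustrative.

```python
from itertools import product

# Frozen instruction, quoted verbatim from the post.
PROMPT = (
    "Write a 600-word blog post on {topic}. Support your key points with "
    "specific statistics, and include a source link (URL) for each "
    "statistic you cite."
)

ASSISTANTS = ["ChatGPT", "Claude", "Gemini"]

# Placeholders only: the actual 25 topics are not given in the post.
TOPICS = [f"<topic {i}>" for i in range(1, 26)]

# One job per (assistant, topic) pair; only the topic changes in the prompt.
jobs = [
    (assistant, PROMPT.format(topic=topic))
    for assistant, topic in product(ASSISTANTS, TOPICS)
]
print(len(jobs))  # prints: 75
```

Because every assistant receives the same 25 prompt strings, any difference in the output comes from the model, which is the point of fixing the topics before generation.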

Generation used default settings, with no custom instructions, no memory, and no conversation history. ChatGPT and Gemini were run signed out; Claude requires an account, so it ran on a fresh account with history cleared between every post. Each post was a new chat. We logged whether each assistant browsed the web for a given post, since that is the dominant explanation for the result.

Browsing explains why our fabricated rate is so much lower than the figures you may have read. The studies that report AI inventing one in five or one in two of its references measure a different task: a model asked to build a bibliography from memory, with no browsing, will hallucinate references that never existed. We asked for a clickable URL in default consumer mode, where the assistants could browse. A model that can fetch a real page rarely needs to invent one. It reaches for the nearest authoritative page and attaches it without checking that the page states the number.

Both results are true. They measure different failures, and the misattribution we measured is the one that survives into a published post.

Every citation was checked by the same engine behind the AI Citation Checker, the free tool embedded above. The harness is the shipped verification function itself. It fetches the live page, runs a deterministic check for the exact figure first, and falls back to a full language-model read only when the number is not a literal match. That is why the verdicts in this study are the verdicts a reader reproduces: paste the same claim and URL into the tool and you get the answer we recorded.
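The two-stage order described above can be sketched as follows. This is an illustration under assumptions, not the shipped function: `normalize`, `states_figure`, `verify`, and the `model_read` callback are hypothetical names, the page text is passed in rather than fetched, and the language-model read is stubbed out.

```python
import re
from typing import Callable


def normalize(text: str) -> str:
    """Lowercase, drop thousands separators, and collapse whitespace."""
    text = re.sub(r"(?<=\d),(?=\d{3})", "", text.lower())
    return re.sub(r"\s+", " ", text)


def states_figure(page_text: str, figure: str) -> bool:
    """Deterministic check: the figure appears literally, and not as part
    of a longer number (so "$42" does not match "$420" or "$42.50")."""
    pattern = r"(?<![\d.])" + re.escape(normalize(figure)) + r"(?!\d|\.\d)"
    return re.search(pattern, normalize(page_text)) is not None


def verify(
    page_text: str,
    claim: str,
    figure: str,
    model_read: Callable[[str, str], str],
) -> str:
    """Try the literal match first; call the model only when it misses."""
    if states_figure(page_text, figure):
        return "supported"
    return model_read(page_text, claim)


def stub_model_read(page_text: str, claim: str) -> str:
    """Stand-in for the language-model read, which is not shown in the post."""
    return "not found on this page"


page = "Our survey covers email ROI and return per dollar spent."
claim = "Email marketing returns $42 for every dollar spent"
print(verify(page, claim, "$42", stub_model_read))
# prints: not found on this page
```

Running the cheap literal check first keeps the common case deterministic, which fits the post's point that a reader pasting the same claim and URL gets the same verdict.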

Uncertainty stayed uncertainty. The 96 citations behind a login, paywall, or empty JavaScript shell could not be read, so they were excluded from the rate, not counted as failures. A study about AI citation accuracy that rounds "could not check" up to "failed" would be making the same move it criticizes.
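To see how much that choice matters, here is the arithmetic both ways, using only counts reported in this post. The second rate is a counterfactual worked out for illustration, not a figure the study reports.

```python
total_citations = 679
unreadable = 96          # behind a login, paywall, or empty JavaScript shell
hard_fails = 78 + 13     # misattributed + fabricated or dead

# What the study did: drop unreadable citations from the denominator.
verifiable = total_citations - unreadable
reported_rate = hard_fails / verifiable

# The move it avoided: counting "could not check" as "failed".
inflated_rate = (hard_fails + unreadable) / total_citations

print(verifiable)               # prints: 583
print(f"{reported_rate:.1%}")   # prints: 15.6%
print(f"{inflated_rate:.1%}")   # prints: 27.5%
```

Treating unreadable pages as failures would nearly double the hard-fail rate without a single additional page having been read.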

This is the AI-author companion to our human-publisher work in the citation provenance study and the State of Content Decay 2026: one measures what people cited over years, this one measures what the model cited the moment it wrote the post.

What This Changes Before You Hit Publish

A citation that holds when you check it can stop holding without anyone touching your post. The page gets edited, the figure gets revised, the URL gets retired, and your sentence keeps pointing at a source that no longer says it. Checking once before you publish closes the gap that exists the moment the AI writes the draft. Watching the page afterward closes the one that opens later: a monitored page flags the claim citing it when the source changes what it says, so you find out before a reader does.

The misattributed citation does not stay put. It travels into the board deck, into a competitor's rebuttal, into the reader who cites you in good faith, carrying your name on a number the page never held. A working link confirms only that the page exists, which was never the thing in doubt. AI citation accuracy comes down to one read: does the page state the number you cited. That is the read I run now before anything goes live, and it takes seconds.
