An agent answers a question about your market. It reaches your post, lifts a benchmark figure off the page, and writes it into a report. Then into the next report, and the next, out to readers who will never open your page. AI agents and verifiable sources have become the same problem: the agent cannot tell whether you measured that number or carried it over from a survey you read once. The figure you produced and the figure you relayed look identical to it, so it acts on both the same way, at machine speed, with no one in the path to catch the one that was already wrong. Your page already gets picked. What matters now is whether the number on it holds up for something that will never pause to ask.
An Agent Acts on Your Number
An AI agent does not vet a source the way a careful reader does. It treats every figure on a page as load-bearing, with no way to tell a number you measured from one you relayed, or one that was true at publish from one that has since moved. It acts on the figure regardless.
For a decade your number had one consumer: a reader who could doubt it. A person lands on the post, reads a sentence, decides whether to believe it. Worst case, one reader walks away with a wrong number in their head. An agent changes the unit. It takes the number and acts on it, then repeats that across a hundred answers before anyone notices.
The agents already do this badly, and someone has measured how badly. When the Tow Center put eight AI search tools through the same citation test, the tools answered more than 60% of queries incorrectly, and two of them cited URLs that led to error pages more than half the time. That is the consumer your numbers now answer to: a system confident enough to act, with a documented inability to tell a live source from a dead one. Your page is one link in the chain it walks, and it reads as one more restating hop the moment it relays a figure instead of originating it.
The Demand Side of the Citation Question
Almost everything written about content and AI sits on one side of a single question: how do you get cited. The whole discipline of answer engine optimization exists to make an answer engine pick your page and name you as the source. That is the supply side, and it is worth working on.
There is a second side, and almost nobody works on it. What the engine actually consumes, and whether the number it just took is safe to act on, is the demand side. An agent cannot tell whether you ran the survey or read it once in someone else's deck. That gap, not which page gets the credit, is the half of the citation question nobody has examined.
Picture the two sides on the same number. On the supply side you win: the engine names you, the traffic ticks up, the dashboard looks healthy. On the demand side that same engine has already taken the figure and moved on, and whether the number was yours to give never entered the calculation. The credit lands on a dashboard you check. The consequence plays out in answers that never link back to you. A page can take the citation win and lose on the same number, and never see the second half happen.
The two come apart on the words cite and act. Getting cited wins attention. Whether an agent can safely act on what it took is a question of consequence, and consequence scales at machine speed. For years I treated the citation as the finish line; the demand side is the half I never thought to check. So the question worth sitting with is not only whether you can trust the AI citations you read, but whether the citations on your own page would survive an agent that walks them.
Why Verifiable Sources for AI Agents Are Scarce
The gap between AI agents and verifiable sources is measurable, so we measured it. Walk the citation behind a borrowed statistic, follow it to the page it points at, and check whether that page actually states the figure. Do that across a real corpus and most of the destinations turn out not to be origins.
Across 45 SaaS blogs we traced 4,907 borrowed claims, every figure a publisher had carried from somewhere else, and walked each citation to its end. About seven in ten led nowhere a machine could check.
These are traced blog citations, not AI citations: human-authored pages citing other human-authored pages, the substrate the agents now read on top of. The pattern is the same one the State of Content Decay documents at book length, and we traced thousands of citations to get there. A number that looks sourced, because a name or a link sits beside it, is usually not sourced in any way a machine can confirm.
Walk enough of these and the dead ends sort into a few honest kinds. Some point to an aggregator that re-cited the figure from a source it never names, so the trail stops at a page that is itself only relaying. Some point to a primary source that has since moved the number or taken the page down, so the figure floats free of anything that still states it. Some point to a page that never carried the claim at all. Only a thin slice resolves the way a reader assumes they all do: a page that measured the thing and still says so. That slice is the scarce one, and it is the one worth being inside.
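The kinds of ending described above can be written down as a small decision rule. What follows is a minimal sketch, not the tooling behind the trace: `Destination`, its fields, and `classify` are hypothetical names, and the sketch assumes a crawler or a person has already established each fact about the destination page.

```python
from dataclasses import dataclass
from enum import Enum


class Terminus(Enum):
    PRIMARY = "measured the figure and still states it"
    AGGREGATOR = "relays the figure from a source it never names"
    MOVED_OR_GONE = "source revised the figure or removed the page"
    NEVER_STATED = "page never carried the claim"


@dataclass
class Destination:
    # All four fields are assumptions for illustration.
    reachable: bool       # did the cited URL load at all?
    states_figure: bool   # does the page text contain the cited figure today?
    measured_here: bool   # does the page present the figure as its own measurement?
    stated_before: bool   # is there evidence (e.g. an archived copy) it once stated it?


def classify(dest: Destination) -> Terminus:
    """Sort one citation destination into one of the four kinds of ending."""
    if not dest.reachable:
        return Terminus.MOVED_OR_GONE
    if dest.states_figure:
        return Terminus.PRIMARY if dest.measured_here else Terminus.AGGREGATOR
    return Terminus.MOVED_OR_GONE if dest.stated_before else Terminus.NEVER_STATED
```

Only the `PRIMARY` branch requires every check to pass, which is one way to see why that slice is thin.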
Verifiable sources for AI are scarce for a structural reason. The open web was built for humans who rarely click through, so a citation only had to look credible to someone skimming past it. Nobody graded whether the link resolved, because almost nobody followed it. That slack stayed invisible as long as the only reader was a person. An agent removes it by following every link, every time.
Ask what an agent can actually tell about one of your own numbers. The answer is less than you would like.
An agent reads the number exactly. What it cannot read is whether the number ends with you or passes through you, and that difference is the one that decides whether acting on it is safe. When the number only passes through you, there is nothing behind your page for the agent to check, and it acts anyway.
The Misattribution Problem Agents Inherit
You would expect the broken citations to be inventions, links to pages that never existed, the kind of fabrication people now associate with AI. That is not what the trace found.
When we verified each link instead of trusting the one that sat nearest the number, about 35% of the citations that looked linked pointed somewhere other than the claim beside them: a navigation link, a footer, a different paragraph's reference. I walked a lot of these by hand, and the example I expected least kept turning up: a store locator parked next to a statistic it had nothing to do with. The link was real. It just was not the source. Of all the links the trace dropped, only two in the entire corpus were invented.
The rot predates AI entirely. Decorative citation is something the human-authored web has carried all along, and it reads as sourced to anything that judges a citation by what sits nearest it. A person skims past the mismatch. An agent treats the nearby link as provenance, follows it, lands on a page that never made the claim, and acts as if it did.
Multiply that across a substrate where a third of the linked citations point somewhere other than their claim. An agent walks thousands of them and inherits a misattribution rate the original authors never had to answer for, because no one ever followed the links to find out. The error was always there. The agent is the first reader thorough enough to act on it.
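The difference between trusting the nearest link and verifying each one fits in a few lines. This is a rough sketch under stated assumptions: the pages have already been fetched into a dictionary, the figure is matched as literal text, and `states_figure` and `audit_citation` are hypothetical helpers, not the method used in the trace. The URLs and the 42% figure in the usage example are made up.

```python
import re


def states_figure(page_text: str, figure: str) -> bool:
    """Return True if page_text contains figure as a standalone number.

    The lookarounds stop "35%" from matching inside "135%" and "35" from
    matching inside "35.2" or "350".
    """
    pattern = r"(?<![\d.,])" + re.escape(figure) + r"(?!\d|[.,]\d)"
    return re.search(pattern, page_text) is not None


def audit_citation(figure: str, links: list[str], pages: dict[str, str]) -> dict:
    """Compare the nearest link with the links whose target states the figure.

    links is ordered nearest-first; pages maps URL -> already-fetched text.
    """
    supporting = [url for url in links if states_figure(pages.get(url, ""), figure)]
    nearest = links[0] if links else None
    return {
        "nearest": nearest,
        "nearest_supports_claim": nearest in supporting,
        "supporting": supporting,
    }


# Made-up example: a store locator sits closest to the statistic.
pages = {
    "https://example.com/store-locator": "Find a store near you.",
    "https://example.com/survey": "In our survey, 42% of teams said yes.",
}
result = audit_citation(
    "42%",
    ["https://example.com/store-locator", "https://example.com/survey"],
    pages,
)
```

Here `result["nearest_supports_claim"]` is `False`: the nearest link is real, and it is not the source. A reader judging by proximity would never find that out.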
Why Agent-Optimization Tactics Miss the Number
The instinct now is to reach for tactics: structured data, cleaner formatting, an answer block in the first 50 words, the same moves that worked when the goal was getting picked by search. None of them touch the number.
The information-agent era did not create a fact-checking problem. It exposed one that publishing had been hiding behind blue links nobody clicked. Humans rarely walked the citation chain, so a borrowed figure with a plausible name beside it was as good as a sourced one, and the borrowed-stat economy ran on that gap for years. An agent walks every link. The bill comes due.
Formatting changes how a page presents a number. It cannot change whether the number is measured, current, or borrowed, and that is the only thing that decides what happens when an agent acts on it. Content provenance for AI agents lives in where the figure actually came from. Mark up a relayed number perfectly and it stays a relayed number. The one move that changes the outcome is to own the number, to be the page where the chain stops instead of the page it passes through.
Stronger models do not fix this by trying harder. Even the most capable deep-research agents hold link validity above 94% while their factual accuracy runs between 39% and 77%: the link resolves, but the figure behind it has moved on. More retrieval cannot fix a substrate that does not resolve.
What Verifiable Sources Mean to an Agent
So what does verifiable mean to an agent? A number whose origin it can resolve. That resolution is how it decides what to trust at all.
That is what a claim is: a verifiable assertion tied to a source, the single figure pulled out of your prose and bound to where it came from. Treat the numbers that matter as claims in a verifiable claim layer, and each one stops being a string an agent reads off the page. It becomes an assertion the agent can place. Citation Provenance does the placing. It walks the source behind a number, hop by hop, until it reaches a terminus, and records what that terminus is: a primary source you can stand behind, an aggregator that dead-ends, or a link that no longer states the figure at all. You can only guess at another site's chain from the outside. The chain behind your own numbers is one you can read to the end.
Verifiable sources for AI come down to that resolution running clean. A number an agent can follow to a page that still states it. That is the entire difference between a figure it can trust and one it only repeats.
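Walking a chain hop by hop to a terminus can be sketched as a short loop. Treat this as an illustration of the idea, not as how Citation Provenance is implemented: `Page`, `walk_chain`, and the `web` dictionary standing in for fetched pages are all hypothetical, and the sketch assumes each page credits at most one source for the figure.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Page:
    # Hypothetical record of what one page says about the figure.
    states_figure: bool           # does this page currently state the figure?
    originates: bool = False      # does it present the figure as its own measurement?
    cites: Optional[str] = None   # URL it credits for the figure, if any


def walk_chain(start: str, web: dict[str, Page], max_hops: int = 10) -> tuple[str, list[str]]:
    """Follow citations from start until a terminus; return (terminus, path)."""
    path: list[str] = []
    url: Optional[str] = start
    # Stop on a revisited URL (a citation loop) or after max_hops pages.
    while url is not None and url not in path and len(path) < max_hops:
        path.append(url)
        page = web.get(url)
        if page is None or not page.states_figure:
            return "no longer states the figure", path
        if page.originates:
            return "primary source", path
        if page.cites is None:
            return "aggregator dead end", path
        url = page.cites
    return "unresolved", path


# Made-up chain: a blog cites a roundup, which cites the survey that measured it.
web = {
    "blog": Page(states_figure=True, cites="roundup"),
    "roundup": Page(states_figure=True, cites="survey"),
    "survey": Page(states_figure=True, originates=True),
}
terminus, path = walk_chain("blog", web)
```

With this input the walk ends at `"survey"` as a primary source. Remove the `"survey"` entry and the same walk reports that the chain no longer states the figure, which is the case an agent cannot distinguish by reading the blog alone.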
A terminus is only true the day you check it. Sources move, get revised, get taken down, and a number that resolved cleanly at publish can stop resolving months later. This is the work of content maintenance infrastructure: Monitored Pages watch the external source behind a claim, and when it moves, the claim that cited it is flagged before an agent acts on the stale version.
You do not need any of that to see the shape of the problem on one of your own numbers. When I go through one of our posts, I look for the number I would least want to be wrong, then open the page it came from. Take one statistic you publish and the page behind it, and check whether that page still says what you cited it for.
A Stale Terminus Propagates Your Mistake
There is a number on your site right now that you are proudest of, the one you measured yourself and have cited in talks. It is also the most dangerous number you publish. It is the one an agent trusts most, repeats most, and carries furthest, because it resolves cleanly to you.
The day the reality underneath it moves and the page does not, that clean resolution becomes a liability. The agent has no way to know the figure went stale. It keeps acting on the version it took, fanning your old number across answers you will never see, with your name attached to whatever it now gets wrong. Keeping a number true after publish stops being housekeeping you get to a quarter late. It becomes the line between being a source a machine can trust and being the one that taught it something false at scale.
Stale arrives in three shapes, and an agent sees none of them. The source revises the number upward and your page keeps the old one. The source takes its page down and your citation points at nothing. Or the figure changes underneath a URL that still resolves, so even a link check passes while the number behind it has moved.
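A small sketch makes the last point concrete: a link check and a content check can disagree. Everything here is hypothetical, including `SourceState` and `check_claim`, and the sketch assumes the figure the source states today has already been extracted. It also folds the first and third shapes into one verdict, on the assumption that a revised number and a silently changed one look the same to a content check.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SourceState:
    # Hypothetical snapshot of the source behind one cited figure.
    resolves: bool                        # what a plain link check observes
    figure_now: Optional[float] = None    # figure the source states today, if any


def check_claim(cited: float, source: SourceState) -> tuple[bool, str]:
    """Return (link_check_passes, verdict) for one cited figure."""
    if not source.resolves:
        return False, "source removed"
    if source.figure_now is None:
        return True, "figure no longer stated"
    if source.figure_now != cited:
        return True, "figure changed"
    return True, "current"


# Made-up numbers: the page cites 60.0, the source now states 64.0.
link_ok, verdict = check_claim(60.0, SourceState(resolves=True, figure_now=64.0))
```

In this example `link_ok` is `True` while `verdict` is `"figure changed"`. The link check alone would have reported nothing wrong.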
In the information-agent era, the number you stopped checking is the one already working against you, at a speed and a reach you cannot see.