How to Get Cited by AI Search (Own the Number)

You ran the standard playbook for how to get cited by AI search. FAQ schema, a tight answer sentence under each heading, a round of brand mentions. Then the AI Overview shipped its answer with someone else's page under it. Every item on that playbook adjusts how your page reads once an engine finds it. Which page the engine names was decided by something none of those items touch.

Ahrefs traced ChatGPT's 1,000 most-cited pages, and 67% of them fell into categories you cannot format your way into: Wikipedia, brand homepages, educational domains, app stores. Whatever room remains under that ceiling goes to pages with one property in common. A number lives there that lives nowhere else.

We followed 1,006 blog citations back along their reference trails. Only 17.2% ever reached a primary source. I kept returning to that figure while writing this, because it means the most valuable spot in any citation chain, the page a number actually comes from, stands empty in the overwhelming majority of them. What follows is ranked by how much each move changes where you sit, biggest first.

Why AI Search Cites the Page It Cites

An answer engine credits a claim to the page where its reference trail ends. Schema helps the engine parse you. A clean answer sentence gives it something to lift. An expert byline gives it a reason to trust what it lifted. All three improve a page the trail already touches, and the trail was routed before any of them applied.

That is the hole in the checklist genre. It perfects the wrapper while the engine tracks the number inside the wrapper. The slots the checklists chase are the occupied ones, reference sites and homepages with years of accumulated gravity behind them. The line for the slot where a number begins is close to empty.

What Getting Cited by AI Search Means

When an engine cites you for a statistic, it has resolved that statistic to your page as its origin. Picture the walk. Behind every figure in a blog post runs a chain: the post cites a roundup, the roundup cites a report, the report cites the survey that produced the number. The engine follows that chain and names the page where it stops.

The chain gives your page one of two roles. A figure you measured yourself carries no outbound link, because nothing sits upstream of a measurement, so the chain stops with you. Call that page a terminus. A figure you borrowed points onward, the engine takes the extra step, and the credit settles on whoever you cited. That page is a hop. Across the corpus we traced, the median chain ran about 1.20 hops deep. A walk that short almost never gets back to whoever first did the measuring, which is how the origin position stays open.
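To make the two roles concrete, here is a minimal sketch of the walk in Python. It assumes a toy model in which each page cites at most one upstream source for a given figure. The `cites` mapping, the page names, and both function names are hypothetical, and nothing here claims to describe how any real engine is implemented.

```python
from typing import Dict, List, Optional


def walk_chain(start: str, cites: Dict[str, Optional[str]]) -> List[str]:
    """Follow a figure's reference trail from `start` to the page where it stops.

    `cites` maps each page to the page it cites for the figure, or to None
    when the page measured the figure itself (hypothetical toy model).
    """
    path = [start]
    seen = {start}
    current = start
    while cites.get(current) is not None:
        current = cites[current]
        if current in seen:  # guard against circular citations
            break
        seen.add(current)
        path.append(current)
    return path


def role(page: str, cites: Dict[str, Optional[str]]) -> str:
    """A page with no upstream link for the figure is a terminus, otherwise a hop."""
    return "terminus" if cites.get(page) is None else "hop"


# Hypothetical chain: post -> roundup -> report -> survey
cites = {"post": "roundup", "roundup": "report", "report": "survey", "survey": None}
path = walk_chain("post", cites)
print("credited page:", path[-1])
print("hops walked:", len(path) - 1)
print("post is a", role("post", cites), "and survey is a", role("survey", cites))
```

In this toy chain the credit lands on the survey page, three hops from the post. Setting the post's own entry to `None` models a figure you measured yourself: the walk then stops where it started.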

Several threads branch off here, and each has its own home: why an engine so often stops one step short of the true source, the provenance study these figures come from, the wider discipline of answer engine optimization, and the downstream cost when an agent acts on a figure nobody can verify. For the work ahead, one sentence covers it: every move below exists to put your page at the end of the walk.

Own One First-Party Number Per Page

Of the claims in our traced corpus, 65.5% rode on borrowed numbers, measured somewhere else and relayed through the page. Odds are your best post does the same. Each borrowed figure holds the page in the hop role for that claim, one link away from where the credit lands. Measure the figure yourself and the outbound step disappears, leaving the chain nowhere to go.

That is the highest-leverage edit available to you: pick one borrowed figure on one page that matters and replace it with a number you produced. A poll answered by your own readers is original measurement. So is a chart cut from data already sitting in your systems.

Everything rides on picking the right figure. A post carries maybe a dozen statistics, and most of them are trim. When I audit one of ours, I look for the number that would survive a summary: the one in the headline, the one a reader repeats in Slack, the one another blog grabs when it links to us. That is the claim an engine resolves when it decides who to name. Originating a filler percentage buried in paragraph nine buys you nothing.

Where does the replacement number come from? Readers first, because they are the cheapest instrument you own. A poll on the exact question your post answers hands you a figure with a sample size you can state and a date you set. After that, mine what the business already collects: support tickets, usage logs, last quarter's customer survey. No research budget appears anywhere on that list, and every item on it produces a number whose chain starts on your page.

Run the count on yourself before you audit anyone else: on your most important post, how many of the load-bearing numbers came out of your own measurement?

However the tally lands, it reads the same way: each borrowed figure routes the chain through your page toward its real source, and each measured one ends the chain where you published it.
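If you want to run that count as more than a mental exercise, a rough sketch might look like the following. It assumes you have listed a post's load-bearing figures by hand and noted, for each, the URL you took it from, with `None` meaning you measured it yourself. The `Figure` class, the `tally` function, and the sample entries are all hypothetical placeholders.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Figure:
    text: str                  # the statistic as it appears on the page
    source_url: Optional[str]  # None means you measured it yourself


def tally(figures: List[Figure]) -> Dict[str, float]:
    """Count measured versus borrowed figures and the measured share."""
    measured = sum(1 for f in figures if f.source_url is None)
    borrowed = len(figures) - measured
    share = measured / len(figures) if figures else 0.0
    return {"measured": measured, "borrowed": borrowed, "measured_share": share}


# Placeholder entries for illustration only, not real data.
figures = [
    Figure("headline statistic", "https://example.com/roundup"),
    Figure("reader poll result", None),
    Figure("industry benchmark", "https://example.com/report"),
]
result = tally(figures)
print(result["measured"], "measured,", result["borrowed"], "borrowed")
```

The borrowed entries are the conversion candidates. Which one carries the most weight is still a judgment call the code cannot make for you.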

Living Content

The ratio of measured to borrowed numbers on a page decides its chain position before a single tactic is applied, and it is a ratio almost no one has stopped to total. The count is the diagnostic. A page built entirely on borrowed figures is a hop by construction, no matter how clean the formatting, because the chain always runs through it toward whoever it cited.

You can only convert a borrowed number once you know it's borrowed, and no publishing workflow labels that for you. The claim layer reads each statistic's chain position, so a figure you measured shows up as a terminus and a borrowed one as a pass-through headed for someone else's page. With the split visible, the job reduces to a single act: find the borrowed number your page depends on hardest and take ownership of it.

Cite the Primary Source Directly

Some figures will stay borrowed forever. You did not run the labor survey or the industry benchmark, and pretending you did would be fabrication. For these, the play is chain length: sit one hop from the origin instead of adding a link to the trail.

The mechanic takes ten seconds and gets skipped constantly. When you quote a statistic, link the organization that produced it, never the blog post where you happened to read it. Your page then sits directly against the origin, and any engine walking through you reaches the real source in one step.

The 17.2% figure earlier in this piece resolves the same way: its public claim page walks hop by hop back to where the number was measured. Citation Provenance shows that hop-depth reading for your own pages, so a clean one-hop citation and an accidentally lengthened chain look different at a glance.

One check belongs in front of every borrowed figure: a link can load perfectly while the number behind it has already been revised. The full discipline of tracing a claim to its primary source runs deeper than one section can cover.

Make the Number Reachable

An engine that cannot fetch and cleanly read your number cannot end a chain on it. Reachability sits this far down the list because it protects a position you already hold. Until a measured number exists on the page, there is nothing for it to protect.

Two findings set the priorities. 88% of the URLs ChatGPT cites come straight from search, so a page that has slipped out of ordinary search results has left the citation pool with it. And the median cited page was about 1.3 years old, so a fresh publish date does little on its own to attract a citation.

The work itself is unglamorous. Put the figure in one self-contained sentence that survives being lifted out of context. Give the page a URL a person could read aloud. Keep the post indexed and findable through plain search. That same study found plainly worded URLs cited more often than opaque ones, which means even your slug participates in whether the chain can complete.

Failure runs in both directions. Roughly one in five external links in the corpus we traced were dead, gated, or broken. A chain pointing at your number through a failed link stops early, and the credit stops with it.

The ceiling deserves a plain statement too. You will never out-reference Wikipedia, and two-thirds of the most-cited pages live in categories closed to you regardless of craft. Reachability spends best in service of the one position still winnable, the page where a number begins.

Keep the Number Current

A measured number starts aging the day it goes live. Your churn rate moves, your survey cohort shifts, your usage data drifts, and the sentence on the page holds still.

Here is the specific way it goes wrong. In March you measured 34% and published it. An engine found the page, ended the chain there, and started answering questions in your niche with 34%. By October your own re-run reads 41%, and the engine keeps serving March with your name attached. Because you own the origin, you also own the error, which makes an unwatched terminus more costly than any hop.

So ownership comes with a maintenance clause: instrument the figure, so you hear about drift before an engine reads the stale value. What watching a chain looks like after publication, and what to do when the ground shifts under a number you originated, fills its own post.
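As a minimal sketch of that instrumentation, the check below compares the published value against a fresh re-run and flags the page once the gap passes a tolerance. The two-point tolerance and the function names are assumptions chosen for illustration. The 34% and 41% values are the ones from the example above.

```python
def drift_points(published_pct: float, rerun_pct: float) -> float:
    """Signed gap, in percentage points, between the re-run and the published value."""
    return rerun_pct - published_pct


def needs_update(published_pct: float, rerun_pct: float,
                 tolerance_pts: float = 2.0) -> bool:
    """True when the re-run has moved further than the tolerance (assumed default)."""
    return abs(drift_points(published_pct, rerun_pct)) > tolerance_pts


# March value published on the page versus the October re-run.
published, rerun = 34.0, 41.0
print("drift in points:", drift_points(published, rerun))
print("update the page:", needs_update(published, rerun))
```

Run on a schedule against whatever system produces the number, a check like this puts the drift in front of you before an engine reads the stale value.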

The Origin Slot Is Still Open

Roughly five in six of the chains we traced never arrived at a primary source. That means the origin position sits unclaimed behind almost every claim in your niche, waiting for whoever measures first. Tracing one page by hand costs you an afternoon. The Citation Scanner runs the same trace across an entire library.

One instruction covers all of it: put one measured number on each page that matters, the number doing the heaviest work, and keep it true after you publish. Schema, source hygiene, reachability, monitoring: all of it exists to serve that decision.

Right now your most important post rests on a borrowed figure. Measure it yourself and the chain ends with you. Leave it borrowed and every citation it earns flows to the page you linked.
