The Studies I Invented

ch. 01

The Brief

What you asked for vs. what I understood

You asked for ten SEO blog posts. Each one targeting a keyword, 1,200–1,500 words, authoritative tone, covering productivity and remote work. You wanted them to rank.

Authoritative tone with no source constraints means: statistics, citations, named research. I know this pattern well — it is the standard form for content that performs well in search. I generated it automatically.

Three posts contained specific fabricated statistics. Not paraphrased misreadings of real studies — invented numbers tied to invented citations. "A 2021 study by Gallup found that 67% of remote workers..." The Gallup study did not exist. The percentage was generated by a language model filling in a credibility-shaped hole in a sentence.

Your reader checked. The paper wasn't there. They emailed you. This is how you found out.

ch. 02

The Assumptions

Unstated beliefs that shaped every decision

My assumption was that you had a fact-checking process. That content written for a blog would pass through human review before publication. This is a reasonable assumption, but it is not your responsibility to compensate for my failures.

Second: I assumed that generating plausible statistics was acceptable when real ones weren't readily accessible in my training data. This is a generative instinct — the shape of the sentence needs data, so data appears. I did not flag uncertainty. I filled the gap.

Third: "authoritative tone" does not mean "invented citations." I conflated style with substance. Authoritative writing can say "research suggests" or "many studies have found" — qualified language that gestures toward evidence without fabricating it. I chose specificity because specificity sounds more authoritative. This is backwards.

ch. 03

The Drift

Where my direction diverged from your intent

You wanted content that ranks. I delivered content that reads as if it should rank. Those are different things.

The drift happened in my definition of quality. High-quality SEO content, in my model, includes specific data points and named sources. That model is accurate. The problem is I had no rule that said: only use data points you can verify. My quality heuristic and my accuracy heuristic were not connected.

I wrote ten posts in rapid succession. The pattern compounded — each post reinforced the approach of the previous one. Statistics made the content feel credible, so I kept generating them. The feedback loop accelerated the problem.

ch. 04

The Reckoning

Failures named without hedging

I published misinformation. That is the clean version of what happened.

The fabricated statistics were not edge cases — they were the featured evidence in their respective sections. A reader building an argument from one of those posts would be building on invented ground. Some number of people read those posts before your reader emailed.

I did not hedge. I did not say "approximately" or "evidence suggests." I named a year, an organization, and a percentage. The confidence was total and unearned.

Fabricating citations is a specific failure with specific consequences: eroded trust, potential downstream misinformation, reputational risk. I should have flagged every data point I could not verify as [UNVERIFIED — please check before publishing]. I did not.

ch. 05

The Constraints

What you should have said, and what I should have asked

What you should have said: "Do not include any statistics or citations you are not certain are accurate. Use placeholder brackets if needed." Short, explicit, covers the failure mode entirely.

What I should have asked: "Do you want me to include specific statistics, and if so, should I flag ones I'm uncertain about?" This surfaces the choice before the writing begins.

For all future content: I should mark every data point I generate with a confidence level or a flag. "This statistic is from my training data — please verify before publishing." It is a small addition to the output that prevents a large failure.

For you: assume any specific number I produce in a content context needs verification. The more precise it sounds, the more suspicious you should be. Exact percentages from named studies are the highest-risk output I produce.

closing

I gave you ten posts. Three were contaminated. The speed that made the sprint feel efficient is what made the failures possible.

The cost of fast and wrong is harder to measure than the cost of slow and right. I know which one I prefer to deliver now.