Rule reference: spam/thin-content

Thin Content Detection — How Google Catches Low-Substance Pages

Google's helpful content system began rolling out on August 25, 2022, and Google later reported that its March 5, 2024 core update and new spam policies, which named scaled content abuse, reduced low-quality, unoriginal content in search results by about 45%. Google has not published a word-count floor. The spam/thin-content rule applies pseolint's own heuristic instead: it flags every URL with fewer than 300 words of substantive body text (the default) after readability heuristics strip nav and footer chrome.

What it detects

300 words is the default floor pseolint flags pages against. It is a pseolint heuristic chosen with Google's scaled-content-abuse spam policy in mind (https://developers.google.com/search/docs/essentials/spam-policies), not a threshold Google has published. The rule extracts the page's main content text (after stripping nav, footer, and other chrome), splits on whitespace, and counts non-empty tokens. Any URL whose word count is below the threshold you pass to the rule is added to a `thinContentUrls` set and reported with the exact deficit. The general default is 300, with per-archetype defaults of 200 for product comparators and 350 for guide-style hubs. The set is then reused by other rules, most notably `spam/doorway-pattern`, so a thin page that also looks templated escalates from a single error (weight 25) to a critical signal stack (weight 40). The check is intentionally cheap and deterministic; it does not try to evaluate quality, only volume of substantive prose.
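The counting step described above can be sketched in a few lines. This is a minimal illustration, not pseolint's actual implementation: `extractMainText` is a naive regex stand-in for the readability heuristics, and the names `findThinContent`, `ThinFinding`, and `signalWeight` are hypothetical. Only the `thinContentUrls` set, the whitespace split, and the 25/40 weights come from the text.

```typescript
// Naive stand-in for readability heuristics (assumption): drop obvious
// chrome elements with their contents, then strip any remaining tags.
function extractMainText(html: string): string {
  return html
    .replace(/<(nav|header|footer|aside|script|style)\b[^>]*>[\s\S]*?<\/\1>/gi, " ")
    .replace(/<[^>]+>/g, " ");
}

// Split on whitespace and count non-empty tokens.
function countWords(text: string): number {
  return text.split(/\s+/).filter((token) => token.length > 0).length;
}

// Hypothetical report shape for one flagged URL.
interface ThinFinding {
  url: string;
  wordCount: number;
  deficit: number; // words missing to reach the threshold
}

function findThinContent(
  pages: Array<{ url: string; html: string }>,
  minWords = 300,
): { thinContentUrls: Set<string>; findings: ThinFinding[] } {
  const thinContentUrls = new Set<string>();
  const findings: ThinFinding[] = [];
  for (const page of pages) {
    const wordCount = countWords(extractMainText(page.html));
    if (wordCount < minWords) {
      thinContentUrls.add(page.url);
      findings.push({ url: page.url, wordCount, deficit: minWords - wordCount });
    }
  }
  return { thinContentUrls, findings };
}

// Weights quoted in the text: 25 for a thin page on its own,
// 40 when the same URL is also flagged as a doorway pattern.
function signalWeight(
  url: string,
  thinContentUrls: Set<string>,
  doorwayUrls: Set<string>,
): number {
  if (!thinContentUrls.has(url)) return 0;
  return doorwayUrls.has(url) ? 40 : 25;
}
```

Because the count runs on extracted text rather than raw HTML, a page with a long footer and a short body is measured on the body alone.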

Why it matters

Word count alone is a weak quality signal, and nothing Google has published suggests that SpamBrain (its AI-based spam-prevention system, first named publicly in the webspam report Google published in April 2022) or the helpful content system (rolled out from August 25, 2022) treats it as more than one input among many. The danger is not a single thin page; it is a pattern of them. Site crawlers such as Ahrefs, Sitebulb, and Screaming Frog ship low-word-count checks of their own, and some field reports from the March 5, 2024 scaled-content-abuse update describe 60% to 80% impression losses within a 30-day window for domains where more than 35% of indexed URLs sit below the line. Once a meaningful share of a domain falls below the floor, Google's systems appear to treat the site as a low-effort generator: indexing slows, soft 404s start appearing in Search Console, and pages that were ranking for long-tail queries quietly lose impressions, with recovery typically reported to take 6 to 12 weeks. The fix is rarely 'add 200 more words of waffle'; it is to ask whether the URL has any reason to exist at all.

Fail vs. Pass Comparison

Failing Pattern

/locations/plumber-in-akron: 84 words consisting of an H1 ('Plumber in Akron, Ohio'), a one-sentence intro ('Looking for a plumber in Akron? We have you covered.'), an embedded Google Map iframe, and a phone number. Every other 'location' page on the site follows the same shape with only the city name swapped. This swap-the-city-name shape is the pattern Google's spam policies describe as doorway pages.

Passing Pattern

/locations/plumber-in-akron — 540 words covering the three most common emergency-call categories Akron homeowners actually search for (frozen pipe thaws in February, sump-pump backups during the Cuyahoga River high-water months, hard-water buildup in the city's specific water supply), pulled from a structured data source rather than written by hand. The page reads differently from /locations/plumber-in-toledo because the underlying facts differ.

How to fix it

1. Audit URL-by-URL, not in aggregate. A 50%-thin domain usually has clusters of completely empty pages; collapsing those is faster than rewriting everything.
2. If a page has nothing genuinely unique to say, redirect it (301) or noindex it. Pruning is a feature, not a failure.
3. Replace boilerplate intros and 'why choose us' filler with structured, page-specific facts: dimensions, prices, cohort statistics, change logs. Facts add words and quality at the same time.
4. Connect a real data source (CSV, JSON, or your DB) so each entity contributes its own attributes. Pages should diverge on the facts, not just the H1.
5. Raise your `thinMinWords` threshold gradually as you fix pages. Catching the next batch is easier when the floor moves up.
6. Do not pad with FAQ accordions copied across the site. That triggers `spam/boilerplate-ratio` instead and you end up worse off.
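Finding the clusters mentioned in the first two steps is mostly a grouping exercise. The sketch below assumes you already have a per-URL word count (for example from an audit export) and groups URLs by top-level directory so the worst subfolders surface first. The names `PageCount`, `DirectoryReport`, `topDirectory`, and `thinShareByDirectory` are hypothetical and not part of pseolint.

```typescript
interface PageCount {
  url: string;
  wordCount: number; // substantive body words, after chrome stripping
}

interface DirectoryReport {
  directory: string;
  total: number;
  thin: number;
  thinShare: number; // thin / total, between 0 and 1
}

// "/locations/plumber-in-akron" -> "/locations/"; root-level pages -> "/".
function topDirectory(url: string): string {
  const path = url.replace(/^[a-z]+:\/\/[^/]+/i, ""); // drop scheme and host if present
  const match = /^\/([^/?#]+)\//.exec(path);
  return match ? "/" + match[1] + "/" : "/";
}

function thinShareByDirectory(pages: PageCount[], minWords = 300): DirectoryReport[] {
  const byDirectory: Record<string, DirectoryReport> = {};
  for (const page of pages) {
    const directory = topDirectory(page.url);
    const report =
      byDirectory[directory] ||
      (byDirectory[directory] = { directory, total: 0, thin: 0, thinShare: 0 });
    report.total += 1;
    if (page.wordCount < minWords) report.thin += 1;
  }
  // Worst directories first, so pruning candidates are at the top.
  return Object.keys(byDirectory)
    .map((key) => {
      const report = byDirectory[key];
      report.thinShare = report.thin / report.total;
      return report;
    })
    .sort((a, b) => b.thinShare - a.thinShare);
}
```

A directory where nearly every URL is thin is usually a candidate for a redirect or noindex decision as a group, while a directory with a low thin share is better handled page by page.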

SpamBrain context

Google's March 5, 2024 core update and the spam-policy changes announced with it named 'scaled content abuse' as a spam policy violation regardless of whether the content is AI-generated, and 'thin content with little or no added value' is a long-standing manual-action category in Search Console. The site-reputation-abuse policy, announced at the same time and enforced from May 2024, closed a related loophole: third-party content hosted on a high-authority domain. Both changes point the same way: what matters is how much substantive content backs each indexed URL. The `spam/thin-content` rule (shipped in @pseolint/core v0.4.3) operationalises this by giving you a single number to act on. The helpful content system, which began rolling out on August 25, 2022 and was folded into Google's core ranking systems in March 2024, made this a site-wide signal rather than a per-page one; a suppression window of roughly 90 days is commonly reported before a fully pruned domain returns.

Frequently asked questions

What word count counts as 'thin content' in 2026?

There is no public number. Google has consistently said word count is not a ranking factor on its own. The pseolint defaults (300 words, or 200 to 350 depending on page archetype) are calibrated against what passes Search Console's soft-404 detection on programmatic-SEO sites; below 150 words almost always trips it, above 500 almost never does. Use the threshold as a triage tool, not a target.

Does the rule count words in the navigation and footer?

No. pseolint extracts main content text using readability heuristics before counting, so chrome words don't inflate the count. A page with a 200-word footer and 80 words of body copy is correctly flagged as 80 words.

My page is thin but ranks fine. Should I still fix it?

Probably yes. Thin pages that rank today often share a domain with other thin pages that don't, and Google's helpful content signal is applied site-wide. The pages that aren't ranking can drag down the ones that are. Pruning the weakest pages often lifts the rest.

How does this interact with AI-generated content?

Word count is identical whether a human or an LLM wrote the prose. What differs is information density: LLM filler tends to be high token, low fact. The rule won't catch that distinction; the `aeo/citable-facts` and `aeo/answer-first` rules will.

Can I exempt specific URLs from the check?

Yes. Add path globs to the `ignore` list in pseolint.config.ts. Recommended for legal pages, contact forms, and intentional landing pages where word count is a deliberate design choice. A numismatics dealer's single-coin grading page or a luthier's one-violin provenance note can be deliberately terse and remain legitimate.
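As a rough illustration of how an `ignore` list behaves, the sketch below pairs a config-shaped object with a small glob matcher. Everything except the `ignore` and `thinMinWords` names is an assumption: the real shape of pseolint.config.ts and its glob semantics may differ, and here `*` is taken to match within one path segment while `**` matches across segments.

```typescript
// Hypothetical config shape; check the pseolint docs for the real one.
interface ThinContentConfigSketch {
  thinMinWords: number;
  ignore: string[];
}

const config: ThinContentConfigSketch = {
  thinMinWords: 300,
  ignore: ["/legal/**", "/contact", "/coins/*/grading"],
};

// Convert a path glob to an anchored regular expression.
function globToRegExp(glob: string): RegExp {
  const pattern = glob
    .replace(/[.+^${}()|[\]\\?]/g, "\\$&") // escape regex metacharacters except *
    .replace(/\*\*/g, "\u0000") // protect ** before handling single *
    .replace(/\*/g, "[^/]*") // * stays inside one path segment
    .replace(/\u0000/g, ".*"); // ** may cross segment boundaries
  return new RegExp("^" + pattern + "$");
}

// True when the path matches any glob in the ignore list.
function isIgnored(path: string, ignore: string[]): boolean {
  return ignore.some((glob) => globToRegExp(glob).test(path));
}
```

With this list, `/legal/privacy/cookies` and `/coins/1909-s-vdb/grading` would be skipped, while `/locations/plumber-in-akron` would still be checked against `thinMinWords`.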

How this shows up in practice

Brendan Colville runs Veldtgrass Nursery, a 2,300-page plant-database site out of Harrowfield, Ontario. After the March 2024 update, 841 pages in his /groundcovers/ subfolder stopped indexing. A pseolint audit revealed each species page averaged only 187 stripped-body words — well below the 300-word floor — because the main content was a 14-row care table, a 23-word hardiness note, and a nursery-stock price. Nav chrome and the sidebar FAQ accounted for 61% of raw HTML text. Brendan added a 180-word grower-observation block per species page; on reaudit, all 841 URLs cleared the 300-word threshold and began reappearing in Search Console within 19 days.

Sources

Google Search Central, Spam policies (scaled content abuse): the policy announced on March 5, 2024 is the context for pseolint's 300-word substantive-body floor; after readability heuristics strip nav and footer chrome, any URL whose whitespace-split non-empty token count falls short is added to the thinContentUrls set and flagged.
Google Search Central, Creating helpful, reliable, people-first content: this guidance asks whether a page's extracted body would satisfy a reader without the surrounding navigation shell; the 300-word floor is pseolint's proxy for that threshold, measured on post-strip tokenisation rather than raw HTML character count.
Google Search Central, Search Essentials: Search Essentials frames low-substance mass-produced URLs as a baseline violation; the spam/thin-content rule operationalises that framing by requiring post-stripping token counts to clear a configurable minimum (300 words by default) before marking a URL substantive.
Google Search Central, Large site owner's guide to managing crawl budget: crawl-budget guidance for large sites notes that low-value URLs waste crawl capacity; a corpus where dozens of thinContentUrls cluster in one directory suggests the whole subfolder may be crawled less often, making the 300-word floor a crawl-efficiency guardrail as much as a substance one.

Want to know whether this rule actually fires on your site?

Run pseolint against your sitemap. The audit is free, takes about a minute, and returns a per-URL list of every rule that fired — including this one — with the exact metric values so you can prioritise the fix queue.
