Unique Value — Originality as a Density, Not a Word Count

content/unique-value scores how original each page is as a rarity density: every distinct word is weighted by how rare it is across the audit, and the weights are averaged. The rule fires when that density falls below the floor. It is a page-specific-vocabulary proxy for the question Google's scaled-content-abuse policy has asked since March 5, 2024: does this URL add anything genuinely new?

What it detects

content/unique-value asks how original a page is relative to its siblings, as a density rather than a raw count. It tokenises each page's main content — lower-cased, split on whitespace, with leading and trailing punctuation stripped so 'word', 'word.' and '(word)' count as one token — and weights every distinct word by how rare it is across the audited set: a word on one page scores 1, a word on every page scores near 0 (normalised inverse document frequency). The page's score is the average of those weights — its unique-content density, between 0 and 1.
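The scoring described above can be sketched in a few lines of Python. This is a minimal illustration, not pseolint's actual implementation: the function names are hypothetical, and the exact normalisation of the inverse document frequency (here log(N/df) / log(N), which gives 1 for a word on one page and 0 for a word on every page) is an assumption.

```python
import math
import string
from collections import Counter


def tokenise(text: str) -> set[str]:
    """Lower-case, split on whitespace, strip leading/trailing punctuation."""
    tokens = (raw.strip(string.punctuation) for raw in text.lower().split())
    return {tok for tok in tokens if tok}  # distinct tokens only


def unique_content_density(pages: dict[str, str]) -> dict[str, float]:
    """Average rarity weight of each page's distinct tokens, between 0 and 1."""
    vocab = {url: tokenise(text) for url, text in pages.items()}
    n_pages = len(vocab)
    # Document frequency: on how many pages does each token appear?
    doc_freq = Counter(tok for tokens in vocab.values() for tok in tokens)

    def weight(token: str) -> float:
        # Assumed normalisation: 1.0 on a single page, 0.0 on every page.
        if n_pages < 2:
            return 1.0
        return math.log(n_pages / doc_freq[token]) / math.log(n_pages)

    return {
        url: (sum(weight(t) for t in tokens) / len(tokens)) if tokens else 0.0
        for url, tokens in vocab.items()
    }


pages = {
    "/a": "Shared intro. Stripe Radar (fraud) tooling",
    "/b": "Shared intro. Square hardware fees",
    "/c": "Shared intro.",
}
scores = unique_content_density(pages)
# Under the assumed weighting: /a scores 4/6, /b scores 3/5, /c scores 0.0.
```

Note that `string.punctuation` covers ASCII punctuation only, so curly quotes and similar characters would need extra handling in a real tokeniser.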

A page whose vocabulary mostly repeats across its siblings — boilerplate, shared spec blocks, an entity-swapped template — scores low and fires. Because it is an average, the metric does not punish a page for being short or for living in a large, tightly-themed site, and it does not flip on a one-word margin the way a hard count does. Volume is spam/thin-content's job; exact twins are spam/near-duplicate's; this rule isolates low originality.

Why it matters

This is the rule that catches the failure thin-content misses. A page can clear the 300-word thin-content floor with room to spare and still be almost entirely boilerplate with an entity swapped in — long, but not original. content/unique-value measures originality directly by asking what vocabulary exists here and nowhere else on your site, which is much closer to how Google decides whether a URL earns its own slot in the index.

The most expensive mistake on programmatic sites is adding real, useful, but per-axis-shared data and expecting it to count. A regulation repeated across every page for that role, a spec block shared across a product line, a city's statutes echoed on each of that city's pages: all genuinely helpful, all shared, all worth close to zero toward this metric. The words that move it are the page-specific ones: a distinct lead, this record's particular facts, an example that exists only here. That is the difference between a database export and a page worth ranking.

A page that fails

Consider /api/stripe-vs-square and /api/stripe-vs-paypal on a fintech directory. Each is 900 words, comfortably past the thin-content floor. But the shared 'What is a payment API' intro, the identical feature glossary, and the same integration checklist mean roughly 91% of each page's vocabulary also appears on its sibling. Each page's unique-content density lands near 9%, well under the 20% floor, so the rule fires an error: a reader gains little from the second page that the first did not already give them.
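The step from 91% overlap to a density near 9% follows if, purely for illustration, the shared vocabulary appears on every audited page (weight 0) and the remaining words appear on this page only (weight 1). Both that simplification and the exact form of the weight $w(t)$ are assumptions rather than confirmed details of the rule. With $V$ distinct words on the page:

```latex
\[
\text{density}
  = \frac{1}{V}\sum_{t \in \text{page}} w(t)
  \approx \frac{0.91\,V \cdot 0 \;+\; 0.09\,V \cdot 1}{V}
  = 0.09 \;<\; 0.20
\]
```

Words that sit on some but not all siblings would carry intermediate weights, so a real audit would land near this figure rather than exactly on it.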

A page that passes

The same two pages, rebuilt so each leads with provider-specific material: real Stripe Radar fraud-tooling detail on one, Square's in-person hardware fees on the other, each with its own code sample and pricing edge cases. The shared glossary moves to a linked reference page. Now around 64% of each page's vocabulary is distinctive rather than echoed across siblings, so each page's unique-content density clears the 20% floor with room to spare and the rule passes.

How to fix it

1. Write a page-specific lead. The fastest way to raise density is an opening paragraph true of this entity and nothing else. Boilerplate intros are the first thing to cut.
2. Move shared blocks to a shared URL. A glossary, a methodology note, or a legal disclaimer that repeats across pages should live on one page the others link to, rather than embedded on every page, where it dilutes uniqueness.
3. Stop expecting per-axis data to count. Content repeated across pages on the same axis (a role's regulations across that role's documents, for example) is common vocabulary and barely moves density. Only text specific to this page raises it.
4. Bind distinct records, not shared ones. If two pages pull the same fields from your data source, they will share vocabulary; differentiate the records or merge the pages.
5. Read the density and overlap the finding reports. Together they tell you how distinctive the page is and confirm that the problem is overlap, not length.
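Step 2 depends on knowing which blocks actually repeat. The sketch below is one rough way to find them before an audit: it splits each page's main content into paragraphs and reports those that occur on a large share of pages. The function name, the blank-line paragraph split, and the `min_share` parameter are all hypothetical; this is a helper you might write yourself, not part of pseolint.

```python
from collections import defaultdict


def shared_blocks(pages: dict[str, str], min_share: float = 0.5) -> dict[str, list[str]]:
    """Map each paragraph repeated on at least min_share of pages to the URLs carrying it."""
    carriers: dict[str, set[str]] = defaultdict(set)
    for url, text in pages.items():
        for block in text.split("\n\n"):
            # Collapse whitespace and case so trivial formatting differences still match.
            key = " ".join(block.lower().split())
            if key:
                carriers[key].add(url)
    threshold = min_share * len(pages)
    return {
        block: sorted(urls)
        for block, urls in carriers.items()
        if len(urls) > 1 and len(urls) >= threshold
    }


pages = {
    "/a": "What is magnification?\n\nA 102 mm refractor.",
    "/b": "What is  Magnification?\n\nA 130 mm reflector.",
}
candidates = shared_blocks(pages)
# candidates maps the repeated explainer paragraph to ["/a", "/b"].
```

Because it matches whole paragraphs exactly, this catches embedded glossaries and disclaimers but not entity-swapped templates, where every paragraph differs by a word or two; those still need the vocabulary-level density to surface.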

SpamBrain context

Originality has been the spine of Google's quality guidance for over a decade — the Search Quality Rater Guidelines have used 'no added value' as a Lowest-quality marker since 2014 — but the March 5, 2024 scaled-content-abuse update made it enforceable at scale by naming pages that exist 'with little unique value' as a policy violation regardless of how they were produced.

content/unique-value (in @pseolint/core, MIT-licensed at github.com/ouranos-labs/pseolint) is pseolint's most direct measure of that clause. Where spam/thin-content counts total substantive words and spam/boilerplate-ratio measures shared sentence blocks, this rule scores how rare a page's vocabulary is against every other page in the audit and averages it into a density. A page whose density falls below the floor is, by definition, contributing almost nothing the rest of the site does not already say — which is precisely the condition Google's deduplication and quality systems are built to demote.

Frequently asked questions

How is this different from the thin-content rule?

spam/thin-content counts total substantive words and fires below 300; content/unique-value ignores length and scores originality as a rarity density, firing when a page's vocabulary is mostly shared with its siblings. A page can pass thin-content with 1,000 words and still fail unique-value if most of those words are boilerplate. Length and originality are different axes, and this rule measures the second one.

Does useful, accurate data count toward uniqueness?

Only to the extent it is page-specific. This trips up pSEO teams constantly: a regulation, spec, or statistic that is genuinely useful but repeats across every page on the same axis is common vocabulary and barely lifts density. The metric moves on text that exists on this page and few others. The fix is not to remove the shared data but to add material that exists nowhere else on the site.

Why a density rather than a word count?

Because repetition should not be rewarded and neither should shared filler. The rule works over distinct tokens and weights each by how rare it is across the audit (a word on every page scores near zero, a page-specific one near one), then averages them. Saying 'San Francisco' fifty times still contributes one word's worth of rarity, which is the right behaviour for a rule measuring how much new information a page actually contributes.

Does the score change as I add or remove pages?

Somewhat. Density is relative to the audited set, so a word's rarity drops as more pages use it. But because the score is an average over all of a page's words rather than a hard count of page-exclusive ones, adding a sibling nudges the density slightly instead of flipping a verdict on a one-word margin. It reflects how Google evaluates your site as a whole, without the instability of an absolute threshold.

I sell telescopes, and every page repeats the same optics glossary. Does that count?

The glossary does not, but the instrument's own numbers do. A refractor page stating its 102-millimetre aperture, its 660-millimetre focal length, the supplied 25-millimetre eyepiece, and the dovetail mount it ships on carries vocabulary no sibling listing repeats. A computerised go-to altazimuth mount and a manual equatorial tripod differentiate two products that would otherwise read alike. Move the shared 'what is magnification' explainer to one reference URL, and each telescope's distinct aperture, focal ratio, and eyepiece kit becomes the page-unique substance the rule counts. A heritage-orchard nursery that lists the rootstock, the chill-hours requirement, and the pollination group for each apple cultivar gives every page words no sibling listing repeats.
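To make the add-a-sibling answer concrete, here is a small sketch that scores one telescope page before and after a third listing joins the audit. It works on already-tokenised word sets and uses an assumed normalisation, log(N/df) / log(N); the `density` helper and the sample vocabularies are illustrative, not pseolint's code or data.

```python
import math
from collections import Counter


def density(page: set[str], audit: list[set[str]]) -> float:
    """Mean assumed-normalised IDF weight of one page's distinct tokens.

    `page` must be one of the sets in `audit`.
    """
    n = len(audit)
    df = Counter(tok for tokens in audit for tok in tokens)
    if n < 2 or not page:
        return 1.0 if page else 0.0
    return sum(math.log(n / df[t]) / math.log(n) for t in page) / len(page)


glossary = {"aperture", "focal", "length", "magnification", "eyepiece", "mount"}
refractor = glossary | {"102-millimetre", "660-millimetre", "dovetail", "refractor"}
reflector = glossary | {"130-millimetre", "dobsonian", "reflector"}
before = density(refractor, [refractor, reflector])

# A new sibling that also ships on a dovetail mount makes that one word less rare.
newcomer = glossary | {"80-millimetre", "dovetail", "apochromat"}
after = density(refractor, [refractor, reflector, newcomer])

print(f"before: {before:.3f}, after: {after:.3f}")
```

In this toy audit the refractor page goes from 0.400 to roughly 0.337: 'dovetail' keeps part of its weight instead of dropping out entirely, which is the graded behaviour the answer above describes. Because the inputs are sets, repeating a word on a page changes nothing.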

How this shows up in practice

Dunvale Insurance Quotes hosts 5,600 pages at /quotes/{state}/{policy-type}/. Each page is assembled from a shared rate-table widget plus two sentences of templated intro copy. When pseolint tokenises the main content, lower-cases every word, and strips punctuation, the page /quotes/nebraska/term-life/ carries only 61 words that appear on no other page in the audit, too few to lift its unique-content density over the floor, so content/unique-value fires an error. The words unique to that Nebraska term-life page are almost entirely the state name and policy label repeated in headings; the body copy is identical to /quotes/iowa/term-life/ and 34 other siblings. Actuary Philippa Storrow added a state-specific paragraph citing Nebraska's 2023 NDOI consumer-complaint rate (4.7 per 1,000 policies) plus three local insurer names, lifting the page-exclusive word count to 118.

Sources

Google Search Central, Spam policies: scaled content abuse. The March 5, 2024 scaled-content-abuse update targets pages that carry very little unique value; content/unique-value operationalises that standard by weighting each distinct word by its rarity across the audit and firing an error when the averaged density falls below the floor.
Google Search Central, Creating helpful, reliable, people-first content. Helpful Content guidance asks whether a URL delivers original information beyond what neighbouring pages already say; the rule answers that mechanically: lower-casing tokens, stripping leading and trailing punctuation so 'word.', '(word)', and 'word' resolve to one token, then weighting each distinct token by its rarity in a cross-audit frequency map.
Google Search Central, Spam policies: doorways. Doorway policy requires that pages differ substantively from their template siblings; a page whose density falls below the floor has little vocabulary that is not already present across the cluster, which is the lexical fingerprint doorway enforcement targets most directly.
Google Search Central, Consolidate duplicate URLs (canonicalization). Google's canonicalisation logic favours URLs with the most distinctive content when collapsing near-duplicate clusters; a page below the density floor lacks the distinctive vocabulary that would set it apart from its siblings, making it more likely to be suppressed in favour of a more distinctive sibling.
