How do the pseolint rules map to Google's SpamBrain classifier?

The rule set clusters around the major axes SpamBrain scores against. spam/* covers the patterns that triggered the March 5, 2024 scaled-content-abuse update — thin content under 300 words, doorway clusters, near-duplicate templates with >85% lexical overlap, and templates that don't vary their structural skeleton. Each rule fires per sampled page and the results aggregate to a per-template verdict: a rule that fires on 8/10 sampled pages of the same template becomes a template-level finding, not 8 separate URL findings. content/* checks intent match, originality, and reading level. aeo/* audits answer-engine readiness for Perplexity, ChatGPT, and Google's AI Overviews. Site-type-aware weights mean a programmatic-directory is scored differently from a small-marketing site.

Why are only 5 rules written up here?

These five spam-pattern rules are the ones most likely to demote a programmatic-SEO domain in 2026, and the ones whose detection logic is least well documented elsewhere. The rest of the rule set surfaces in every audit report and is documented inside the open-source repo. More long-form explainers will land here as we see which rules generate the most user questions.

Rule reference · 5 of 32 featured

SpamBrain rules — what pseolint detects.

Q: What makes a rule 'AEO-aligned'?

The 2022 SpamBrain rebuild changed enforcement — instead of waiting for a manual reviewer, the classifier silently suppresses pages it scores as spam-like at query time. An AEO-aligned rule is one whose detection logic also predicts whether AI Overviews and answer engines will cite the page, because the same signals (entity grounding, citable facts, atomic structure, schema integrity) drive both classical ranking and LLM-powered SERP extraction.

Q: Where is the full rule registry documented?

The full rule registry is open source at github.com/ouranos-labs/pseolint, and Google's underlying spam policies are documented at developers.google.com/search/docs/essentials/spam-policies. Every rule in pseolint links back to the specific policy paragraph it implements, so you can see exactly which Google guideline a finding maps to.

pseolint runs 47 rules across 8 categories — spam-pattern detection (8 spam/*), AEO/answer-engine readiness (8 aeo/*), graph integrity (6 links/* including host-section-divergence, the May 2024 site-reputation-abuse detector), technical SEO (9 tech/*), content quality (7 content/*), structured data (3 schema/*), data-binding consistency (2 data/*), and cannibalization (1 cannibal/*). Every rule fires per sampled page and aggregates into a per-template verdict — not a per-URL list.

These rules cover programmatic-SEO patterns and AI Overview readiness. They don't replace a general SEO audit: for Core Web Vitals use PageSpeed Insights, and for broken-link scanning use Sitebulb ($35/mo) or Screaming Frog ($259/yr).

Featured deep-dive explainers below; full taxonomy, per-template aggregation model, and SpamBrain mapping further down.

Per-template aggregation — how rules feed verdicts

The engine audits by template rather than by URL. Phase 1 detects URL templates (filter ≥1% of URLs, ≥5 URLs, ≥2 survivors after deduplication). Phase 2 samples pages stratified across templates and runs all 47 rules. Each rule's output per template is summarised as a uniformity score (0–1) and a top driver, which is the single rule responsible for the most findings on that template. The site verdict is determined by siteVerdictFromTemplates: the worst template that covers ≥5% of the site's URLs (spec §15.1). Three aggregation patterns apply:

Per-page → template uniformity score (most spam/* and content/* rules). The rule fires on each sampled page; the fire rate becomes the template's score for that rule.
Corpus-wide (spam/near-duplicate). Computed across all sampled pages regardless of template — it surfaces cross-template duplication, not just within a single template.
Per-page → template-level signal (aeo/* rules). A rule that fires on 8/10 pages of a template reports one template-level finding, not 8 individual URL findings.
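The aggregation described above can be sketched roughly as follows. This is a minimal illustration, not pseolint's actual code: the `PageResult` and `TemplateSummary` shapes, the `pass`/`warn`/`fail` verdict labels, the function names, and the fallback to `pass` when no template reaches the coverage cutoff are all assumptions made for the example. Only the per-page fire rate and the ≥5% coverage rule come from the text.

```typescript
// Hypothetical types: pseolint's real interfaces are not shown in this document.
type Verdict = "pass" | "warn" | "fail";

interface PageResult {
  template: string;     // URL template the sampled page belongs to
  firedRules: string[]; // IDs of the rules that fired on this page
}

interface TemplateSummary {
  template: string;
  urlShare: number; // fraction of the site's URLs covered by this template (0-1)
  verdict: Verdict;
}

// Per-page -> template: the share of a template's sampled pages on which a rule fired.
function fireRate(pages: PageResult[], template: string, ruleId: string): number {
  const sampled = pages.filter((p) => p.template === template);
  if (sampled.length === 0) return 0;
  const fired = sampled.filter((p) => p.firedRules.some((r) => r === ruleId)).length;
  return fired / sampled.length;
}

// Assumed ordering of verdicts from best to worst.
const VERDICT_RANK: Record<Verdict, number> = { pass: 0, warn: 1, fail: 2 };

// Site verdict: the worst verdict among templates covering at least `minShare` of URLs.
// Returning "pass" when no template qualifies is an assumption of this sketch.
function siteVerdictSketch(templates: TemplateSummary[], minShare = 0.05): Verdict {
  let worst: Verdict = "pass";
  for (const t of templates) {
    if (t.urlShare >= minShare && VERDICT_RANK[t.verdict] > VERDICT_RANK[worst]) {
      worst = t.verdict;
    }
  }
  return worst;
}
```

With ten sampled pages of one template and a rule firing on eight of them, `fireRate` yields 0.8, which would surface as one template-level score rather than eight URL findings. A failing template that covers only 2% of URLs would not affect the site verdict under the 5% cutoff.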

How the rules map to SpamBrain

The rule set clusters around the major axes Google's SpamBrain classifier scores against. The spam/* category (8 rules) covers the patterns the March 27, 2026 core update demotes most aggressively. That update is the most recent classifier shift to hit pSEO, and it tightened scaled-content signals on date-stacked corpora. It builds on the March 5, 2024 scaled-content-abuse update, which first targeted thin content under 300 words, doorway clusters with shared boilerplate, near-duplicate templates with >85% lexical overlap, templates that don't vary their structural skeleton, and corpus-aware publication velocity (the threshold scales with corpus size, so a 50,000-page directory and a 50-page blog get appropriate cutoffs). The content/* category (4 rules) checks unique value, meta-description uniqueness after entity masking, author signals, and E-E-A-T markers. The aeo/* category (8 rules, shipped April 21, 2026) audits answer-engine readiness: citable facts, atomic Q&A blocks, freshness signals, AI-crawler access, and the things Perplexity, ChatGPT, and Google's AI Overviews actually extract.
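To make the ">85% lexical overlap" threshold concrete, here is one way such a check could look. The text does not say which similarity measure pseolint uses, so this sketch assumes Jaccard similarity over word trigrams (shingles); the function names and the tokenisation are illustrative only.

```typescript
// Assumption: "lexical overlap" is modelled as Jaccard similarity over word trigrams.
// The actual measure used by pseolint is not specified in this document.

// Split text into lowercase words and collect every run of k consecutive words.
function shingles(text: string, k = 3): Set<string> {
  const words = text.toLowerCase().split(/\W+/).filter((w) => w.length > 0);
  const out = new Set<string>();
  for (let i = 0; i + k <= words.length; i++) {
    out.add(words.slice(i, i + k).join(" "));
  }
  return out;
}

// Jaccard similarity: |A ∩ B| / |A ∪ B|. Returns 0 when both sets are empty.
function jaccard(a: Set<string>, b: Set<string>): number {
  if (a.size === 0 && b.size === 0) return 0;
  let shared = 0;
  a.forEach((s) => {
    if (b.has(s)) shared++;
  });
  return shared / (a.size + b.size - shared);
}

// Two pages count as near-duplicates when their overlap exceeds the threshold.
function isNearDuplicate(pageA: string, pageB: string, threshold = 0.85): boolean {
  return jaccard(shingles(pageA), shingles(pageB)) > threshold;
}
```

Comparing every pair of sampled pages this way is quadratic in the sample size, so a corpus-wide check on a large sample would likely need an approximation such as MinHash. That is a general observation about the technique, not a statement about how pseolint implements it.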

The remaining categories are:

links/* (6 rules). Orphan pages, dead ends, cluster connectivity, link depth, unreachable-from-root, and host-section-divergence. The last one detects sub-sections that ride a host's reputation without integrating into it, which is the May 2024 site-reputation-abuse policy target.
tech/* (4 rules). Canonical consistency, sitemap completeness, soft-404, and redirect chains.
schema/* (3 rules). JSON-LD validity, required-fields by type, and cross-page consistency.
data/* (2 rules). Missing-binding and identical-across-pages, fired when --data-source is set.
cannibal/* (1 rule). url-pattern; title-overlap and keyword-collision were dropped in v0.4 due to high false-positive rates.
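Several of the links/* checks are graph questions, and a short sketch may help show how they differ. The code below assumes the internal link graph is available as a plain adjacency list keyed by URL path. The `LinkGraph` type, the function names, and the exact definitions (an orphan is a page with no inbound links; unreachable means not found by a breadth-first walk from the root) are assumptions for illustration and may not match pseolint's own definitions.

```typescript
// Hypothetical input shape: each page maps to the internal URLs it links to.
type LinkGraph = Record<string, string[]>;

// Breadth-first walk from the root; depth = minimum number of clicks from the root.
function linkDepths(graph: LinkGraph, root: string): Map<string, number> {
  const depth = new Map<string, number>();
  depth.set(root, 0);
  let frontier: string[] = [root];
  while (frontier.length > 0) {
    const next: string[] = [];
    for (const page of frontier) {
      const d = depth.get(page) ?? 0;
      for (const target of graph[page] ?? []) {
        if (!depth.has(target)) {
          depth.set(target, d + 1);
          next.push(target);
        }
      }
    }
    frontier = next;
  }
  return depth;
}

// Pages the walk never reached, even if other unreachable pages link to them.
function unreachableFromRoot(graph: LinkGraph, root: string): string[] {
  const depth = linkDepths(graph, root);
  return Object.keys(graph).filter((page) => !depth.has(page)).sort();
}

// Pages (other than the root) that no other page links to.
function orphanPages(graph: LinkGraph, root: string): string[] {
  const linked = new Set<string>();
  for (const page of Object.keys(graph)) {
    for (const target of graph[page] ?? []) {
      if (target !== page) linked.add(target);
    }
  }
  return Object.keys(graph).filter((p) => p !== root && !linked.has(p)).sort();
}

// Pages with no outgoing internal links.
function deadEnds(graph: LinkGraph): string[] {
  return Object.keys(graph).filter((p) => (graph[p] ?? []).length === 0).sort();
}
```

Under these definitions, two pages that link only to each other are not orphans, since each has an inbound link, but both are unreachable from the root. That distinction is one plausible reason to keep the two checks as separate rules.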

What makes a rule "AEO-aligned"

The 2022 SpamBrain rebuild changed what enforcement looks like — instead of waiting for a manual reviewer to hit a domain with a policy action, the classifier silently suppresses pages it scores as spam-like at query time. That means the old "wait for the manual action notice" playbook is dead; you have to anticipate the scoring. An AEO-aligned rule is one whose detection logic also predicts whether AI Overviews and answer engines will cite the page — because the same signals (entity grounding, citable facts, atomic structure, schema integrity) drive both classical ranking and extraction by LLM-powered SERPs.

The full rule registry is open source at github.com/ouranos-labs/pseolint, and Google's underlying spam policies are documented at developers.google.com/search/docs/essentials/spam-policies. Every rule in pseolint links back to the specific policy paragraph it implements, so you can see exactly which Google guideline a finding maps to.

Run the full rule set on your site

The rules above are the ones most likely to fire on a templated site. The fastest way to see which ones actually fire on yours — and which template they're dragging down — is to run a free audit. No account required, results in under sixty seconds, per-template verdict included.

Each rule ships as an independent ESM module with deterministic fingerprinting, configurable thresholds via pseolint.config.ts, and a documented severity ladder (info → warning → error → critical) that maps to fixed integer penalty weights consumed by the composite-score reducer in packages/core/auditor.ts.
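As a rough illustration of how a severity ladder can feed a composite score, consider the sketch below. The document states only that severities map to fixed integer penalty weights; the specific weights, the 0 to 100 scale, the subtraction-based reducer, and the `Finding` shape are placeholders invented for this example and should not be read as the values in packages/core/auditor.ts.

```typescript
// Severity ladder as named in the text: info -> warning -> error -> critical.
type Severity = "info" | "warning" | "error" | "critical";

// Placeholder weights. The real integer weights are not listed in this document.
const PENALTY: Record<Severity, number> = {
  info: 0,
  warning: 5,
  error: 15,
  critical: 40,
};

// Hypothetical finding shape for the sketch.
interface Finding {
  ruleId: string;
  severity: Severity;
}

// Assumed reducer: start from 100, subtract each finding's penalty, floor at 0.
function compositeScore(findings: Finding[]): number {
  const totalPenalty = findings.reduce((sum, f) => sum + PENALTY[f.severity], 0);
  return Math.max(0, 100 - totalPenalty);
}
```

Keeping the weights as integers in a single table keeps the arithmetic exact and makes the score reproducible for the same set of findings, which fits the emphasis on determinism in the paragraph above.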

Provenance footnote: ruleId namespaces are a stable contract from v0.4 forward; reintroduced rules retain their identifier or get a version-suffixed sibling. Suppression by classification is opt-out via --strict.


SpamBrain rules — what pseolint detects.

Thin Content Detection

Doorway Pages

Near-Duplicate Pages

Boilerplate Ratio

Template Diversity

Site Reputation Abuse

Entity-Swap Pages

Publication Velocity

Template Coverage

Unique Value

Meta Description Uniqueness

Missing Author

E-E-A-T Signals

Title Uniqueness

Heading Structure

Image Alt Text

Orphan Pages

Dead Ends

Link Depth

Cluster Connectivity

URL Pattern Cannibalization

Freshness Signals

llms.txt

Crawler Access

FAQ Coverage

Summary Bait

Translation No-Op

Regurgitated Content

Common Phrase Reuse

Wikipedia Paraphrase

Value-Add Score
