Free SpamBrain checker for programmatic SEO sites
Scan any URL against the SpamBrain-adjacent rule set the team built after the March 2024 core update. No signup, runs in 60 seconds.
What it does
The SpamBrain checker runs the full pseolint rule set — site-type-aware SpamBrain + AEO scoring with the dedicated spam/* rules at the core — against a sample of your site, focused on the structural signals Google's SpamBrain classifier has been documented or strongly suspected to weight. We crawl your sitemap, fetch up to 200 pages on the free tier, and report every page-level finding plus a domain-level risk score from 0 (clean) to 100 (almost certainly being demoted). Median audit time is 60 seconds. The audit is read-only, respects robots.txt, and doesn't require you to install a tag, give us GSC access, or sign up.
Why it matters
SpamBrain originally launched in 2018 and was rebuilt on a new ML stack on August 25, 2022, becoming Google's primary spam detection system. It has been the enforcement engine behind two of the most aggressive policies of the last two years: the March 5, 2024 scaled content abuse update and the site reputation abuse policy, which took effect on May 5, 2024 (https://developers.google.com/search/docs/essentials/spam-policies). Google reported that the March 2024 core update, combined with its earlier efforts, cut low-quality, unoriginal content in search results by 45%. Both policies targeted programmatic patterns that used to rank: automated city pages, AI-spun product comparisons, and third-party content rented onto reputable subdomains. The checker exists because most pSEO operators only learn they crossed the line after a manual action notice or a 70% traffic drop, and by that point they are rebuilding from scratch. Auditing structural signals before Google scores them is the only way to ship pSEO at scale without playing reactive defense, which is why the free tier gives you a 60-second sample and Pro at $19/month adds scheduled monitoring with manual re-audits of up to 500 pages.
How it works
- Paste any public URL — the homepage, a hub page, or a representative template page works equally well.
- We pull your sitemap.xml and sample up to 200 URLs on the free tier, weighted toward URL patterns that look templated (high-cardinality path segments are oversampled).
- Each fetched page is parsed and run through the full SpamBrain + AEO rule set — including the spam/* SpamBrain-adjacent rules covering thin content, doorway patterns, scaled boilerplate, internal-link cliques, and reputation-abuse signals — weighted by your site's archetype.
- Findings are aggregated into a single risk score (0-100) plus a per-page tile map so you can see whether the problem is one bad template or evenly distributed.
- Every finding links to the specific rule explanation, the page where it fired, and the smallest fix that would clear it.
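The sampling step above can be sketched as follows. This is an illustrative Python sketch, not pseolint's actual sampler: the function names, the `min_distinct` cutoff of 10, and the 3x weight for templated URLs are all assumptions chosen for the example. It marks a path position as templated when many distinct values appear there, then draws a weighted sample without replacement (Efraimidis-Spirakis keys).

```python
import random
from collections import defaultdict
from urllib.parse import urlparse


def path_segments(url):
    """Split a URL path into its non-empty segments."""
    return [s for s in urlparse(url).path.split("/") if s]


def templated_patterns(urls, min_distinct=10):
    """Map each URL to a pattern; high-cardinality positions become {var}."""
    distinct = defaultdict(set)  # (segment count, position) -> values seen there
    for url in urls:
        segs = path_segments(url)
        for i, seg in enumerate(segs):
            distinct[(len(segs), i)].add(seg)
    patterns = {}
    for url in urls:
        segs = path_segments(url)
        parts = [
            "{var}" if len(distinct[(len(segs), i)]) >= min_distinct else seg
            for i, seg in enumerate(segs)
        ]
        patterns[url] = "/" + "/".join(parts)
    return patterns


def weighted_sample(urls, k=200, templated_weight=3.0, seed=0):
    """Sample up to k URLs without replacement, favoring templated patterns."""
    rng = random.Random(seed)
    patterns = templated_patterns(urls)
    keyed = []
    for url in urls:
        weight = templated_weight if "{var}" in patterns[url] else 1.0
        # Efraimidis-Spirakis: a larger weight pushes the key toward 1.
        keyed.append((rng.random() ** (1.0 / weight), url))
    keyed.sort(reverse=True)
    return [url for _, url in keyed[:k]]
```

With a sitemap of `/breed/{slug}/food-guide` pages plus a handful of static pages, the breed pages collapse into one `/breed/{var}/food-guide` pattern and are more likely to land in the sample than `/about` or `/contact`.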
What you get
- A 0-100 SpamBrain risk score, color-coded the same way as on the report page.
- A tile map of every audited page — green/yellow/red based on the worst rule that fires on that URL.
- A ranked list of failing rules with a count of affected pages and a one-line plain-English description.
- Per-page drill-down: click a tile to see exactly which rules fired, with HTML snippets where relevant.
- Shareable report URL (24h retention for anonymous audits) so you can hand the finding to a developer or content lead.
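As a rough illustration of how the tile map and the ranked rule list relate to per-page findings, here is a minimal Python sketch. The severity levels (`info`, `warning`, `error`), their mapping to colors, and the finding dictionaries are assumptions made for the example; the real report format may differ.

```python
from collections import Counter

# Assumed severity levels and color mapping, for illustration only.
SEVERITY_RANK = {"info": 0, "warning": 1, "error": 2}
COLOR = {0: "green", 1: "yellow", 2: "red"}


def tile_color(findings):
    """Color a page by the worst severity among its findings; no findings is green."""
    worst = max((SEVERITY_RANK[f["severity"]] for f in findings), default=0)
    return COLOR[worst]


def ranked_rules(pages):
    """Return (rule, affected page count) pairs, most widespread rule first."""
    counts = Counter()
    for findings in pages.values():
        # A rule counts once per page, however many times it fired there.
        counts.update({f["rule"] for f in findings})
    return counts.most_common()
```

Counting each rule once per page keeps the ranked list aligned with "affected pages" rather than raw finding volume, so one noisy page cannot push a rule to the top.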
FAQ
- Is this actually checking SpamBrain or just guessing?
- We don't have access to Google's SpamBrain classifier — nobody outside Google does. The checker runs against rules we inferred from public Search Central documentation, the March 2024 and May 2024 spam policy updates, leaked Search API documents, and observed before/after patterns on sites that got hit. Treat the score as a structured second opinion, not a verdict. If your score is high, Google probably agrees; if it's low, you've eliminated the obvious failure modes.
- Will running the audit hurt my site or get me penalized?
- No. We send standard GET requests with a clearly identified user agent (pseolint/0.7.4 +https://pseolint.dev/bot), respect robots.txt and Crawl-delay, cap concurrency at 5, and stop at 200 pages or 50 MB total. Your analytics won't see the traffic and Search Console won't flag anything. Audits are read-only.
- How is this different from a generic SEO crawler like Screaming Frog or Sitebulb?
- Generic crawlers report on technical SEO: broken links, missing alt text, redirect chains. They are also paid: Screaming Frog runs £199/yr, Sitebulb $35/mo, Ahrefs Site Audit $129/mo. The pseolint rule engine is free and MIT-licensed. The SpamBrain checker only reports on signals that look like they map to spam classification: thin content (default 300-word floor), near-duplicate templates above 85% SimHash similarity, doorway patterns, AI-generated boilerplate above an 80% ratio, internal-link cliques, and third-party content abuse. It's a much narrower, more opinionated lens.
- What if my site has 200,000 pages and you only audit 200?
- The 200-page free-tier sample is weighted to oversample templated URL patterns, so you'll usually see your worst clusters even on huge sites. That said, sampling is lossy: a single bad template that lives in a tiny corner of the sitemap can be missed. If you need broader coverage, the Pro plan audits up to 500 pages per run and supports scheduled monitoring.
- Does this catch sites hit by the March 2024 scaled content abuse update?
- It catches the structural patterns that update was designed to demote — pages that read like reusable templates with one variable swapped per page, large-scale AI generation without unique research, and content farms that publish more than they could plausibly fact-check. We don't directly query whether a domain is currently demoted (that data isn't public) but the signals overlap heavily with what the update penalized.
- Is the rule engine open source?
- Yes. The full rule set lives at github.com/ouranos-labs/pseolint under the MIT license as the @pseolint/core package — you can run it locally with the CLI (`npm i -g pseolint`), audit your CI builds, or fork the rules. The hosted checker on this page is the same engine wrapped in a sampler and a UI.
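For readers unfamiliar with the SimHash similarity figures quoted in the FAQ, the sketch below shows one common way to compute them in Python. It is a generic illustration under stated assumptions (64-bit fingerprints, three-word shingles, BLAKE2b as the hash) and is not taken from the @pseolint/core source, which may tokenize and hash differently.

```python
import hashlib
import re


def simhash(text, bits=64):
    """Compute a SimHash fingerprint from three-word shingles of the text."""
    tokens = re.findall(r"\w+", text.lower())
    shingles = [" ".join(tokens[i:i + 3]) for i in range(max(len(tokens) - 2, 1))]
    weights = [0] * bits
    for shingle in shingles:
        digest = hashlib.blake2b(shingle.encode("utf-8"), digest_size=bits // 8).digest()
        h = int.from_bytes(digest, "big")
        for bit in range(bits):
            # Each shingle votes +1 or -1 on every bit position.
            weights[bit] += 1 if (h >> bit) & 1 else -1
    return sum(1 << bit for bit in range(bits) if weights[bit] > 0)


def similarity(a, b, bits=64):
    """Share of fingerprint bits on which two SimHash values agree (0.0 to 1.0)."""
    return 1.0 - bin(a ^ b).count("1") / bits


def is_near_duplicate(text_a, text_b, threshold=0.85):
    """Apply the 85% similarity cutoff quoted in the FAQ to two page texts."""
    return similarity(simhash(text_a), simhash(text_b)) > threshold
```

Two pages built from the same template with one token swapped share most of their shingles, so their fingerprints tend to agree on most bits; unrelated pages tend to agree on roughly half.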
What a scan turns up
Ferradex Pet Supplies, a programmatic directory of 1,880 /breed/{breed-slug}/food-guide pages managed by operator Tomás Unwin, scored 71 on the SpamBrain checker's 0-to-100 domain risk scale after a 61-second crawl of the 200-page free-tier sample. The spam/* rule set surfaced four stacked findings: spam/thin-content fired on 94 pages at a median of 211 stripped words, spam/near-duplicate clustered 138 URLs into a single 0.91 SimHash group, spam/entity-swap confirmed the breed token was the only variable in that cluster, and spam/boilerplate-ratio flagged a 68 percent shared-block ratio across the Ferradex template. The checker's risk breakdown attributed 34 of the 71 risk points to the near-duplicate cluster alone, giving Unwin a clear first fix: inject breed-specific caloric tables and three sourced feeding-frequency citations per page to dissolve the 0.91 SimHash reading before tackling word-count deficits.
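The entity-swap finding in the Ferradex scan can be pictured as a mask-and-compare check. The Python sketch below is illustrative only: `mask_entity` and `is_entity_swap` are hypothetical names, the sample sentences are invented, and it assumes each page's entity (here, the breed) is already known, for example from the URL slug.

```python
import re


def mask_entity(text, entity):
    """Replace every occurrence of the entity with a placeholder and normalize whitespace."""
    masked = re.sub(re.escape(entity), "{entity}", text, flags=re.IGNORECASE)
    return " ".join(masked.split()).lower()


def is_entity_swap(text_a, entity_a, text_b, entity_b):
    """True when two pages are identical once their entity names are masked out."""
    return mask_entity(text_a, entity_a) == mask_entity(text_b, entity_b)


beagle = "Best food for your Beagle. A Beagle needs two meals a day."
poodle = "Best food for your Poodle. A Poodle needs two meals a day."
print(is_entity_swap(beagle, "beagle", poodle, "poodle"))  # prints True
```

Once a page carries breed-specific facts that survive the masking, such as a caloric table, the comparison fails and the pair no longer looks like a pure swap.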
Fernholt Gift Baskets submitted its domain to the pseolint SpamBrain checker on 3 November 2024. The tool crawled the sitemap, fetched 200 pages in under 60 seconds, and computed a domain-level risk score from 44 inferred structural signals, including SimHash similarity across product pages, boilerplate ratio per URL, entity-swap fingerprint, structureSignature diversity, and outbound-link density. Fernholt scored 71 out of 100, which the report flags as high risk. The two heaviest contributors were a boilerplate ratio of 68% across /baskets/{occasion}/ pages and a near-duplicate cluster of 94 pages sharing a SimHash similarity above 0.85. Owner Delphine Cromarty used the per-page findings table to prioritise the 94-page cluster first, adding occasion-specific gifting context, provenance notes for supplier Blackmere Preserves, and caloric breakdowns per SKU. A re-run eleven days later put the domain risk score at 34.
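A boilerplate ratio like the 68% figure above can be approximated as the share of a page's text that sits in blocks repeated across the site. The Python sketch below is one plausible definition, not pseolint's: the `shared_fraction` cutoff of 0.5, the character-length weighting, and the input shape (a URL mapped to a list of text blocks) are assumptions made for illustration.

```python
from collections import Counter


def boilerplate_ratios(pages, shared_fraction=0.5):
    """Return, per URL, the share of text (by characters) found in shared blocks.

    pages maps a URL to its list of text blocks. A block counts as shared
    when it appears on at least shared_fraction of all pages.
    """
    block_pages = Counter()
    for blocks in pages.values():
        block_pages.update(set(blocks))  # count each block once per page
    cutoff = shared_fraction * len(pages)
    ratios = {}
    for url, blocks in pages.items():
        total = sum(len(b) for b in blocks)
        shared = sum(len(b) for b in blocks if block_pages[b] >= cutoff)
        ratios[url] = shared / total if total else 0.0
    return ratios
```

Under this definition, a page with a 68-character shared block and 32 characters of unique copy scores 0.68. Adding unique text lowers the ratio without touching the template.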
Sources
- Google Search Central — Spam policies for Google web search — Google's spam policies enumerate the named violation categories SpamBrain enforces; the checker maps all 47 inferred structural signals across its rule set to those categories, so every domain-level risk score entry traces to a documented enforcement clause rather than an opaque internal classifier weight.
- Google Search Central — Spam policies: scaled content abuse — The March 5, 2024 scaled-content-abuse enforcement is the calibration baseline for the checker's site-type-aware scoring: publication-burst stacking, uniform HTML skeleton fingerprints, and corpus-wide boilerplate saturation are three programmatic footprints the update targeted, surfaced across up to 200 pages in a 60-second crawl.
- Google Search Central — Creating helpful, reliable, people-first content — People-first guidance asks whether a domain was built for readers or rank manipulation; the checker's domain-level risk score aggregates per-page spam/* findings into a 0-to-100 composite, penalising hosts where the substantive-to-filler URL ratio falls below the floor the Helpful Content System established in August 2022.
- Google Search Central — Search Essentials — Search Essentials draws the enforceable boundary between acceptable programmatic production and algorithmic demotion; the checker's structured JSON output lets operators triage which URLs cross that line before a recrawl delivers a verdict — no Search Console account or announced rollout required.
Related tools
Want every rule, not just this lens? The full audit on the homepage runs the complete SpamBrain + AEO rule set and produces the same shareable report — same backend, broader output.