Template Coverage — How Sparse Keyword Matrices Expose pSEO

spam/template-coverage groups URLs in the same directory, masks the entity tokens in each filename, and reports how many of the possible dimension combinations a template actually fills — surfacing, at info severity, the sparse high-dimension matrices Google's March 27, 2026 core update down-weighted on programmatic sites.

What it detects

spam/template-coverage is a diagnostic, not an accusation. It groups your URLs into clusters by parent directory, and within each cluster of at least 5 pages it looks only at the filename — the last path segment, extension stripped. It masks the entity tokens in that filename using your entity patterns, then splits the masked name on hyphens into positional tokens.
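The grouping and tokenising steps can be sketched as follows. This is a minimal illustration, not the rule's actual source: the function name `cluster_filenames` and the `min_pages` parameter are hypothetical, and the entity-masking step is left out because this page does not describe the entity pattern format.

```python
from collections import defaultdict
from pathlib import PurePosixPath
from urllib.parse import urlparse

MIN_PAGES = 5  # cluster size threshold described above


def cluster_filenames(urls, min_pages=MIN_PAGES):
    """Group URLs by parent directory and split each filename on hyphens.

    Hypothetical helper: entity masking is deliberately omitted here.
    """
    clusters = defaultdict(list)
    for url in urls:
        path = PurePosixPath(urlparse(url).path)
        if not path.name:
            continue  # the site root has no filename to tokenise
        # path.stem is the last path segment with its extension stripped
        clusters[str(path.parent)].append(path.stem.split("-"))
    # keep only clusters large enough to look like a template
    return {d: rows for d, rows in clusters.items() if len(rows) >= min_pages}
```

Calling it on five URLs under /locations/ plus one stray /about/team URL would return a single /locations cluster whose rows look like ["plumbing", "in", "austin"]; the lone /about page falls below the threshold and is dropped.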

For each position where more than one distinct value appears, the rule records a 'dimension'. A cluster like /jobs/[role]-jobs-in-[city] has two dimensions: role and city. The rule multiplies the number of distinct values in each dimension to get the total possible combinations, then divides the pages you actually built by that total to produce a coverage percentage. If a template has 12 services and 50 cities — 600 possible cells — but you shipped 96 pages, coverage is 16% and the rule reports the dimensions, the sample values, and the ratio at info severity. A cluster where every token varies, or none does, produces no finding because there is no matrix to measure.
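The arithmetic in the example is 12 x 50 = 600 possible cells and 96 / 600 = 0.16. A rough sketch of that calculation, assuming each page has already been reduced to a list of positional tokens, might look like this. The function name `coverage` and the equal-length check are assumptions made for illustration, not part of the documented rule.

```python
from math import prod


def coverage(rows):
    """Return (dimension sizes, possible combinations, coverage ratio) or None.

    rows: one list of positional tokens per page in the cluster.
    """
    if not rows:
        return None
    width = len(rows[0])
    if any(len(r) != width for r in rows):
        return None  # assumption: positional comparison needs equal token counts
    # number of distinct values seen at each token position
    distinct = [len({r[i] for r in rows}) for i in range(width)]
    # a position is a dimension when more than one value appears there
    dims = [n for n in distinct if n > 1]
    # no matrix to measure when every position varies or none does
    if not dims or len(dims) == width:
        return None
    possible = prod(dims)
    return dims, possible, len(rows) / possible
```

For 96 rows of the shape [service, "in", city] drawn from 12 services and 50 cities, this yields dimension sizes [12, 50], 600 possible combinations, and a ratio of 0.16.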

Why it matters

A sparse matrix is a behavioural confession. Filling 16% of a 600-cell grid almost always means a script generated the combinations that had search volume and skipped the rest — the definition of building pages for keywords rather than for users. A human team that genuinely served every service in every city would either cover the grid densely or never have framed the work as a grid at all.

The rule fires at info severity on purpose: sparse coverage is not inherently spam. A directory legitimately serving 96 real markets is fine; the signal only matters when the sparsity pairs with thin or near-duplicate content in the same cluster. Google's March 27, 2026 core update down-weighted exactly this shape — high-dimension templates with low fill rates — because the combinatorial ambition is a reliable marker of coverage-driven generation. Treat a coverage finding as a question: can you actually differentiate every cell you intend to fill, or are you claiming a matrix you cannot substantiate?

A page that fails

/locations/ holds 96 pages of the form [service]-in-[city]. Masking the entity tokens reveals two dimensions: 12 services and 50 cities, implying 600 possible combinations. The cluster also trips spam/near-duplicate and spam/thin-content. The coverage finding reads: '/locations has 96 pages across 2 dimensions: 12 values (e.g. plumbing, roofing, hvac) x 50 values (e.g. austin, dallas, houston). Coverage: 96 of 600 combinations (16.0%).' Read together, the picture is a template that generated the high-volume cells and left the grid mostly empty.
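To make the shape of that message concrete, here is a small sketch that rebuilds the same string from a page count and the values seen in each dimension. The function `format_finding` is hypothetical, and taking the first three values as the sample is an assumption based only on the example above.

```python
def format_finding(directory, pages, dimensions):
    """Build a coverage message in the style quoted above.

    dimensions: one list of distinct values per varying token position.
    """
    possible = 1
    parts = []
    for values in dimensions:
        possible *= len(values)
        sample = ", ".join(values[:3])  # assumption: show the first three values
        parts.append(f"{len(values)} values (e.g. {sample})")
    pct = 100 * pages / possible
    return (
        f"{directory} has {pages} pages across {len(dimensions)} dimensions: "
        f"{' x '.join(parts)}. "
        f"Coverage: {pages} of {possible} combinations ({pct:.1f}%)."
    )
```

Passing "/locations", 96, and two value lists of 12 services and 50 cities reproduces the quoted finding, including the 16.0% figure.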

A page that passes

The same /locations/ cluster, narrowed to the combinations the business can actually differentiate: 12 services in the 8 cities where it has a physical branch, 96 pages covering 96 of 96 cells. Coverage is 100%. Each page carries the branch address, local pricing, and named staff for that city, so the dense grid reflects genuine market presence rather than a keyword script that filled the easy cells of a 600-cell matrix.

How to fix it

  1. Narrow the matrix to what you can differentiate. If you cannot write genuinely distinct content for all 600 cells, do not claim the grid: build the cells you can substantiate and drop the dimensions you cannot.
  2. Raise coverage by subtraction, not addition. Pruning empty intent often beats generating the missing cells, because the missing cells are usually the ones with no demand and nothing unique to say.
  3. Check the paired findings first. A coverage finding next to spam/thin-content or spam/near-duplicate in the same cluster is the combination that matters; coverage alone is a diagnostic to note, not an emergency.
  4. Collapse a dimension. If one axis (say, modifier words like cheap/best/top) adds combinations without adding user value, remove it from the URL structure and fold it into a single page.
  5. Treat info severity as guidance. The rule never blocks a verdict on its own; it tells you where a template's ambition outruns its substance so you can decide before Google does.
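The "check the paired findings first" step can be sketched as a small triage pass over audit output. The input shape here, a list of (cluster directory, rule id) pairs, is an assumption for illustration and not the tool's documented report format; only the three rule ids come from this page.

```python
PAIRED_RULES = {"spam/thin-content", "spam/near-duplicate"}


def clusters_to_review(findings):
    """Return clusters where a coverage finding co-occurs with a paired rule.

    findings: iterable of (cluster_directory, rule_id) pairs (hypothetical shape).
    """
    rules_by_cluster = {}
    for directory, rule in findings:
        rules_by_cluster.setdefault(directory, set()).add(rule)
    return sorted(
        directory
        for directory, rules in rules_by_cluster.items()
        if "spam/template-coverage" in rules and rules & PAIRED_RULES
    )
```

A cluster that only carries the coverage finding is left out of the result, which matches the guidance that coverage alone is something to note rather than act on.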

SpamBrain context

The 'keyword matrix' has been the engine of programmatic SEO since long before SpamBrain, and Google's spam policies have steadily closed in on it. The doorway-pages policy (March 16, 2015) named pages built for query permutations; the March 5, 2024 scaled-content-abuse update reframed the harm as volume without value; and the March 27, 2026 core update specifically down-weighted sparse, high-dimension templates on programmatic corpora.

spam/template-coverage (in @pseolint/core, MIT-licensed at github.com/ouranos-labs/pseolint) is the only rule in the suite that reasons about your URL structure as a combinatorial grid rather than about page content. That is why it ships at info severity and never contributes a blocker on its own — coverage is context, not a charge. Its job is to make the matrix visible so you can answer the question every scaled-content policy is really asking: did you build these pages because each cell serves a distinct need, or because a loop could generate them?

Frequently asked questions

Is low template coverage always bad?
No, and the rule reflects that by firing at info severity. A directory that genuinely serves 96 specific markets has low 'coverage' of every theoretically possible combination and is perfectly legitimate. Low coverage only becomes a problem when it pairs with thin or near-duplicate content in the same cluster — that combination is the signature of a script that filled the high-volume cells of a keyword grid and skipped the rest.
How does the rule decide what a 'dimension' is?
It strips each URL to its filename, masks your entity tokens, and splits the masked name on hyphens. Any token position that holds more than one distinct value across the cluster becomes a dimension. So /x/[a]-in-[b] has two dimensions if both [a] and [b] vary. If every position varies, or none does, there is no measurable matrix and no finding.
Why only clusters of 5 or more pages?
Below five pages there is not enough of a pattern to call something a template. The minimum keeps the rule from labelling a handful of related URLs as a generated matrix. You can tune the threshold with the templateCoverageMinPages option if your site's structure warrants it.
What is the difference between this and template-diversity?
spam/template-diversity measures how uniform your rendered HTML is; spam/template-coverage measures how completely your URL structure fills its own combinatorial grid. One looks at the pages, the other at the address space. A site can have diverse HTML but a suspiciously sparse URL matrix, or a dense matrix rendered through one rigid template — they catch different halves of the same programmatic shape.
My brewery directory has a sparse beer-style grid — is that a problem?
It depends on whether the empty cells were ever meaningful. A taproom finder crossing two hundred breweries against twelve styles implies a huge matrix, but most breweries simply do not pour a barrel-aged gose or a hazy triple IPA. An audit reporting eleven percent coverage on that beverage grid is asking whether the missing pours were demand-driven or merely unreachable. Prune to the styles each taproom actually serves — the growler fills, the seasonal lager, the nitro stout, the cask night — and a sparse ratio becomes an honest one without a single empty pint page.

Want to know whether this rule actually fires on your site?

Run pseolint against your sitemap. The audit is free, takes about a minute, and returns a per-URL list of every rule that fired — including this one — with the exact metric values so you can prioritise the fix queue.