Rule referencecannibal/url-pattern

URL Pattern Cannibalization — When Two Slugs Are the Same Words Reordered

cannibal/url-pattern splits each URL's last slug on hyphens, sorts the tokens, and flags at info severity any two pages in the same directory whose sorted token sets match exactly — the reordered-slug keyword cannibalization Google has resolved by collapsing competing URLs to one canonical result since well before its March 2026 core update.

Test your site for url pattern cannibalization — when two slugs are the same words reordered

Loading bot check… if this doesn't resolve in a few seconds, refresh the page.

We'll surface findings tagged with `cannibal/url-pattern`.

What it detects

cannibal/url-pattern looks for two URLs that are, word for word, the same page wearing a different word order. For every page it takes the final path segment — the slug after the last slash, trailing slashes removed — splits it on hyphens, drops empty tokens, and sorts what remains alphabetically. Two slugs that differ only in the order of their words produce an identical sorted token list.

The rule then compares pages pairwise, but only within the same parent directory: the path up to that last slash must match, and it must not be empty. When two distinct URLs in one directory collapse to the same sorted tokens, the rule fires once at info severity, naming both URLs and reporting that they carry the same tokens in a different order. Pages in different directories never compare against each other, and a slug with no tokens is skipped. The match is exact after sorting — not fuzzy — so it fires only when the two slugs really are the same word set reshuffled.

Why it matters

Two URLs assembled from one word set are two pages chasing a single query. A vintage-synth marketplace that ships /moog-analog-synthesizer and /analog-synthesizer-moog in the same listings directory has not built two products; it has built one product twice and asked Google to choose. The crawler usually does choose — it folds the pair to a single canonical result and splits the link equity, anchor text, and click history that should have accrued to one strong page across two weaker ones.

The damage is quiet because nothing 404s and nothing looks broken. Both pages index, both rank somewhere, and neither ranks as well as the consolidated page would. On a programmatic catalog the reorder is rarely intentional — it usually comes from a slug builder that concatenates attribute tokens in whatever order the data arrives, so /eurorack-modular-oscillator and /oscillator-eurorack-modular both get minted from the same record. The rule sits at info severity because a reordered pair is a signal to consolidate, not proof of spam, but every such pair is link equity you are dividing against yourself.

A page that fails

A vintage-synthesizer marketplace mints two listing URLs from one record: /listings/moog-modular-oscillator and /listings/oscillator-moog-modular. Both live in /listings, and after splitting each slug on hyphens and sorting, both collapse to modular-moog-oscillator — the same three tokens reshuffled. The rule fires at info: 'these URLs have the same tokens in different order'. Google indexed both, picked one as canonical 9 days after launch, and the patch-cable and CV-gate detail on the losing page now earns nothing toward the ranking page.

A page that passes

The same marketplace settles on one canonical slug order for every listing and 301-redirects the reordered twin: /listings/oscillator-moog-modular permanently points at /listings/moog-modular-oscillator. Within the /listings directory no two slugs now share a sorted token set, so the rule stays silent. The MIDI spec, the filter-cutoff range, and the modular-rack photos all consolidate onto one URL, and the page that was splitting equity with its anagram now holds the full signal for the query.

How to fix it

  1. 1Pick one canonical token order for every slug your builder emits, so the same record can never mint both /moog-analog-oscillator and /oscillator-moog-analog.
  2. 2Add a 301 redirect from the reordered twin to the canonical URL, collapsing the pair into one address before the link equity finishes splitting.
  3. 3Set a rel=canonical on any duplicate you cannot redirect, pointing every reordered variant at the single slug you want Google to rank.
  4. 4Audit the slug-generation code, not the pages — the reorder almost always comes from a builder concatenating attribute tokens in whatever order the data arrives.
  5. 5Sort or fix the token order at write time in your data pipeline, so new listings are minted in canonical order and the pair never appears again.
  6. 6Check internal links and your sitemap for both variants, and repoint every reference at the canonical slug so crawlers stop discovering the twin.

SpamBrain context

Keyword cannibalization predates any algorithm name — it is simply two of your own pages competing for the same query, a problem SEOs have written about since the early 2010s. Reordered URL slugs are one of its most mechanical forms: not a content overlap a writer introduced, but a duplicate the address space minted on its own when a slug builder shuffled the same attribute tokens.

cannibal/url-pattern (in @pseolint/core, MIT-licensed at github.com/ouranos-labs/pseolint) reasons about your URL structure rather than your page content. It does not read the HTML at all — it only asks whether two addresses in one directory are the same words in a different order. That is why it ships at info severity and never contributes a blocker on its own: a reordered pair is a consolidation opportunity, not a policy violation. Google's deduplication systems will eventually pick one canonical URL for the pair regardless, so the rule's job is to surface the split before the crawler resolves it for you and you lose a say in which slug wins.

Frequently asked questions

What exactly counts as a reordered-token match?
Two URLs match when they sit in the same parent directory and their final slugs, after splitting on hyphens and sorting alphabetically, produce the identical token list. So /gear/analog-synth-rack and /gear/rack-synth-analog match because both sort to analog-rack-synth, while /gear/analog-synth-rack and /shop/analog-synth-rack do not, because their directories differ. The comparison is exact after sorting, so a single different or extra word breaks the match and the rule stays silent.
Why is this only an info-severity finding?
Because a reordered slug pair is a signal to consolidate, not evidence of spam or manipulation. Nothing is broken — both pages still load and index — so the rule never blocks an audit verdict on its own. It surfaces the pair so you can decide which slug to keep and redirect the other, ideally before Google's deduplication picks a canonical URL for you. Treat it as a cleanup task that recovers split link equity, not as an emergency.
Will it flag two URLs that share words but in different directories?
No. The rule compares pages only within the same parent directory — the entire path up to the final slash must match, and it must not be empty. So /listings/moog-oscillator and /archive/oscillator-moog never compare against each other, even though their slugs are the same two words reordered. Directory scoping keeps the rule from flagging legitimately separate sections that happen to reuse vocabulary, and it only ever fires on genuine same-folder duplicates.
My synth marketplace auto-builds slugs from attribute tags — how do I stop reordered duplicates?
This is the classic source of the finding. A listing for a Moog modular oscillator gets a slug concatenated from its attribute tokens, but if the same instrument is re-listed and the tags arrive as oscillator, modular, moog instead of moog, modular, oscillator, your builder mints /listings/oscillator-modular-moog alongside the original /listings/moog-modular-oscillator — two URLs, one patch-cable-and-CV-gate product. Fix it at write time: sort the attribute tokens into a fixed canonical order before assembling the slug, so the polyphony, filter-cutoff, and MIDI details of a given modular rack only ever resolve to one address. In one illustrative cleanup, a dealer carrying 1,400 listings found 6% were reordered twins and recovered the split equity within 3 weeks of redirecting them.
Does redirecting the duplicate recover the lost ranking signal?
Largely, yes. A 301 redirect from the reordered twin to your canonical slug passes the accumulated link equity and consolidates the two pages' signals onto one URL, so the page that was competing with its own anagram regains the anchor text and click history it was splitting. The recovery is not instant — Google has to recrawl and process the redirect — but it is the right fix, because leaving the pair live means the crawler keeps dividing the signal until it picks a canonical itself, with no guarantee it picks the slug you would have chosen.

Related rules

Want to know whether this rule actually fires on your site?

Run pseolint against your sitemap. The audit is free, takes about a minute, and returns a per-URL list of every rule that fired — including this one — with the exact metric values so you can prioritise the fix queue.