Orphan Pages — URLs No Other Page Links To

links/orphan-pages scans every URL in the crawl, counts the inbound internal links pointing at each one, and fires at error severity on any page other than the root URL with exactly 0 of them. That is the dead-zone shape that leaves Googlebot unable to reach a URL through your own navigation, a structural gap the March 27, 2026 core update treats as a discoverability failure rather than a content one.

What it detects

links/orphan-pages builds one number for every page in the crawl: how many other pages in the same corpus link to it. It walks each parsed page, reads the inbound-link count the crawler accumulated while following internal hrefs, and flags any URL whose count is exactly 0. The root URL is exempted — your homepage is reached directly, not via an internal link — so the rule never accuses the front door of being unreachable.

The check is corpus-scoped, which is the detail that makes it honest. It only knows about pages the crawl actually visited and only counts links between those pages. A URL with zero inbound links is one that no page in the set references, meaning a crawler arriving at your homepage has no internal path to it. The page might still be reachable through your XML sitemap or an external backlink, but inside the site's own link graph it is an island.

Every orphan emits a single error-severity finding naming the URL and recommending you link to it from a relevant hub or index and add it to navigation. The rule reasons purely about reachability — it makes no judgement about whether the page's content is good, only about whether anything points at it.
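
The check described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the rule's actual source: the `pages` input shape, the `find_orphans` name, and the finding dictionary are all hypothetical, and the sketch assumes a page's links to itself do not count as inbound, since the rule counts links from other pages.

```python
from collections import Counter


def find_orphans(pages: dict[str, list[str]], root_url: str) -> list[dict[str, str]]:
    """Return one error finding per crawled URL that no other crawled page links to.

    `pages` maps each crawled URL to the internal hrefs found on it. This input
    shape and the finding dict are hypothetical, chosen for this sketch.
    """
    inbound = Counter()
    for source, hrefs in pages.items():
        # Count each linking page once, and only links between crawled pages.
        for target in set(hrefs):
            if target != source and target in pages:
                inbound[target] += 1

    return [
        {"rule": "links/orphan-pages", "severity": "error", "url": url}
        for url in sorted(pages)
        # The root URL is exempt: it is reached directly, not via a link.
        if url != root_url and inbound[url] == 0
    ]


crawl = {
    "https://example.com/": ["https://example.com/hives/"],
    "https://example.com/hives/": [
        "https://example.com/",
        "https://example.com/hives/brood-box/",
    ],
    "https://example.com/hives/brood-box/": ["https://example.com/hives/"],
    "https://example.com/hives/nuc-box/": ["https://example.com/hives/"],
}
findings = find_orphans(crawl, "https://example.com/")
for finding in findings:
    print(finding["severity"], finding["url"])
```

In this toy crawl only the nuc-box page is flagged: it links out to the catalog, but nothing links back to it.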

Why it matters

Search engines discover most pages by following links. Googlebot starts somewhere it already knows — usually your homepage or a sitemap entry — and crawls outward along internal hrefs. A page with zero inbound internal links sits outside that graph: nothing on your site points a crawler toward it, so it competes for discovery and crawl budget at a severe disadvantage even when its content is excellent.

Orphans are a classic failure mode of programmatic builds. A template generates 4,000 location pages and writes them to disk, but the index that should link them is paginated to show only the first 200, or the generation job ships the detail pages a week before the hub that lists them. The pages exist, return 200, and may even sit in the sitemap, yet no human or crawler can navigate to 3,800 of them without typing the URL. PageRank, the link-based signal Google has used since 1998, never flows internally to a page nothing links to, so orphans tend to rank far below their integrated siblings.

The error severity reflects that this is a structural defect, not a stylistic one. A page no one can reach is functionally invisible, and invisibility is the most expensive SEO problem there is.

A page that fails

A beekeeping-supplies shop ships a /hives/ catalog whose index template paginates to the first 24 products, but the store stocks 310 SKUs. The $420 cedar Langstroth deep brood box, the nuc box, and roughly 280 other hive components live at real URLs that return 200, yet no page in the crawl links to them. The rule counts 0 inbound internal links for each and fires at error severity 286 times, naming every unreachable product. Googlebot arriving at the homepage has no internal path to 92% of the hive inventory, and 3 months after launch those pages still hold no rankings.
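
The arithmetic behind those figures is simple enough to check by hand, and a small helper makes it reusable for other catalogs. This is a minimal sketch that assumes the index page is the only source of inbound links; `truncated_index_orphans` is a hypothetical name, not part of pseolint.

```python
def truncated_index_orphans(total_items: int, items_linked: int) -> tuple[int, float]:
    """Return (orphan count, orphan share) for an index that links only some items.

    Assumes the index is the only page linking to the items, which is the
    situation in the example above. Any cross-links would lower the count.
    """
    orphaned = max(total_items - items_linked, 0)
    share = orphaned / total_items if total_items else 0.0
    return orphaned, share


orphaned, share = truncated_index_orphans(total_items=310, items_linked=24)
print(orphaned, f"{share:.0%}")  # 286 92%
```

The same helper applied to 4,000 generated pages behind an index that lists 200 gives 3,800 orphans, or 95% of the set.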

A page that passes

The same beekeeping-supplies shop rebuilds the /hives/ index as a fully linked, filterable grid. Every brood box, queen excluder, and Langstroth frame is reachable from the catalog, and each product also appears in a 'pairs with this hive' block on related pages, so a smoker links to the apiary-starter bundle and the honey extractor links back to the frames it spins. Every one of the 310 SKUs now carries at least 1 inbound internal link. The rule finds no zero-inbound URLs and stays silent, and Googlebot can walk from the homepage to any product within 3 clicks.
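
Click depth like the 3-click figure above can be measured with a breadth-first walk from the homepage. The sketch below is a companion check, not something the rule is documented to do: the rule only tests for zero inbound links, whereas this walk also catches pages that link to each other but are disconnected from the homepage. The `click_depths` name and the path-only URLs are assumptions made for illustration.

```python
from collections import deque


def click_depths(links: dict[str, list[str]], root_url: str) -> dict[str, int]:
    """Return the minimum number of clicks from the root to each reachable page.

    `links` maps each crawled URL to the internal hrefs on it (hypothetical
    shape). Pages missing from the result cannot be reached from the root.
    """
    depths = {root_url: 0}
    queue = deque([root_url])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            # Only follow links to crawled pages we have not visited yet.
            if target in links and target not in depths:
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths


site = {
    "/": ["/hives/"],
    "/hives/": ["/hives/brood-box/", "/hives/smoker/"],
    "/hives/brood-box/": [],
    "/hives/smoker/": ["/bundles/starter/"],
    "/bundles/starter/": [],
    "/hives/island/": [],
}
depths = click_depths(site, "/")
unreachable = sorted(set(site) - set(depths))
print(max(depths.values()), unreachable)  # 3 ['/hives/island/']
```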

Internal Link Architecture

A correctly structured link silo feeds authority to parent hubs while avoiding dead-end loops or orphan island pages.

Recommended Anchor Text Distribution

Anchor Type	Optimal Ratio	Example
Exact Match Keyword	10% - 15%	"thin content SEO"
Partial Match / LSI	30% - 40%	"learn about doorway patterns"
Branded / Generative	Remaining	"pseolint platform"

How to fix it

1. Link every orphan from a relevant hub or category index so it joins the site's internal link graph and a crawler can actually reach it.
2. Fix paginated or truncated index templates that list only the first N items. The missing children are usually the orphans, and crawlable pagination restores them all at once.
3. Add the page to your primary or contextual navigation when it is genuinely important, so it earns inbound links from high-traffic parts of the site.
4. Cross-link related items to each other, so a product, article, or location references its siblings instead of depending on one fragile index page.
5. Re-crawl after wiring the links and confirm the inbound count is no longer 0. A sitemap entry alone does not clear this rule, because the rule measures internal links, not sitemap membership.
6. For pages that should not exist as standalone URLs, consolidate or noindex them rather than leaving unreachable thin pages stranded in the corpus.
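
When a crawl reports hundreds of orphans, grouping them by top-level path is a quick way to spot which index template is truncated before working through the steps above. The following is a rough sketch, assuming the flagged URLs are available as a plain list; `orphans_by_section` is a hypothetical helper, not a pseolint API.

```python
from collections import Counter
from urllib.parse import urlsplit


def orphans_by_section(orphan_urls: list[str]) -> list[tuple[str, int]]:
    """Count orphan URLs per top-level path segment, largest group first."""
    sections = Counter()
    for url in orphan_urls:
        segments = [s for s in urlsplit(url).path.split("/") if s]
        # Use the first path segment as the section, e.g. "/hives/".
        section = f"/{segments[0]}/" if segments else "/"
        sections[section] += 1
    return sections.most_common()


flagged = [
    "https://example.com/hives/brood-box/",
    "https://example.com/hives/nuc-box/",
    "https://example.com/guides/wintering/",
]
print(orphans_by_section(flagged))  # [('/hives/', 2), ('/guides/', 1)]
```

A section that dominates the list usually points at one hub or category index to repair, which is cheaper than linking orphans one at a time.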

SpamBrain context

Orphan detection predates the spam era — it is plain crawlability hygiene that Google has documented for as long as it has explained how discovery works. A page nothing links to cannot accumulate the internal PageRank that has shaped ranking since 1998, and Googlebot's own crawl documentation is explicit that links are the primary discovery mechanism.

links/orphan-pages (in @pseolint/core, MIT-licensed at github.com/ouranos-labs/pseolint) sits in the structural integrity family rather than the spam family, but it matters disproportionately on programmatic sites because bulk generation is exactly where orphans appear at scale. The March 27, 2026 core update sharpened scrutiny of programmatic corpora, and a template that emits thousands of unlinked pages presents two problems at once: the pages waste crawl budget Google would rather spend elsewhere, and their existence inflates a site's apparent page count without any of them being reachable or rankable.

What the rule cannot see is your sitemap or your external backlinks. It judges the internal link graph alone, so it can flag a page as an orphan even when a sitemap lists it — which is intentional. Sitemap inclusion is a hint, not a navigable path, and Google has repeatedly said a strong internal link is worth more than a sitemap row.
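
Because the rule ignores sitemaps, it can be useful to run a separate comparison that lists URLs your sitemap declares but no internal link supports. The sketch below does this with Python's standard library. It is a companion check rather than part of the rule, and the function names, the sample sitemap, and the `linked` set are all assumptions for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_urls(sitemap_xml: str) -> set[str]:
    """Extract every <loc> value from a sitemap document."""
    root = ET.fromstring(sitemap_xml)
    return {
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    }


def sitemap_only(sitemap_xml: str, internally_linked: set[str]) -> list[str]:
    """Return sitemap URLs that no crawled page links to."""
    return sorted(sitemap_urls(sitemap_xml) - internally_linked)


SITEMAP = (
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
    "<url><loc>https://example.com/athletes/2024/a-runner/</loc></url>"
    "<url><loc>https://example.com/athletes/2022/b-jumper/</loc></url>"
    "</urlset>"
)
linked = {"https://example.com/athletes/2024/a-runner/"}
print(sitemap_only(SITEMAP, linked))
```

Any URL this returns is leaning on the sitemap hint alone, which is the situation the rule flags by design.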

Frequently asked questions

What exactly counts as an orphan page in this rule?

A page with exactly 0 inbound internal links from any other page in the same crawl. The rule counts the links the crawler followed between pages it visited, and any URL that no visited page references is an orphan. The homepage (root URL) is exempt, because it is reached directly rather than through an internal link, so the rule never flags it.

My orphan page is in my XML sitemap. Doesn't that make it reachable?

A sitemap is a list of suggestions, not a navigable path. Google may still discover a sitemap-only URL, but it receives none of the internal PageRank that a real inbound link carries, so it tends to rank poorly and gets crawled less often. This rule deliberately measures internal links rather than sitemap membership, which is why a sitemapped page can still be flagged as an orphan.

Why does the rule fire at error severity instead of a warning?

Because an unreachable page is a structural defect, not a matter of taste. A URL that nothing links to is functionally invisible to crawlers navigating your site, and invisibility is the most expensive SEO outcome there is: the page cannot rank for anything if a crawler never arrives. Error severity signals that this should be fixed before stylistic concerns, since no amount of content quality helps a page nobody can reach.

I run a beekeeping-supplies shop and 286 hive products got flagged. How do I clear them fast?

The cause is almost always a truncated index. If your /hives/ catalog template paginates to the first 24 of 310 SKUs, then 286 brood boxes, queen excluders, and Langstroth frames have zero inbound links. Rebuild the index as a fully crawlable, filterable grid and add a 'pairs with this hive' cross-link block, for example a $39 smoker linking to its apiary-starter bundle and the honey extractor linking to its frames. Re-crawl and the inbound count for each product rises above 0, clearing all 286 findings in one pass. In one illustrative run, the orphaned pages began earning impressions roughly 9 days after the links shipped.

Does the rule consider external backlinks when deciding if a page is orphaned?

No. The rule is corpus-scoped: it only knows about the pages in the crawl and only counts links between them. An external site might link to your orphan, which would help Google discover it, but the rule cannot see that and judges your internal link graph alone. The goal is to surface pages your own site fails to connect, since those are the ones within your control to fix.

Could fixing orphans accidentally create a different problem?

It can if you over-correct. Dumping links to 3,800 orphans into a single footer or a sitewide block restores reachability but can dilute internal PageRank and trip link-graph rules that watch for unnatural link density. The better fix is contextual: link each page from a genuinely relevant hub or sibling, so the link makes sense to a reader and the crawler, rather than wiring every orphan into one indiscriminate index.

How this shows up in practice

Blackthorn Sports Academy generates a season-specific page for every coached athlete: 930 URLs under /athletes/{season}/{name}/. The sitemap declares all 930, but internal navigation links only to the current-season roster. Pages from seasons 2021 through 2023, 614 URLs in all, carry zero inbound internal links from any other page in the crawl. links/orphan-pages fires at error severity on each of them. The root URL is exempt, but no archive page, breadcrumb, or related-content widget points at Jemima Oduya's 2022 profile or Marcus Threlfall's 2021 page, so Googlebot can discover those URLs only through the declared sitemap, not by following links. Coach directory lead Priya Ashford added a paginated archive widget at /athletes/archive/ linking every prior season, clearing the orphan flag across all 614 historical pages.

Sources

Google Search Central — Large site owner's guide to managing crawl budget — Google's crawl-budget guidance explains that Googlebot discovers pages by following internal links — a URL with zero inbound internal links sits outside every navigation path, so the crawler cannot reach it organically. links/orphan-pages flags exactly this zero-inbound-link condition, counting the inbound-link total the crawler accumulated while walking internal hrefs and exempting only the root URL, which is reached directly.
Google Search Central — Build and submit a sitemap — Google's sitemaps documentation acknowledges that a sitemap can surface URLs Googlebot would not find through crawling alone, but also warns that a sitemap submission does not substitute for internal links that transfer authority. A page with zero inbound internal links survives only on sitemap declaration — a structurally weak position the March 27, 2026 core update treated as a discoverability failure.
Google Search Central — Search Essentials — Search Essentials states that Google must be able to find a page to index it, and the primary discovery mechanism is link-following. When links/orphan-pages fires at error severity on a URL, that page has no internal referrer in the crawled corpus at all — it cannot be found by traversing your own navigation, only by direct URL knowledge or an external signal.
