Regurgitated Content — When Your Directory Is Just the Google Places API Reskinned

content/regurgitated-content is a low-confidence v1 heuristic that fires a warning when a page shows at least 2 of 5 Google-Places-regurgitation tells: a Powered by Google attribution, more than 60% of images hosted by Google, a Static Maps embed, Places API JavaScript, or an aggregator footprint of 5 or more unsigned star-rating blocks.

What it detects

content/regurgitated-content looks for one shape: a page that lifts business names, reviews, addresses, and photos straight from the Google Places API and presents them as a directory with nothing of its own added on top. It reads five independent signals per page and fires only when at least 2 of them are present.

The signals are specific:

  1. Google Places attribution: a 'powered by google' string, or a noopener anchor pointing at google.com/maps.
  2. Google images dominate: once a page has 3 or more images, the rule fires this signal when over 60% of them are hosted on googleusercontent.com, the Places photo endpoint, or Street View pixels.
  3. Static Maps or Maps embed: a maps.googleapis.com/maps/api/staticmap source, or a google.com/maps/embed iframe.
  4. Places API JavaScript: a google.maps.places.PlacesService or AutocompleteService marker in the markup.
  5. Aggregator footprint: 5 or more elements carrying a star rating (Unicode stars, a 4.5/5 fraction, or the word 'stars') on a page that shows fewer than 2 of 3 E-E-A-T signals (author, published date, an /about link).
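
As a rough illustration of the first four signals, here is a minimal Python sketch. It is not the source of the rule in @pseolint/core; the function names, the regular expressions, and the two maps.googleapis.com path prefixes used for the Places photo and Street View endpoints are assumptions made for this example. The noopener-anchor variant of the attribution signal is left out for brevity.

```python
import re
from urllib.parse import urlparse

MIN_IMAGES = 3       # the image signal is only evaluated from 3 images up
GOOGLE_SHARE = 0.60  # it fires when MORE than 60% of images are Google-hosted

# Assumed path prefixes for the Places photo and Street View endpoints.
GOOGLE_IMAGE_PATHS = ("/maps/api/place/photo", "/maps/api/streetview")

# Hypothetical markers for signals 1, 3, and 4, matched against raw markup.
ATTRIBUTION = re.compile(r"powered by google", re.IGNORECASE)
MAPS_EMBED = re.compile(
    r"maps\.googleapis\.com/maps/api/staticmap|google\.com/maps/embed"
)
PLACES_JS = re.compile(
    r"google\.maps\.places\.(PlacesService|AutocompleteService)"
)


def is_google_hosted(url: str) -> bool:
    """True for googleusercontent.com hosts and the assumed Google image paths."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    if host == "googleusercontent.com" or host.endswith(".googleusercontent.com"):
        return True
    return host == "maps.googleapis.com" and parsed.path.startswith(GOOGLE_IMAGE_PATHS)


def google_images_dominate(image_urls: list) -> bool:
    """Signal 2: over 60% of images are Google-hosted, given at least 3 images."""
    if len(image_urls) < MIN_IMAGES:
        return False
    google = sum(1 for url in image_urls if is_google_hosted(url))
    return google / len(image_urls) > GOOGLE_SHARE


def markup_signals(html: str) -> dict:
    """Signals 1, 3, and 4 as simple pattern checks on the page markup."""
    return {
        "places_attribution": bool(ATTRIBUTION.search(html)),
        "maps_embed": bool(MAPS_EMBED.search(html)),
        "places_api_js": bool(PLACES_JS.search(html)),
    }
```

Note the strict comparison: under this reading, a page with exactly 60% Google-hosted images does not trip the image signal, while 9 Google-hosted images out of 11 does.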

Severity is fixed at warning and confidence is low. This is a v1 heuristic that reasons about structure, never about a licence: it cannot read a Places API contract or know whether you have permission. It only sees the fingerprint that raw redistribution leaves behind.
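
The firing decision itself is small enough to sketch. The Python below assumes the five signals have already been computed as booleans; the `Finding` record, the signal keys, and the `evaluate` function are hypothetical names for illustration, not the package's API. Only the 2-of-5 threshold, the warning severity, and the low confidence come from the rule as described above.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical keys for the five signals, in the order described above.
SIGNAL_NAMES = (
    "places_attribution",
    "google_images_dominate",
    "maps_embed",
    "places_api_js",
    "aggregator_footprint",
)
MIN_SIGNALS = 2  # the rule fires on at least 2 of the 5 signals


@dataclass(frozen=True)
class Finding:
    rule: str
    severity: str
    confidence: str
    signals: tuple


def evaluate(signals: dict) -> Optional[Finding]:
    """Return a Finding when at least MIN_SIGNALS signals are present, else None."""
    present = tuple(name for name in SIGNAL_NAMES if signals.get(name, False))
    if len(present) < MIN_SIGNALS:
        return None
    # Severity and confidence are fixed; they do not scale with the signal count.
    return Finding(
        rule="content/regurgitated-content",
        severity="warning",
        confidence="low",
        signals=present,
    )
```

In this sketch a page with only a map embed returns `None`, while a map embed plus dominant Google images returns a warning that lists both signals.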

Why it matters

The Places API is a fine data source. The problem this rule names is using it as the entire product — a redistribution layer with no proprietary value, where every fact, photo, and rating on the page is something a reader could have pulled from Google Maps in one tap. When a directory adds nothing a user cannot already get from the source, the page is competing with Google using Google's own data, which is a losing position in the index and an obvious scaled-content tell.

The 2-of-5 threshold is deliberately loose because each signal alone is innocent — plenty of legitimate pages embed one map. Two signals together start to describe a page whose substance is borrowed: Google-hosted photos plus a Static Maps embed, or Places attribution plus a wall of unsigned star ratings. The pattern, not any single tell, is what the heuristic is reaching for.

Because confidence is deliberately low, a finding here is a prompt to audit, not a verdict. A genuine local guide that embeds a map and quotes a couple of reviews can trip two signals while adding real editorial value the rule cannot see. Treat the warning as 'this page looks like a thin redistribution layer — confirm it adds something the API does not.'

A page that fails

TikiFinder, a 600-page craft-cocktail-lounge directory, ships a page per bar that is pure Places API reskin. The lounge's name, address, and 5 most recent reviews come straight from the API; 9 of its 11 photos are googleusercontent.com hero shots of the bar's signature mai tai and ceramic tiki mugs (82% Google-hosted); a Static Maps embed pins the entrance; and a star-rating block repeats '4.6/5 stars' under every review with no byline, no published date, no /about page. Four of the five signals trip. There is not one sentence about the rum flight, the bitters program, or the garnish work that a reader could not have read on Google Maps 12 seconds earlier.

A page that passes

The same TikiFinder page, rebuilt as an actual guide. The embedded map and a single attributed Google review stay, and that is fine, but the page now leads with 300 words the API does not hold: the editor visited, ranked the lounge's 8 rum flights, photographed the house orgeat and the hand-carved tiki mug collection with the directory's own camera (so only 18% of images are Google-hosted), and named the bartender who built the bitters menu in a signed byline with a published date. Two Places signals remain, which still meets the 2-of-5 threshold, so the low-confidence warning can persist; what changes is what the audit it prompts will find: proprietary tasting notes, original garnish photography, and a named author, substance the raw Places API never had.

How to fix it

  1. Add proprietary substance the API does not hold (original tasting notes, a ranked verdict, a first-person visit log) so the page is more than a redistribution layer.
  2. Shoot and host your own photography. When your own images outnumber googleusercontent.com hero shots, the Google-images-dominate signal stops firing and the page stops looking lifted.
  3. Keep one attributed Google review if you like, but write your own editorial summary alongside it rather than republishing a wall of 5-plus star-rating blocks verbatim.
  4. Attach E-E-A-T: a named byline, a published date, and an /about page describing how you evaluate each venue, which both clears the aggregator-footprint signal and answers the trust question.
  5. Use the embedded map as a convenience, not the content: one Static Maps embed is fine when the words around it are yours and not the API's.
  6. If a page genuinely has nothing to add beyond the Places data, merge it or cut it rather than shipping a thin reskin that competes with Google using Google's own facts.
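
For the photography step, the arithmetic behind the 60% threshold can be worked out ahead of a shoot. The helper below is a hypothetical Python sketch, not part of pseolint. It assumes the signal is evaluated as "Google-hosted share strictly above 60%, on pages with 3 or more images", as described earlier on this page.

```python
MIN_IMAGES = 3  # below this count the image signal is not evaluated


def own_photos_needed(google_images: int, own_images: int) -> int:
    """Extra self-hosted images needed so the Google share is no longer above 60%."""
    total = google_images + own_images
    if total < MIN_IMAGES:
        return 0  # too few images for the signal to fire at all
    # Need google / total <= 0.6, i.e. total >= google * 5 / 3 (rounded up).
    min_total = (google_images * 5 + 2) // 3
    return max(0, min_total - total)
```

For the failing TikiFinder page (9 Google-hosted photos and 2 of its own), this gives 4: at 15 images the Google share is exactly 60%, which no longer exceeds the threshold. Clearing the threshold is the floor, not the goal; the rebuilt page described above goes much further.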

SpamBrain context

Google's scaled-content-abuse policy, effective March 5, 2024, targets pages produced at scale that add little value of their own regardless of how they were made — and a directory that is a thin wrapper over the Places API is one of the cleanest examples. The data is accurate, the page renders fine, and yet the URL contributes nothing a reader could not get from the source in one tap. That is the gap between a database export and a page worth ranking.

content/regurgitated-content (in @pseolint/core, MIT-licensed at github.com/ouranos-labs/pseolint) is a v1 heuristic, and it is honest about its limits. It reads five structural tells and fires at warning with low confidence on 2 or more, because that is the level of certainty a structure-only check can responsibly claim. It does not run external corpus comparison — n-gram overlap against Wikipedia or review aggregators is deferred to a later version — so it cannot prove a page is regurgitated, only that it wears the fingerprint.

What it cannot do is read intent or licence. It sees Google-hosted photos, a Static Maps embed, and a wall of unsigned ratings, and it tells you the page looks like a redistribution layer with no proprietary value. Whether that is true depends on whether you added anything the API does not already give a reader for free — a judgment only your content can settle.

Frequently asked questions

Why does my legitimate local guide trip this rule?
Because two innocent signals can co-occur. A genuine guide that embeds a Static Map and quotes one Google review will trip 2 of the 5 tells even though it adds real editorial value. This is a low-confidence v1 heuristic — it reads structure, not substance, so it cannot see your original tasting notes or your on-the-ground reporting. Treat a finding as a prompt to confirm the page adds something the Places API does not, not as a verdict that it is spam.
Is embedding a Google Map a problem on its own?
No. One map embed is a single signal, and the rule needs at least 2 of the 5 to fire. Maps are a useful convenience and plenty of valuable pages use them. The pattern the heuristic is reaching for is a map plus Google-hosted photos plus lifted reviews plus no authorship. It is that combination, not the embed alone, that describes a page whose entire substance is borrowed from the API.
We run a tiki-bar directory that embeds Google reviews — how do we pass?
Add proprietary value the Places API does not hold, and the borrowed pieces stop defining the page. For a craft-cocktail lounge that means your own ranked verdict on its 8 rum flights, original photography of the mai tai and the hand-carved tiki mugs so your images outnumber googleusercontent ones, signed editorial notes on the bitters and garnish program, and a named byline with a published date. Keep one attributed review if you want, but make the page about your judgment, not a reskin of Google Maps.
Why is confidence low and severity only a warning?
Because a structure-only heuristic cannot prove regurgitation — it can only spot the fingerprint. A page that lifts everything from the API and a thoughtful local guide that happens to embed a map can look similar in markup, so the rule deliberately under-claims: warning severity, low confidence, and a 2-of-5 threshold chosen to surface the pattern without crying spam on every page with a map. A future version may add external corpus comparison to raise confidence; v1 stays honest about what markup alone can tell.
What counts as the aggregator-footprint signal exactly?
It fires when a page shows 5 or more elements carrying a star rating — Unicode stars, a numeric fraction like 4.6/5, or the literal word 'stars' — while exposing fewer than 2 of 3 E-E-A-T signals (an author, a published date, or an /about link). It is the shape of a review-aggregator page that republishes ratings at scale without taking responsibility for them. Add a byline and an /about page and this signal stops firing, because the page is no longer anonymous redistribution.
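
The aggregator-footprint check described in the last answer can be sketched as follows. This is an illustrative Python approximation, not the rule's actual code: the regular expression and the function signature are assumptions, and the sketch takes the text of each candidate element plus three pre-computed E-E-A-T booleans rather than parsing a page.

```python
import re

MIN_RATING_BLOCKS = 5  # 5 or more star-rating elements
MIN_EEAT = 2           # the signal is suppressed once 2 of 3 E-E-A-T signals show

# Unicode stars, a numeric fraction such as 4.6/5, or the literal word "stars".
STAR_RATING = re.compile(
    r"[★☆⭐]|\b\d(?:\.\d)?\s*/\s*5\b|\bstars\b", re.IGNORECASE
)


def aggregator_footprint(element_texts, has_author, has_published_date, has_about_link):
    """True when 5+ elements carry a rating and fewer than 2 E-E-A-T signals show."""
    rating_blocks = sum(1 for text in element_texts if STAR_RATING.search(text))
    eeat = sum([has_author, has_published_date, has_about_link])
    return rating_blocks >= MIN_RATING_BLOCKS and eeat < MIN_EEAT
```

With five review blocks each reading '4.6/5 stars' and no authorship at all, the function returns True; adding an author and an /about link brings the E-E-A-T count to 2 and it returns False.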

Want to know whether this rule actually fires on your site?

Run pseolint against your sitemap. The audit is free, takes about a minute, and returns a per-URL list of every rule that fired — including this one — with the exact metric values so you can prioritise the fix queue.