Search and Discovery - Unique URLs

Short description of issue

Search and Discovery - Unique URLs

Reproduction steps

N/A

Additional info

We have a client using Search and Discovery and on the collection/search page, for product URLs, it has the following on the end:

?_pos=2&_fid=6f211a8cb&_ss=c&variant=56756417986944

I understand the ?_pos is for analytics tracking for S&D. But we’re having an issue where Google Search is treating each of these links as a unique page, which is causing issues. I’ve clicked one of these links and checked the canonical on the resulting page: it’s correct, it contains the clean product URL.

So what could be causing this?

What type of topic is this

General discussion

Hi @Luke

That does sound unexpected. If the canonical is correct, that should guide search engines to index the main product page without extra parameters. What theme is being used here? Are you just seeing this with one merchant?

Hey @Liam-Shopify, it’s a custom theme, but I have checked and there’s nothing I can see which would cause this. The canonical is all set up correctly as mentioned, so I’m not sure what else would cause this.

Google can and will index what it finds relevant, no matter what you may recommend via the canonical URL. Of course this can hurt SEO but as is, nothing prevents Google crawlers from indexing the incorrect page. Here are two solutions:

  1. Preventing crawlers from accessing the parameterized URLs

For this, you would need to check the robots.txt.liquid template and update it to restrict bots from crawling the parameterized URLs altogether. It’s still not 100% safe, as crawlers could disregard the directive. It could also hurt discovery, as recommended products are a good way to speed up indexation.

  2. Handling the redirection client-side

Still not 100% indexing-proof but the best I got for now:

{% liquid
  # Split the URL into the clean path and the S&D tracking query string
  assign url_parts = product.url | split: '?'
  assign product_url = url_parts | first
  assign recommended_attributes = ''
  if url_parts.size > 1
    assign recommended_attributes = 'data-attributes="' | append: url_parts[1] | append: '"'
  endif
%}

<product-card>
  <a href="{{ product_url }}" {{ recommended_attributes }}>...</a>
</product-card>

Basically, from the perspective of most crawlers, the product card links to the regular product page, which removes the risk of indexing parameterized product URLs. On the client side, for example via web component logic, you can rebuild the full URL on click and navigate to it.

This way, the crawler is never bothered with parameters but actual traffic will train the S&D algorithm with real data :wink:
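For the click part, here is a rough sketch of what I mean, not a definitive implementation. It assumes the markup from the snippet above (a `<product-card>` wrapping a link that carries the query string in `data-attributes`). The `buildTrackedUrl` helper name is made up for illustration, and I use a delegated click listener instead of a full web component to keep it short.

```typescript
// Hypothetical helper: joins the clean product URL with the tracking query
// string that the Liquid snippet stored in the data-attributes attribute.
function buildTrackedUrl(cleanHref: string, trackingQuery: string | null): string {
  if (!trackingQuery) {
    return cleanHref; // nothing to restore, keep the clean URL
  }
  const separator = cleanHref.indexOf("?") === -1 ? "?" : "&";
  return cleanHref + separator + trackingQuery;
}

// Browser-only wiring: restore the parameters just before navigation.
if (typeof document !== "undefined") {
  document.addEventListener("click", (event) => {
    const target = event.target as Element | null;
    const link = target?.closest("product-card a[data-attributes]");
    if (!link) {
      return;
    }
    const cleanHref = link.getAttribute("href") ?? "";
    const trackingQuery = link.getAttribute("data-attributes");
    link.setAttribute("href", buildTrackedUrl(cleanHref, trackingQuery));
    // Remove the marker so a second click does not append the query twice.
    link.removeAttribute("data-attributes");
  });
}
```

Since the href is rewritten inside the click handler, the browser should follow the parameterized URL while the rendered HTML only ever contains the clean one. Middle-click and "open in new tab" go through other events, so they would need their own handling.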

Hope it helps!

Thanks man.

But the issue here is that Search and Discovery is outputting these out of the box. Surely this is not wise if it causes these issues?

It’s basically the same as this topic. Google is the only judge of what should be indexed or not. Relying on tricks to prevent the dumb AI from crawling and/or indexing the wrong content is a must in this day and age.

From my experience, stripping S&D parameters and reintroducing them client-side works pretty well. Is this ideal? Not really. But Google crawlers really are dumb sometimes. And once the damage is done, it can hurt.