Hi,
My client’s website creates nested collection URLs like /collections/perte-de-cheveux/loreal, while filters produce URLs like /collections/perte-de-cheveux?filter.p.vendor=L%27Or%C3%A9al
I am not able to identify what creates the nested URLs for collections.
It’s a code thing: it seems the theme dev was not consistent about using either {{ product.url | within: collection }} or simply {{ product.url }}.
I think it’s generally accepted to stick to the simple URL, but I’m no SEO expert.
Some themes expose a setting in Theme Settings where this can be toggled, so try looking for that.
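To make the difference concrete, here is a rough sketch of how a theme might switch between the two link styles. The setting id use_collection_urls is made up for illustration; your theme may use a different name or have no such setting at all.

```liquid
{% comment %}
  Sketch only: "use_collection_urls" is a hypothetical setting id, not a
  standard Shopify setting. Check your theme's config/settings_schema.json.
{% endcomment %}
{% for product in collection.products %}
  {% if settings.use_collection_urls %}
    {% comment %} Collection-aware: /collections/<collection-handle>/products/<product-handle> {% endcomment %}
    <a href="{{ product.url | within: collection }}">{{ product.title }}</a>
  {% else %}
    {% comment %} Simple: /products/<product-handle> {% endcomment %}
    <a href="{{ product.url }}">{{ product.title }}</a>
  {% endif %}
{% endfor %}
```

Note that the within filter only affects product links. It always adds a /products/ segment, so it would not by itself produce a URL like /collections/perte-de-cheveux/loreal.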
If I go to the /collections/all URL, both products will be displayed.
If I go to the /collections/all/hair-loss URL (your case), only the hair-conditioner product will be displayed. The shampoo product does not have the hair-loss tag and is filtered out as a result.
If I go to the /collections/all/loreal URL, again, both products are displayed because both of them have the associated tag.
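For reference, tag links of this kind usually come from the link_to_tag filter somewhere in the theme's collection template. The snippet below is only a minimal sketch of that pattern, assuming it sits in a template where collection is available; I have not seen your theme's code, so the actual markup may look different.

```liquid
{% comment %} Runs inside a collection template, where "collection" is defined. {% endcomment %}
{% for tag in collection.all_tags %}
  {% comment %} Each link points to /collections/<collection-handle>/<tag-handle> {% endcomment %}
  {{ tag | link_to_tag: tag }}
{% endfor %}
```

Even if the theme no longer outputs such links, the tag URLs themselves keep resolving, because the storefront handles them regardless of the theme.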
Those links are normal and autogenerated by tags, as he mentioned. They also do not seem to be indexed anyway. Try googling site:<domain>/collections/<collection>/<tag>.
To add the custom rule, go to the code editor and create the robots.txt.liquid template in the templates folder if it is not already there. Then, paste the following code:
{% for group in robots.default_groups %}
{{- group.user_agent -}}
{% for rule in group.rules %}
{{- rule -}}
{% endfor %}
{%- if forloop.first -%}
Disallow: /collections/*/*
{% endif %}
{%- if group.sitemap != blank -%}
{{ group.sitemap }}
{%- endif -%}
{% endfor %}
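The rule inside the {% if forloop.first %} condition is the one that prevents crawlers from accessing the tagged collection URLs. Because of that condition, it is added only to the first user-agent group. Once the modifications are done, you can use the following tool to check whether crawlers can access those URLs or not!
Note 1: Be careful not to erase any modifications already present in the file.
Note 2: Check that this does not conflict with URLs you want indexed, and update accordingly! The pattern /collections/*/* also matches collection-aware product URLs such as /collections/<collection>/products/<product>.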
We will have to redirect many of these links beforehand, since they stay alive and drive a lot of traffic, even though they have not been generated by our theme since updating almost 2 years ago!