Duplicate content issue with ecommerce site?

Discussion in 'Black Hat SEO' started by 1337WuLF, Jul 21, 2016.

  1. 1337WuLF

    1337WuLF Senior Member Premium Member

    Joined:
    Jan 3, 2011
    Messages:
    951
    Likes Received:
    378
    Occupation:
    IM
    Location:
    On a rock in space
    Hi guys,

    I have an issue with a client's site.

    They have over 600 products. The problem is that Google has indexed all of their products more than once, under different URLs, even though I have a rel=canonical tag pointing to the correct product pages.

    For example:

    Code:
    (Correct URL) https://domainname.com/commercial-refrigeration/display-fridge/2000mm-wide-cake-display-fridge/ 
    
    https://domainname.com/product/2000mm-wide-cake-display-fridge 
    
    https://domainname.com/shop/2000mm-wide-cake-display-fridge
    
    https://domainname.com/2000mm-wide-cake-display-fridge 
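
    To be clear, every one of those pages carries the same canonical markup in its <head>, pointing back at the long-form URL. Roughly this, using the example product above:

    Code:
    <link rel="canonical" href="https://domainname.com/commercial-refrigeration/display-fridge/2000mm-wide-cake-display-fridge/" />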

    Is this a problem since I am using the rel=canonical tag? Is it going to affect the site's rankings?

    Ideally, I'd like Google to index only the URLs I've specified in the canonical tags.

    If this IS a problem, how can I go about fixing it?

    Should I just 301 redirect all of these incorrect links Google has indexed?
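
    If redirects are the answer, I'm assuming it would be something like the following .htaccess rules on an Apache host (or the equivalent in a redirection plugin), one set per product. Just a sketch using the example product above:

    Code:
    # Illustrative only: send the duplicate paths to the canonical product URL.
    # In WordPress these would sit above the standard WP rewrite block.
    RewriteEngine On
    RewriteRule ^product/2000mm-wide-cake-display-fridge/?$ /commercial-refrigeration/display-fridge/2000mm-wide-cake-display-fridge/ [R=301,L]
    RewriteRule ^shop/2000mm-wide-cake-display-fridge/?$ /commercial-refrigeration/display-fridge/2000mm-wide-cake-display-fridge/ [R=301,L]
    RewriteRule ^2000mm-wide-cake-display-fridge/?$ /commercial-refrigeration/display-fridge/2000mm-wide-cake-display-fridge/ [R=301,L]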


    Cheers!
    Aaron
     
  2. 1337WuLF

    1337WuLF Senior Member Premium Member

    Joined:
    Jan 3, 2011
    Messages:
    951
    Likes Received:
    378
    Occupation:
    IM
    Location:
    On a rock in space
    Is anyone able to help me out with this?
     
  3. 1337WuLF

    1337WuLF Senior Member Premium Member

    Joined:
    Jan 3, 2011
    Messages:
    951
    Likes Received:
    378
    Occupation:
    IM
    Location:
    On a rock in space
    Bump, need to get this sorted out
     
  4. Kronic

    Kronic Registered Member

    Joined:
    Apr 11, 2010
    Messages:
    73
    Likes Received:
    20
    Occupation:
    Nismo Whore
    Location:
    Within a small corner in the back shadows of my ab
    bump and sub'd for more info.

    I'm having a very similar situation and have not been able to find any answers to date.
     
  5. 1337WuLF

    1337WuLF Senior Member Premium Member

    Joined:
    Jan 3, 2011
    Messages:
    951
    Likes Received:
    378
    Occupation:
    IM
    Location:
    On a rock in space
    I've chosen to trust that the canonical tag will be sufficient; however, more insight on this would be greatly appreciated.
     
  6. Unreliable Witness

    Unreliable Witness Regular Member

    Joined:
    Apr 21, 2016
    Messages:
    294
    Likes Received:
    154
    Do you have access to a coding team? If so, you could add the robots directive "follow, noindex" to the pages that you don't want to show.
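
    On each unwanted URL that's just the standard robots meta tag in the <head>, something like:

    Code:
    <meta name="robots" content="noindex, follow">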

    You could also maybe sort it out in robots.txt using wildcards. For example, block everything under /shop/ and /product/, plus anything containing *fridge*.
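
    A rough robots.txt sketch of that idea (Googlebot supports * wildcards, and # comments are fine):

    Code:
    User-agent: *
    Disallow: /shop/
    Disallow: /product/
    # Careful with a pattern like /*fridge* - it would also match the canonical
    # /commercial-refrigeration/display-fridge/... URLs, so the root-level
    # duplicates probably need to be handled individually.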

    You will then need Google to crawl all the bad pages again so it sees the noindex. The best way to do this is to Fetch as Google each page in WMT. If there are a lot of pages, you could do a bulk fetch by getting Google to crawl all the links on a page you create, but there seems to be a limit of about 200 links that Google will follow, so you might need to do this in batches. Doing it page by page is time-consuming, but seems to work better.
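
    The page you create for a bulk fetch can be as basic as a plain HTML list of the bad URLs, kept under that ~200-link limit per batch:

    Code:
    <!-- one batch of duplicate URLs for Google to discover and recrawl -->
    <ul>
      <li><a href="https://domainname.com/product/2000mm-wide-cake-display-fridge">dupe</a></li>
      <li><a href="https://domainname.com/shop/2000mm-wide-cake-display-fridge">dupe</a></li>
      <!-- ...and so on -->
    </ul>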

    You will then need to redo this work in a month, because pages don't drop immediately. On a client site with similar issues, it took nine months for all the pages to be requested often enough for Google to realise it should stop crawling them. Google doesn't deindex the first time around in case you have made a mistake. Counter-intuitively, you can also throw social signals at the pages you want to deindex, just as you would at a page you want indexed: the goal is to get Google to visit those pages as often as possible so it sees that it should drop them.

    Hope this helps.
     
  7. Unreliable Witness

    Unreliable Witness Regular Member

    Joined:
    Apr 21, 2016
    Messages:
    294
    Likes Received:
    154
    Sorry, a correction on the robots.txt point: you don't want to just block those URLs, you want them noindexed as well.
     
  8. 1337WuLF

    1337WuLF Senior Member Premium Member

    Joined:
    Jan 3, 2011
    Messages:
    951
    Likes Received:
    378
    Occupation:
    IM
    Location:
    On a rock in space
    Thanks so much for the reply.

    Do you think that is necessary, though, given that I am using the rel=canonical tag, which points to the correct page?

    I suppose my question is: if I do nothing about this, is it going to affect my client's rankings?
     
  9. RightFootFanatic

    RightFootFanatic Regular Member

    Joined:
    May 31, 2015
    Messages:
    348
    Likes Received:
    194
    Occupation:
    DevOps
    Location:
    Whimsyshire
    Which shop system?
    Are those sites all accessible?
    Are the sites all in the sitemap.xml?
    Are the URLs in the canonical tags absolute or relative?
    Is there only one canonical tag in the pages? (check source)
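
    For the last two, a quick way to check from a shell (assuming curl and grep are available):

    Code:
    # Print every canonical link tag on a page - there should be exactly one,
    # with an absolute href.
    curl -s https://domainname.com/product/2000mm-wide-cake-display-fridge | grep -io '<link[^>]*rel="canonical"[^>]*>'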
     
  10. 1337WuLF

    1337WuLF Senior Member Premium Member

    Joined:
    Jan 3, 2011
    Messages:
    951
    Likes Received:
    378
    Occupation:
    IM
    Location:
    On a rock in space
    WooCommerce.
    If by sites you mean pages, yeah, all of the duplicate URLs are accessible.
    The incorrect URLs are not in the sitemap, the correct URLs are in the sitemap.
    Not sure what you mean by this question, sorry. Are you able to explain?
    Yep, just one tag.

    Thanks!