
Spammy Internal Links

Discussion in 'White Hat SEO' started by tasburrfoot, Oct 22, 2016.

  1. tasburrfoot

    tasburrfoot Regular Member

    Joined:
    Dec 16, 2008
    Messages:
    323
    Likes Received:
    152
    Got my crawl results back from Google, and the internal links look kind of spammy to me. Pic attached for the actual numbers, but it's something like 30k links to one page, 22k to another, 11k, lots of 10k's, and pretty much every page on the site has 2,000+ internal links pointing to it.

    Does this harm my rankings?
     


  2. marzipaan

    marzipaan Newbie

    Joined:
    Oct 22, 2016
    Messages:
    6
    Likes Received:
    1
    Gender:
    Male
    It's okay; most authority sites would have them. You could nofollow them if you are worried.
     
  3. davids355

    davids355 Jr. VIP Jr. VIP

    Joined:
    Apr 25, 2011
    Messages:
    9,848
    Likes Received:
    7,468
    How big is your site?
     
  4. tasburrfoot

    tasburrfoot Regular Member

    It's an e-commerce site with about 20 static pages and 90 products, so nowhere near that large. My assumption is that most of those links are being generated via the "search" function on the site.

    Would it be better to configure robots.txt to disallow bots/crawlers from going through the search?
     
  5. davids355

    davids355 Jr. VIP Jr. VIP

    Strange. How would robots index search though? Surely you have to type in a query to get to a results page?
     
  6. tasburrfoot

    tasburrfoot Regular Member

    Yeah, but they definitely crawl through the search. Some of my search results are popping up in the SERPs. I think it's because every query gets automatically saved as a "recommended" item, which may be prompting the bot to go through them?

    Not sure about the crawling behavior, but I guess I could just trial-and-error this and post my results. I'll block robots from using the search function, reindex everything, and see how things go!
     
  7. davids355

    davids355 Jr. VIP Jr. VIP

    OK, that makes sense. So it's the recommended function. Yeah, you should probably nofollow/noindex those recommended links/pages. That will solve your problem and also probably stop a load of pages from getting indexed and looking like duplicate content.
     
  8. tasburrfoot

    tasburrfoot Regular Member

    Yeah, after reading around I found this link for Magento (which I'm using):
    http://inchoo.net/ecommerce/ultimate-magento-robots-txt-file-examples/

    None of the SEO resources I've come across for Magento optimization have ever mentioned modifying the robots.txt, but apparently leaving it untouched will hurt your results, as you mentioned above, by creating duplicate content, incredibly spammy internal linking, etc.

    This also looks like it will fix a couple of other issues I've come across while using Magento. For some reason Google breaks href="/page.htm": instead of the expected result of going to "domain.com/page.htm", it actually appends it like so, "domain.com/currentpage.htm/page.htm", and thus returns 404s in my Crawl Errors. No clue why, but it's been a pain in the ass writing out all of these links in full as they come up.
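    For anyone trying to reproduce the appended-URL 404s: standard URL resolution never appends a root-relative href like "/page.htm" to the current page, so one possible cause (an assumption, nothing in this thread confirms it) is that the rendered link has no leading slash and the crawled URL ends in a slash. A minimal sketch with Python's urljoin, using the placeholder domain from the post:

```python
from urllib.parse import urljoin

# Placeholder URLs for illustration; domain.com stands in for the real store.
page_as_file = "http://domain.com/currentpage.htm"
page_as_dir = "http://domain.com/currentpage.htm/"  # same page, trailing slash

# Root-relative href (leading slash): always resolved from the site root.
root_relative = urljoin(page_as_dir, "/page.htm")

# Document-relative href (no leading slash) on a URL ending in "/":
# the href is appended to the current path, which matches the 404 pattern.
doc_relative_dir = urljoin(page_as_dir, "page.htm")

# Same document-relative href on the URL without the trailing slash.
doc_relative_file = urljoin(page_as_file, "page.htm")

print(root_relative)      # http://domain.com/page.htm
print(doc_relative_dir)   # http://domain.com/currentpage.htm/page.htm
print(doc_relative_file)  # http://domain.com/page.htm
```

    If the page source really does contain the leading slash, the cause is something else (a base tag or a rewrite rule, for example), and this sketch does not apply.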

    The robots.txt I'm testing:
    Code:
    # Google Image Crawler Setup
    User-agent: Googlebot-Image
    Disallow:
    
    # Crawlers Setup
    User-agent: *
    
    # Directories
    Disallow: /404/
    Disallow: /app/
    Disallow: /cgi-bin/
    Disallow: /downloader/
    Disallow: /errors/
    Disallow: /includes/
    #Disallow: /js/
    #Disallow: /lib/
    Disallow: /magento/
    #Disallow: /media/
    Disallow: /pkginfo/
    Disallow: /report/
    Disallow: /scripts/
    Disallow: /shell/
    Disallow: /skin/
    Disallow: /stats/
    Disallow: /var/
    
    # Paths (clean URLs)
    Disallow: /index.php/
    Disallow: /catalog/product_compare/
    Disallow: /catalog/category/view/
    Disallow: /catalog/product/view/
    Disallow: /catalogsearch/
    #Disallow: /checkout/
    Disallow: /control/
    Disallow: /contacts/
    Disallow: /customer/
    Disallow: /customize/
    Disallow: /newsletter/
    Disallow: /poll/
    Disallow: /review/
    Disallow: /sendfriend/
    Disallow: /tag/
    Disallow: /wishlist/
    Disallow: /catalog/product/gallery/
    
    # Files
    Disallow: /cron.php
    Disallow: /cron.sh
    Disallow: /error_log
    Disallow: /install.php
    Disallow: /LICENSE.html
    Disallow: /LICENSE.txt
    Disallow: /LICENSE_AFL.txt
    Disallow: /STATUS.txt
    
    # Paths (no clean URLs)
    #Disallow: /*.js$
    #Disallow: /*.css$
    Disallow: /*.php$
    Disallow: /*?SID=
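
    A quick way to sanity-check rules like these before reindexing is Python's built-in robots.txt parser. This is only a rough sketch: the rule subset is trimmed from the file above, the test URLs are made up, and the stdlib parser does plain prefix matching, so it will not evaluate wildcard lines such as /*.php$ the way Google does. It also treats a blank line as the end of a record, which is why the group below is written without blank lines.

```python
from urllib.robotparser import RobotFileParser

# Trimmed subset of the rules above, written without blank lines because
# urllib.robotparser ends a record at the first blank line.
RULES = """\
User-agent: *
Disallow: /catalogsearch/
Disallow: /catalog/product/view/
Disallow: /customer/
"""


def is_allowed(rules, url, agent="*"):
    """Return True if `agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser.can_fetch(agent, url)


# Hypothetical URLs; substitute real ones from the crawl report.
search_url = "http://domain.com/catalogsearch/result/?q=widget"
product_url = "http://domain.com/some-product.html"

print(is_allowed(RULES, search_url))   # False: search results are blocked
print(is_allowed(RULES, product_url))  # True: product pages stay crawlable
```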
     
  9. tasburrfoot

    tasburrfoot Regular Member

    Just an update on this: you also have to allow /skin/, so either remove that Disallow line or comment it out with #. Otherwise Google considers your whole site mobile-unfriendly, presumably because it can't fetch the theme's CSS from that directory.
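
    To catch this kind of thing before Google does, a small check along these lines lists which theme assets a rule set would block. It is a sketch under assumptions: the asset paths are hypothetical examples of where a Magento theme might keep its files, the two rule sets are cut down to a couple of lines each, and the stdlib parser only does prefix matching.

```python
from urllib.robotparser import RobotFileParser


def blocked_assets(rules, asset_urls, agent="Googlebot"):
    """Return the asset URLs that `agent` may NOT fetch under `rules`."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return [url for url in asset_urls if not parser.can_fetch(agent, url)]


# Hypothetical asset URLs; replace with real ones from the page source.
ASSETS = [
    "http://domain.com/skin/frontend/default/default/css/styles.css",
    "http://domain.com/js/prototype/prototype.js",
    "http://domain.com/media/catalog/product/example.jpg",
]

# Cut-down rule sets: with and without the /skin/ line.
BEFORE = "User-agent: *\nDisallow: /skin/\nDisallow: /catalogsearch/\n"
AFTER = "User-agent: *\nDisallow: /catalogsearch/\n"

print(blocked_assets(BEFORE, ASSETS))  # only the /skin/ stylesheet
print(blocked_assets(AFTER, ASSETS))   # []
```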