Spammy Internal Links

tasburrfoot

Got my crawl results back from Google, and the Internal Links report looks kind of spammy to me. Pic attached for actual numbers, but it's like 30k links to one page, then 22k, 11k, lots of 10k's; pretty much every page on the site has 2000+ internal links pointing to it.

Does this harm my rankings?
 

Attachments

  • internal links.png
It's okay, most authority sites have them. You could nofollow them if you're worried.
 
How big is your site?
 
It's an e-commerce site with about 20 static pages and 90 products, so nowhere near that large. My assumption is that most of those links are being generated by the "search" function on the site.

Would it be better to configure robots.txt to disallow bots/crawlers from going through the search?
 
Strange. How would robots index search though? Surely you have to type in a query to get to a results page?
 
Yeah, but they definitely crawl through the search. Some of my search results are popping up in the SERPs. I think it's because every query gets automatically saved as a "recommended" item, which may be prompting the bot to go through them?

Not sure on the crawling behavior, but I guess I could just trial-and-error this and post my results. I'll block robots from using the search function, reindex everything, and see how things go!
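
For anyone who wants to sanity-check a rule like this before waiting on a recrawl, here is a minimal Python sketch using the standard library's `urllib.robotparser`. The rule set, the `example.com` URLs, and the `is_crawlable` helper are illustrative assumptions, not taken from the actual site; `/catalogsearch/` is the search path that shows up in the Magento robots.txt later in the thread. As far as I know, the standard-library parser only does simple prefix matching, so it will not mirror Google's handling of `*` and `$` wildcards.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rule set: block only the on-site search results.
SEARCH_RULES = """\
User-agent: *
Disallow: /catalogsearch/
"""


def is_crawlable(rules: str, url: str, agent: str = "Googlebot") -> bool:
    """Return True if `agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    for url in (
        "http://example.com/catalogsearch/result/?q=widgets",  # search result page
        "http://example.com/some-product.html",                # ordinary product page
    ):
        status = "allowed" if is_crawlable(SEARCH_RULES, url) else "blocked"
        print(url, "->", status)
```

This only tells you what the rules say; how quickly Google drops already-indexed search pages is a separate question.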
 
what are some other ways to monetize videos if you got booted from adsense? and don't the adblockers stop adsense?


OK, that makes sense. So it's the recommended function. Yeah, you should probably nofollow/noindex those recommended links/pages. That will solve your problem and probably also stop a load of pages from getting indexed and looking like duplicate content.
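
To check whether a given page actually carries a noindex directive, something like the following Python sketch could help. It parses an HTML string with the standard library's `html.parser` and collects whatever is in the `<meta name="robots">` tag. The `RobotsMetaParser` class, the `robots_directives` helper, and the sample markup are all made up for illustration. One thing to keep in mind: a crawler has to be able to fetch the page to see the tag, so a URL that is also blocked in robots.txt will not have its noindex read.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() == "robots":
            # content is a comma-separated list such as "noindex, follow"
            for part in attr.get("content", "").split(","):
                if part.strip():
                    self.directives.add(part.strip().lower())


def robots_directives(html):
    """Return the set of robots meta directives found in an HTML document."""
    parser = RobotsMetaParser()
    parser.feed(html)
    parser.close()
    return parser.directives


# Hypothetical markup for a "recommended" search page.
SAMPLE_PAGE = (
    '<html><head><meta name="robots" content="NOINDEX, follow"></head>'
    "<body>results</body></html>"
)

if __name__ == "__main__":
    print("noindex" in robots_directives(SAMPLE_PAGE))
```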
 
Yeah, after reading around I found this link for Magento (which I'm using):
http://inchoo.net/ecommerce/ultimate-magento-robots-txt-file-examples/

None of the SEO resources I've come across for Magento optimization have mentioned anything about modifying the robots.txt, but apparently leaving it alone will hurt your results by, as you mentioned above, creating duplicate content, incredibly spammy internal linking, etc.

This also looks like it will fix a couple of other issues I've come across while using Magento. For some reason Google breaks href="/page.htm": instead of the expected result of going to "domain.com/page.htm", it actually appends it like so, "domain.com/currentpage.htm/page.htm", and thus returns 404s in my Crawl Errors. No clue why, but it's been a pain in the ass elongating all of these links as they come up.
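
One possible explanation for the appended URLs, and this is only a guess since the thread never confirms the cause: somewhere a link is emitted without the leading slash, and it gets resolved against a URL that ends in a slash. The Python sketch below uses `urllib.parse.urljoin` to show how standard relative-URL resolution treats the two cases; the `resolve` helper is a hypothetical wrapper and the URLs are the placeholder ones from the post above.

```python
from urllib.parse import urljoin


def resolve(base: str, href: str) -> str:
    """Resolve `href` against `base` the way a browser or crawler would."""
    return urljoin(base, href)


if __name__ == "__main__":
    # Root-relative link: the leading slash resets the path to the site root.
    print(resolve("http://domain.com/currentpage.htm", "/page.htm"))

    # Document-relative link on a URL with a trailing slash:
    # the href is appended to the current path instead.
    print(resolve("http://domain.com/currentpage.htm/", "page.htm"))
```

If the 404s match the second pattern, checking the page source for hrefs without a leading slash (or for a stray `<base>` tag) might be quicker than rewriting links one by one.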

The robots.txt I'm testing:
Code:
# Google Image Crawler Setup
User-agent: Googlebot-Image
Disallow:

# Crawlers Setup
User-agent: *

# Directories
Disallow: /404/
Disallow: /app/
Disallow: /cgi-bin/
Disallow: /downloader/
Disallow: /errors/
Disallow: /includes/
#Disallow: /js/
#Disallow: /lib/
Disallow: /magento/
#Disallow: /media/
Disallow: /pkginfo/
Disallow: /report/
Disallow: /scripts/
Disallow: /shell/
Disallow: /skin/
Disallow: /stats/
Disallow: /var/

# Paths (clean URLs)
Disallow: /index.php/
Disallow: /catalog/product_compare/
Disallow: /catalog/category/view/
Disallow: /catalog/product/view/
Disallow: /catalogsearch/
#Disallow: /checkout/
Disallow: /control/
Disallow: /contacts/
Disallow: /customer/
Disallow: /customize/
Disallow: /newsletter/
Disallow: /poll/
Disallow: /review/
Disallow: /sendfriend/
Disallow: /tag/
Disallow: /wishlist/
Disallow: /catalog/product/gallery/

# Files
Disallow: /cron.php
Disallow: /cron.sh
Disallow: /error_log
Disallow: /install.php
Disallow: /LICENSE.html
Disallow: /LICENSE.txt
Disallow: /LICENSE_AFL.txt
Disallow: /STATUS.txt

# Paths (no clean URLs)
#Disallow: /*.js$
#Disallow: /*.css$
Disallow: /*.php$
Disallow: /*?SID=
 
Just an update on this: you also have to allow /skin/, so either remove that Disallow line or comment it out with #. Otherwise Google considers your whole site mobile-unfriendly.
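
A rough way to catch this kind of thing up front is to run the theme's CSS/JS URLs through the rules and list anything that ends up blocked. The Python sketch below does that with `urllib.robotparser`. The trimmed rule sets, the asset paths, and the `blocked_assets` helper are illustrative assumptions (the real asset URLs depend on the theme), and the parser only understands plain prefix rules, not the `*`/`$` patterns near the bottom of the file above.

```python
from urllib.robotparser import RobotFileParser

# Trimmed, hypothetical excerpts of the rules: with and without /skin/ blocked.
RULES_WITH_SKIN = """\
User-agent: *
Disallow: /catalogsearch/
Disallow: /skin/
"""

RULES_WITHOUT_SKIN = """\
User-agent: *
Disallow: /catalogsearch/
#Disallow: /skin/
"""

# Hypothetical asset URLs a page might need in order to render properly.
ASSETS = [
    "http://example.com/skin/frontend/default/default/css/styles.css",
    "http://example.com/js/prototype/prototype.js",
]


def blocked_assets(rules, urls, agent="Googlebot"):
    """Return the URLs from `urls` that `agent` may not fetch under `rules`."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return [url for url in urls if not parser.can_fetch(agent, url)]


if __name__ == "__main__":
    print("with /skin/ blocked:", blocked_assets(RULES_WITH_SKIN, ASSETS))
    print("with /skin/ allowed:", blocked_assets(RULES_WITHOUT_SKIN, ASSETS))
```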
 