[SCRAPEBOX]How To Harvest "Pages With No Comments" From A Blog/Site

cyberzilla

Elite Member
Nov 15, 2009
If you are hunting for high-PR pages to drop comments on for backlinks, you will usually come across many do-follow, auto-approved high-PR pages that are spammed to death. Most people either ignore those pages or simply drop their comments on them even though they are already flooded with spam. Don't do that again! Instead, harvest other pages from the same blog, because you will find many hidden high-PR pages with fewer outbound links (OBL).

I'm gonna show you how I harvest pages from a blog using Scrapebox (not the typical way). The common footprint is "site:websitename.com", but this scrapes ALL pages, including pages where we can't add comments. So we need to fine-tune the footprint to find ONLY pages that allow comments. For demonstration purposes, I'm gonna use the page below. It is a PR1, auto-approved, do-follow page:

Code:
http://geekswithblogs.net/tkokke/archive/2009/05/31/sending-email-from-php-on-windows-using-iis.aspx
Let's take a closer look at the comment section in that page:

[Image: nadap065710640930232.jpg, screenshot of the comment section]


Two phrases are common to these pages: "Post A Comment" and "Enter the code shown above". Let's use them as footprints. Here are my final footprints:

Code:
site:geekswithblogs.net "Post A Comment" 

site:geekswithblogs.net "Enter the code shown above"
These footprints harvest only pages with the comment section enabled, which saves you a hell of a lot of time! Use them as a "Custom Footprint" and harvest all the pages with Scrapebox. Then remove the high-OBL pages using the "OutBound Link Checker" add-on.
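
If you want to reuse the same idea on other blogs, the footprints can be generated instead of typed by hand. This is only a rough sketch: the function name `build_footprints` is made up for illustration and is not part of Scrapebox; the output is plain text you would paste in as custom footprints.

```python
def build_footprints(domain, phrases):
    """Return one 'site:domain "phrase"' query per non-empty phrase.

    build_footprints is a hypothetical helper, not a Scrapebox feature.
    """
    queries = []
    for phrase in phrases:
        phrase = phrase.strip()
        if phrase:  # skip blank entries
            queries.append(f'site:{domain} "{phrase}"')
    return queries


if __name__ == "__main__":
    comment_form_phrases = ["Post A Comment", "Enter the code shown above"]
    for query in build_footprints("geekswithblogs.net", comment_form_phrases):
        print(query)
```

Each printed line has the same shape as the footprints in the Code box above.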

You may ask, is there a way to find pages with ZERO comments?

The answer is yes! Find a recent post (make sure it has NO comments) and look for a common phrase on that page. For example, here is a recent post from that blog with no comments:

Code:
http://geekswithblogs.net/tkokke/archive/2010/11/23/html5-ndash-an-introduction---code-and-slides.aspx
Now let's check the comment section:

[Image: nadap065710133361232.jpg, screenshot of the comment section]


If a page has no comments, you will find the phrase "No comments posted yet". So you just need to harvest using the footprint below:

Code:
site:geekswithblogs.net "No comments posted yet"
Use the above as a "Custom Footprint" and harvest all the pages with Scrapebox. I just scraped all those no-comment pages for you. Here they are in a txt file, sorted by PR:

Code:
http://www.mediafire.com/?jzkwuwb0sjw3zil
I tried my level best to explain this despite my bad English, so if you don't understand something, just let me know.
 
Great post OP. Thanks for helping out on this, I hope it will help lots of people here.

+Rep given :)
 
Glad you like it! The footprint "No comments posted yet" is not used by all blogs/sites, so you may come across many different phrases. For example:

1. Why not be the first to have your say
2. There are currently no comments for this post

You can build your own list of footprints like this to find fresh pages to drop your backlinks on!
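
One way to keep such a list handy is a small phrase check you can run against the text of a page you have already saved. This is just a sketch under my own assumptions: the names `NO_COMMENT_PHRASES` and `has_no_comments` are made up, and the list only contains the three phrases mentioned in this thread.

```python
# Phrases taken from this thread; extend the list with whatever you find.
NO_COMMENT_PHRASES = [
    "No comments posted yet",
    "Why not be the first to have your say",
    "There are currently no comments for this post",
]


def has_no_comments(page_text, phrases=NO_COMMENT_PHRASES):
    """Return True if the page text contains any 'no comments' phrase.

    The match is case-insensitive. has_no_comments is a hypothetical helper.
    """
    text = page_text.lower()
    return any(phrase.lower() in text for phrase in phrases)


if __name__ == "__main__":
    sample = "<div class='comments'>No comments posted yet.</div>"
    print(has_no_comments(sample))
```

A plain substring match like this will also fire if a commenter happens to type one of the phrases, so treat it as a rough filter only.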
 
Greaaat idea. I just checked some high-PR sites I have collected with a lot of OBL, and I found other pages with far fewer OBL, or even none, in the comments.

I thought about this before, but I thought "naahh, it takes too long to check all the inner pages with these slow ass online tools". Guess I was too dumb to use Scrapebox to do it myself :D
 
LOL, here is another tip for you! Pages with few comments don't hurt :) So let's say you want to find pages with only one comment. Try this footprint:

Code:
site:blog.mozilla.com "Comments: 1"
Again, "Comments: 1" is not used by all blogs/sites. It may vary a little, like "Comments(1)" or "1 comment so far"... hope you get my point :D
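
Since the wording varies from blog to blog, you can generate a few variants for a given comment count and try each one. A minimal sketch, assuming only the three wordings mentioned in this post; the function names are hypothetical and real blogs may phrase it differently.

```python
def comment_count_variants(n):
    """Return a few common ways a blog might display a comment count of n.

    Only the wordings mentioned in the thread are covered.
    """
    noun = "comment" if n == 1 else "comments"
    return [
        f"Comments: {n}",
        f"Comments({n})",
        f"{n} {noun} so far",
    ]


def build_count_queries(domain, n):
    """Return one 'site:domain "variant"' query per count variant."""
    return [f'site:{domain} "{variant}"' for variant in comment_count_variants(n)]


if __name__ == "__main__":
    for query in build_count_queries("blog.mozilla.com", 1):
        print(query)
```

Check which variant a blog actually uses by looking at one of its posts first, then keep only that footprint.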
 
Code:
"No comments posted yet"
Is a bit non-specific.
It's better to really define what you want by using the allintext operator.

Code:
site:http://someurl.com +allintext:"No comments posted yet"
Maybe add a keyword via insubject

Code:
site:http://someurl.com +allintext:"No comments posted yet" +insubject:"your keywords"
Otherwise Google will also try to find URLs that match it, or will turn up partial matches.
This way you sift out the useless results, which saves you time scraping and might yield more useful results, taking the 1000-result limit into account.

Very good idea, will try it today.
 
Excellent, thanks for the tip. I've been trying to figure out how to do something like that but just couldn't 'get it'. This method should also help cure the problem of 'harvesting by keywords found in the spam/comments'.

My niche doesn't have too many keyword options, and I keep harvesting pages that have already been commented on by people using those keywords to describe their own niche. In my case, they are usually 'selling' stuff that people like me 'actually make'. So instead of getting pure blog-relevant links, I'm getting comment-relevant links.

So what I'm going to try is your method, then re-harvest the list looking for my niche keywords. I might have to do it the other way round, but it's certainly going to be better than what I'm doing at the moment.

Merry Christmas to you sir. Great gift is your tip.
 
Code:
"No comments posted yet"...
 
Ahhhh!!! Ha ha, of course... Thanks very much. Shame I didn't see this before posting my reply a couple of minutes ago. The system was still processing, I guess. This is excellent, just what I need.
 
I tried that footprint, but Google doesn't show any results when I use the allintext operator. Not sure if I'm using the operator correctly!

Code:
site:geekswithblogs.net +allintext:"No comments posted yet"
But the same query works when I use the "intext" operator:

Code:
site:geekswithblogs.net +intext:"No comments posted yet"
 
Good job thinking this through and then finding the proper indicators. Very helpful info indeed. Thanks for sharing your discovery.
 
Very, very nice. I was actually too lazy to try doing this with slower methods, but I've got newfound motivation. =D
 