ScrapeBox: Question about extracting URLs

CoyoteAssassin · May 4, 2011

I have ScrapeBox, I've added the ScrapeBox Link Extractor Addon, imported my URL, and started extracting internal URLs.

But it only has two results.

All of the links in the HTML are relative, like "/folder/folder2/page1.html". How do I get ScrapeBox to grab these links, since they are not full URLs?

Thanks!

cyberzilla · May 4, 2011

The Link Extractor Addon sometimes acts weird for me. When I choose "internal", I only get a few internal links, but when I choose "both", I get all the internal links as well as the external ones. So try extracting both and see how it works.

The other option is to try the SB Sitemap Extractor, if the site has a sitemap. It works perfectly for scraping all internal pages.

If there is no sitemap, or if SB doesn't recognize it, studying the URL structure is the only way. Once you know the structure, it is pretty easy to grab the internal links you need. Search for the site URL with the "site:" operator and look at the URL structure. Most blog/site URLs look like this...

Year wise:

http://website.com/2011/
http://website.com/2010/ and so on..

So I use custom footprints like this:

site:http://website.com/2011/
site:http://website.com/2010/

Category wise:

http://dogsite.com/dogfood/
http://dogsite.com/dogtraining/
http://dogsite.com/dogallergies/

so the footprints would be:

site:http://dogsite.com/dogfood/
site:http://dogsite.com/dogtraining/
site:http://dogsite.com/dogallergies/
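
For anyone who would rather generate these footprints than type them, here is a minimal Python sketch. It is not a ScrapeBox feature; the function name build_footprints is made up for illustration, and it assumes you already know the section names (years or categories) from looking at the site.

```python
from urllib.parse import urljoin


def build_footprints(base_url, sections):
    """Return one 'site:' footprint per section path under base_url.

    build_footprints is a hypothetical helper, not part of ScrapeBox.
    """
    # Make sure the base ends with exactly one slash so urljoin appends to it.
    base = base_url.rstrip("/") + "/"
    return ["site:" + urljoin(base, section.strip("/") + "/") for section in sections]


if __name__ == "__main__":
    # Section names taken from the category example above.
    for footprint in build_footprints(
        "http://dogsite.com", ["dogfood", "dogtraining", "dogallergies"]
    ):
        print(footprint)
```

The printed lines can be saved to a text file and loaded as custom footprints.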

Hope it helps...

CoyoteAssassin · May 5, 2011

Thanks for the reply. Nothing new for me, but it is good to see that others are experiencing the same issue. I'm pretty good at building links and extracting them.

I decided to spend 15 minutes manually building the category links and then loaded those into SB. I then ran SB, which extracted the profile pages (which were full URLs).
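
If the manual step ever gets too tedious, turning relative links into full URLs can probably be scripted outside SB. Below is a rough Python sketch using only the standard library. It assumes you have the page HTML saved or pasted in; the names LinkCollector and internal_links are made up for illustration, and other.example is a placeholder host.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collect every <a href> on a page, resolved against the page URL."""

    def __init__(self, page_url):
        super().__init__()
        self.page_url = page_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if href:
            # urljoin turns "/folder/page.html" into a full URL.
            self.links.append(urljoin(self.page_url, href))


def internal_links(html, page_url):
    """Return unique links that stay on the same host as page_url."""
    collector = LinkCollector(page_url)
    collector.feed(html)
    host = urlparse(page_url).netloc
    unique = []
    for link in collector.links:
        if urlparse(link).netloc == host and link not in unique:
            unique.append(link)
    return unique


if __name__ == "__main__":
    sample = (
        '<a href="/folder/folder2/page1.html">one</a> '
        '<a href="http://other.example/x">external</a>'
    )
    for url in internal_links(sample, "http://website.com/index.html"):
        print(url)
```

The resulting full URLs can then be pasted into SB like any other list.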

Thanks.
