
ScrapeBox: Question about extracting URLs

Discussion in 'Black Hat SEO' started by CoyoteAssassin, May 4, 2011.

  1. CoyoteAssassin

    CoyoteAssassin Elite Member

    Joined:
    Jan 3, 2010
    Messages:
    1,862
    Likes Received:
    3,906
    Occupation:
    Full Time IMer
    Location:
    USA
    I have ScrapeBox with the Link Extractor addon installed. I imported my URL and started extracting internal URLs.

    But it only returned two results.

    All of the links in the HTML are root-relative paths like "/folder/folder2/page1.html". How do I get ScrapeBox to grab these links, since they are not full URLs?

    Thanks!
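
    For reference, links like "/folder/folder2/page1.html" can be turned into full URLs outside ScrapeBox by resolving them against the address of the page they came from. The following is only a minimal sketch using the Python standard library; the names `LinkCollector` and `internal_links` and the example.com addresses are made up for illustration, and this says nothing about how the Link Extractor addon works internally.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collect the raw href value of every <a> tag (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def internal_links(html, page_url):
    """Return de-duplicated absolute URLs that share page_url's host."""
    collector = LinkCollector()
    collector.feed(html)
    host = urlparse(page_url).netloc
    seen, result = set(), []
    for href in collector.hrefs:
        # urljoin resolves relative and root-relative paths against page_url
        absolute = urljoin(page_url, href)
        if urlparse(absolute).netloc == host and absolute not in seen:
            seen.add(absolute)
            result.append(absolute)
    return result
```

    Calling `internal_links(page_html, "http://example.com/index.html")` on a page that links to "/folder/folder2/page1.html" would yield the full address on example.com, which can then be imported into any tool that expects complete URLs.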
     
  2. cyberzilla

    cyberzilla Elite Member Premium Member

    Joined:
    Nov 15, 2009
    Messages:
    2,204
    Likes Received:
    3,364
    Location:
    zeta reticuli
    The Link Extractor addon sometimes acts weird for me too. When I choose "internal" I get only a few internal links, but when I choose "both" I get all of the internal links as well as the external ones. So try extracting both and see how it works.

    The other option is the SB Sitemap Extractor, if the site has a sitemap. This works perfectly for scraping all internal pages.

    If there is no sitemap, or if SB doesn't recognize it, studying the URL structure is the only way. Once you know the structure, it is pretty easy to grab the internal links you need. Search for the site URL with the "site:" operator and look at the URL structure. Most blog/site URLs would be like this...

    Year wise:
    So I use the custom footprint like this
    Category wise:
    so the footprint would be
    Hope it helps...
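
    The sitemap route mentioned above can also be done by hand when a tool fails to recognize the file. This is a rough sketch, assuming the sitemap follows the standard sitemaps.org XML format and has already been downloaded as text; `sitemap_urls` is a hypothetical helper name, not part of ScrapeBox.

```python
import xml.etree.ElementTree as ET

# Namespace used by standard sitemaps.org sitemap files
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def sitemap_urls(xml_text):
    """Return the text of every <loc> element in a sitemap document."""
    root = ET.fromstring(xml_text)
    return [
        loc.text.strip()
        for loc in root.iter(SITEMAP_NS + "loc")
        if loc.text
    ]
```

    Note that a sitemap index file also uses `<loc>` elements, so on an index this returns the addresses of the child sitemaps rather than page URLs; each of those would need to be fetched and parsed the same way.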
     
  3. CoyoteAssassin

    CoyoteAssassin Elite Member

    Thanks for the reply. Nothing new for me, but it is good to see that others are experiencing the same issue. I'm pretty good at building links and extracting them.

    I decided to spend 15 minutes manually building the category links and then loaded those into SB. SB then extracted the profile pages, which used full URLs.

    Thanks.