
[SB] Scrape all pages from blogs, even the ones not indexed in search engines

Discussion in 'Black Hat SEO' started by andreyg13, Feb 22, 2011.

  1. andreyg13

    andreyg13 Jr. VIP Jr. VIP

    Joined:
    Nov 13, 2009
    Messages:
    915
    Likes Received:
    1,774
    Occupation:
    SEO
    Location:
    http://seoshark.org
    Here is a quick one that I thought might come in handy:

    1. Load a huge list of domains into your keyword list, one per line, using the following syntax:
    site:yoursite1.com
    site:yoursite2.com
    .............
    2. In the custom footprint, add inurl:sitemap.xml
    3. Hit harvest.
    4. The new version has removed the "remove duplicates" radio button and replaced it with remove/filter.
    There you will find an option called "remove urls not containing".
    Use it and enter sitemap.xml so that only the sitemap URLs stay in the list.
    5. Load up the addon called sitemap scraper.
    6. Import from the scrapebox harvester and hit start.
    7. Export the URLs to a file.

    Voila, there you have all the pages from all your blogs.
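
    For anyone curious what steps 4 to 6 boil down to, here is a rough Python sketch. It is not how ScrapeBox or the sitemap scraper addon works internally; the function names and the sample sitemap are made up for illustration. It only assumes a standard sitemap.xml, where every page sits in a <loc> element.

```python
import xml.etree.ElementTree as ET


def keep_sitemap_urls(harvested):
    """Rough equivalent of 'remove urls not containing' sitemap.xml (step 4)."""
    return [url for url in harvested if "sitemap.xml" in url]


def extract_locs(sitemap_xml):
    """Return the text of every <loc> element in a sitemap document."""
    root = ET.fromstring(sitemap_xml)
    locs = []
    for element in root.iter():
        # Tags are namespaced like '{http://...}loc', so compare the local name.
        if element.tag.rsplit("}", 1)[-1] == "loc" and element.text:
            locs.append(element.text.strip())
    return locs


# Hypothetical sitemap content, used only to show the shape of the input.
SAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://yoursite1.com/</loc></url>
  <url><loc>http://yoursite1.com/post-1/</loc></url>
</urlset>"""

if __name__ == "__main__":
    for page in extract_locs(SAMPLE):
        print(page)
```

    One caveat: if a site uses a sitemap index, the <loc> entries point to child sitemaps instead of pages, and this sketch does not follow them.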
     
    Last edited: Feb 22, 2011
  2. Kepperbes

    Kepperbes Registered Member

    Joined:
    Jan 17, 2011
    Messages:
    71
    Likes Received:
    17
    Damn, now THAT was useful right there!
     
  3. Rasputin78

    Rasputin78 Regular Member

    Joined:
    Sep 14, 2010
    Messages:
    373
    Likes Received:
    82
    Gender:
    Male
    Location:
    Ether
    Neat trick. Thanks for the share :)
     
  4. AirForce1

    AirForce1 Newbie

    Joined:
    May 29, 2009
    Messages:
    15
    Likes Received:
    0
    Occupation:
    System manager of www.shoesbuyonline.com
    Location:
    under construction
    Nice sharing and great idea, thanks a lot. :)
     
  5. cyberzilla

    cyberzilla Elite Member Premium Member

    Joined:
    Nov 15, 2009
    Messages:
    2,204
    Likes Received:
    3,363
    Location:
    zeta reticuli
    Ha ha I was about to post this, but you are damn quick bro! Thanks for sharing.
     
  6. andreyg13

    andreyg13 Jr. VIP Jr. VIP

    Joined:
    Nov 13, 2009
    Messages:
    915
    Likes Received:
    1,774
    Occupation:
    SEO
    Location:
    http://seoshark.org
    Had this on my mind for a while :D
     
  7. Merlin22

    Merlin22 Regular Member

    Joined:
    May 19, 2010
    Messages:
    215
    Likes Received:
    32
    Do you have any idea how to take a massive list of URLs and prefix each one with "site:" in order to build up the keyword list?

    I've been doing it with Excel and it's a major PITA.
     
  8. partymarty4870

    partymarty4870 Elite Member

    Joined:
    Jul 7, 2010
    Messages:
    2,034
    Likes Received:
    1,690
    Location:
    I come from a land downunder
    Use the replace function in Notepad.

    Replace http with site:http

    or www with site:www, and so on.
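
    If the list is too big for Notepad or Excel, a few lines of Python can do the same job. This is only a sketch: the function name is made up, and it assumes you want bare domains in the keyword list (site:yoursite1.com, as in the first post) with duplicates dropped.

```python
from urllib.parse import urlsplit


def to_site_queries(urls):
    """Turn URLs or bare domains into unique 'site:domain' keywords."""
    seen = set()
    queries = []
    for raw in urls:
        raw = raw.strip()
        if not raw:
            continue
        # urlsplit only recognises the host when a scheme or '//' is present.
        host = urlsplit(raw if "//" in raw else "//" + raw).hostname
        if host and host not in seen:
            seen.add(host)
            queries.append("site:" + host)
    return queries


if __name__ == "__main__":
    sample = ["http://yoursite1.com/page", "yoursite2.com", "http://yoursite1.com/other"]
    print("\n".join(to_site_queries(sample)))
```

    Paste the output straight into the keyword list.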
     
  9. andreyg13

    andreyg13 Jr. VIP Jr. VIP

    Joined:
    Nov 13, 2009
    Messages:
    915
    Likes Received:
    1,774
    Occupation:
    SEO
    Location:
    http://seoshark.org
    Yep, that would be it. Check out my e-book for more details; it is shared in the Jr. VIP section :)
     
  10. Sara

    Sara Registered Member

    Joined:
    Dec 28, 2007
    Messages:
    98
    Likes Received:
    71
    Location:
    Australia
    Have you been using Excel's CONCATENATE function?

    It lets you join two or more strings together, and you can do over a million joins in a matter of seconds.
     
  11. hally0301

    hally0301 Newbie

    Joined:
    Feb 20, 2011
    Messages:
    24
    Likes Received:
    0
    Hey andrey,
    that is a really cool tip for getting some more URLs.
    I guess you could then put the URLs into a social bookmarking tool in the hope that they get indexed and you get some extra value from your links.
     
  12. andreyg13

    andreyg13 Jr. VIP Jr. VIP

    Joined:
    Nov 13, 2009
    Messages:
    915
    Likes Received:
    1,774
    Occupation:
    SEO
    Location:
    http://seoshark.org
    Yes hally, that is correct :)