
how do you guys scrape thousands of blogs?

Discussion in 'Black Hat SEO' started by Bartman, Sep 12, 2011.

  1. Bartman

    Bartman Power Member

    Joined:
    Apr 24, 2010
    Messages:
    569
    Likes Received:
    131
    In ScrapeBox, when I try to scrape, I get about 800 blogs (after removing duplicate URLs).
    When I scrape again, I get more, but after removing duplicates I end up with the same 800 blogs.

    I do this 2 ways:
    Search for something very generic, like John.
    Search for spammy comments, like "I bookmarked your site" or another comment that spammers always use.
    Both ways, I only get about 800. I know there are a lot more blogs that contain those keywords, so how come I am not getting them?
     
  2. mdsurf

    mdsurf Senior Member

    Joined:
    Jun 16, 2011
    Messages:
    990
    Likes Received:
    279
    Location:
    Las Vegas
    Make sure the max number of URLs to scrape is set to 1000 per keyword.
     
  3. greatgreat

    greatgreat Newbie

    Joined:
    Sep 12, 2011
    Messages:
    15
    Likes Received:
    2
    Using Elance to outsource the work has been great for me. There are some good guys in Romania or India.
     
  4. Mokodoki

    Mokodoki Regular Member

    Joined:
    Feb 26, 2011
    Messages:
    217
    Likes Received:
    354
    Occupation:
    Graphic Artist | Fulltime Student
    Because you are only getting the first 1000 results for each keyword, and one site may dominate a large portion of those results, so you wind up with a lot of duplicates. You can improve the variety of domains that your search returns by adding additional keywords and footprints to your search. The more creative you are, the more results you can squeeze out of your keywords.
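
    To see why a big harvest shrinks so much, here is a rough Python sketch of collapsing a URL list down to one entry per domain. This is not how ScrapeBox does it internally; the function name `unique_domains` and the sample URLs are made up for illustration, and treating `www.example.com` and `example.com` as the same site is an assumption.

```python
from urllib.parse import urlsplit


def unique_domains(urls):
    """Keep the first URL seen for each host (hypothetical helper).

    Assumption: a leading "www." is ignored, so www.example.com and
    example.com count as the same site. Subdomains stay distinct.
    """
    seen = set()
    kept = []
    for url in urls:
        # hostname is already lowercased by urlsplit; None if the URL has no host
        host = urlsplit(url).hostname or ""
        if host.startswith("www."):
            host = host[4:]
        if host and host not in seen:
            seen.add(host)
            kept.append(url)
    return kept


# Made-up sample data: three of the four results come from one site.
harvested = [
    "http://example.com/post-1",
    "http://www.example.com/post-2",
    "http://blog.example.org/a",
    "http://example.com/post-3",
]
print(unique_domains(harvested))
```

    If one site owns most of the results for a keyword, the list after this step is much shorter than the raw harvest, no matter how many times the same search is repeated.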

    Searching for a generic term like "nice blogs" and then removing duplicate domains, I only got about 220 unique domains. Mixing it with a keyword list (so you are harvesting blogs about X that have comment Y on them somewhere) gives me several thousand unique domains.
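
    The "mixing" step is just pairing every footprint with every keyword, so each pair becomes its own search with its own 1000-result cap. A minimal Python sketch of that idea follows; `build_queries` and the sample footprints and keywords are hypothetical, not taken from any tool.

```python
from itertools import product


def build_queries(footprints, keywords):
    """Pair every footprint with every keyword (hypothetical helper)."""
    return [f"{footprint} {keyword}" for footprint, keyword in product(footprints, keywords)]


# Made-up sample lists, for illustration only.
footprints = ['"I bookmarked your site"', '"nice blog"']
keywords = ["gardening", "cycling", "photography"]

queries = build_queries(footprints, keywords)
for query in queries:
    print(query)
```

    With 2 footprints and 3 keywords this gives 6 separate queries, so the number of searches grows as footprints times keywords.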

    Loopline also has a cool video that explains some extra Google footprints you can use beyond the norm to get more results per keyword:

    http://www.youtube.com/watch?v=hKC69k_cRsc
     
  5. takers

    takers Regular Member

    Joined:
    Aug 10, 2011
    Messages:
    321
    Likes Received:
    303
    Location:
    Dont know
    great forum
     
  6. caavemaan

    caavemaan Registered Member

    Joined:
    May 15, 2010
    Messages:
    71
    Likes Received:
    27
    Occupation:
    ex-CSC employee
    LOTS AND LOTS AND LOTS of keywords. Then you just need the right footprint. Then ScrapeBox crashes because it's at 14 million on the multi-threaded harvester, and you try to abort but it errors out and you think, crap, I lost all that work... except you didn't. Thank god for the Harvester crash saver thing.


    :pirate:
     
    Last edited: Sep 12, 2011
  7. VIC SEO

    VIC SEO Elite Member

    Joined:
    Feb 19, 2010
    Messages:
    2,156
    Likes Received:
    363
    Gender:
    Male
    Occupation:
    SEO Specialist
    Location:
    iSynergyMedia
    800 blogs is a lot of blogs, be thankful. I had my partner look at this forum and he still says that manual submission is the best. On a weekly basis we are getting only 50 blogs. Now compare that with your 800. Looks like a lot now, doesn't it?
     
  8. licorne101

    licorne101 Registered Member

    Joined:
    Aug 22, 2011
    Messages:
    88
    Likes Received:
    118
    Get a better list of keywords and footprints. Like someone above said, LOTS AND LOTS of them! 800 from a single search isn't really that bad. Then again, it depends on what you do with them.