
Looking for SERP content scraper

Discussion in 'Black Hat SEO' started by ghengis_khan, Oct 24, 2013.

  1. ghengis_khan

    ghengis_khan Regular Member

    Joined:
    Aug 17, 2010
    Messages:
    339
    Likes Received:
    527
    Location:
    mongolian plains
    Hi guys,

    I'm looking for a tool that will scrape text from websites, either by scraping the SERPs or by manually entering URLs (since I can scrape URLs myself).

    So basically the tool just needs to go to a URL, grab the main body text, and save it.

    I know this sounds simple but I can't find anything that seems to be able to do this :/


    Please only reply if you know of an existing tool that can do this.
     
  2. stugz

    stugz Junior Member

    Joined:
    Apr 14, 2013
    Messages:
    154
    Likes Received:
    33
    Perl LWP is my favorite.

    But you can also use curl with PHP or C.

    I'm sure Python and Ruby will have simple solutions for this as well.
     
  3. dafoss

    dafoss Registered Member

    Joined:
    Aug 20, 2013
    Messages:
    76
    Likes Received:
    5
    Occupation:
    SEO Worker
    Location:
    New York, USA
    If you have a ZennoPoster or UBot license, either of them could do it for you.
     
  4. ghengis_khan

    ghengis_khan Regular Member

    Thanks for the suggestions but I'm looking for a ready-made tool to do this. Surely there must be something out there that can do this?
     
  5. ghengis_khan

    ghengis_khan Regular Member

    I have ZennoPoster, but I'm really looking for something that can do this out of the box with minimal hassle, without having to code or program anything.
     
  6. apoorv

    apoorv Regular Member

    Joined:
    Aug 31, 2011
    Messages:
    301
    Likes Received:
    62
    https://github.com/blackstonetech/pycurl/blob/master/examples/retriever-multi.py would work if you are okay with Python. It's a bit of overkill for something like this, but it's fast, serves the purpose, and it's already written. :D

    The usage is something like this:

    You can add proxies, etc. to this script, too, if you are scraping on a large scale. It's only a few lines of code; let me know if you want me to add it.

    Also, doing something like this would be pretty simple in PHP, too. Something like this:

    This isn't very robust or very fast, but it will get the job done. I tested it out with a couple of URLs. You can also use cURL with PHP; that will be faster.
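
    For reference, the "go to a URL, grab the body text, save" steps asked for in this thread can be roughly sketched in Python with only the standard library. This is not the pycurl script linked above, just a minimal illustration: the names `extract_text`, `fetch`, and `scrape_to_files`, the list of skipped tags, and the `page_0001.txt` output naming are all assumptions made up for this example, and "main body text" is approximated as all visible text outside script, style, and head.

```python
import sys
import urllib.request
from html.parser import HTMLParser


class BodyTextExtractor(HTMLParser):
    """Collects visible text while ignoring non-content elements."""

    # Assumption: these tags never hold body text worth keeping.
    SKIP = {"script", "style", "noscript", "head"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self._skip_depth == 0:
            self.chunks.append(text)


def extract_text(html):
    """Return the visible text of an HTML string, one chunk per line."""
    parser = BodyTextExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.chunks)


def fetch(url, timeout=15):
    """Download a page and decode it using the charset the server reports."""
    req = urllib.request.Request(url, headers={"User-Agent": "Mozilla/5.0"})
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        charset = resp.headers.get_content_charset() or "utf-8"
        return resp.read().decode(charset, errors="replace")


def scrape_to_files(url_file):
    """Read one URL per line from url_file and save each page's text."""
    with open(url_file) as f:
        urls = [line.strip() for line in f if line.strip()]
    for i, url in enumerate(urls, 1):
        try:
            text = extract_text(fetch(url))
        except Exception as exc:  # keep going if one URL fails
            print(f"skipped {url}: {exc}", file=sys.stderr)
            continue
        with open(f"page_{i:04d}.txt", "w", encoding="utf-8") as out:
            out.write(text)
```

    Calling `scrape_to_files("urls.txt")` with a text file of harvested URLs, one per line, would write one `.txt` file per page. It fetches pages one at a time, so it will be slower than a multi-connection pycurl approach, and it keeps navigation and footer text along with the article body.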
     
    Last edited: Oct 24, 2013
  7. ghengis_khan

    ghengis_khan Regular Member

    Thanks a lot for taking the time to help me with that. I'm still looking for an 'out of the box', point and click -> harvest content solution though.
     
  8. hadoken

    hadoken Regular Member

    Joined:
    Dec 4, 2012
    Messages:
    300
    Likes Received:
    529
    Location:
    Toronto
    • Thanks Thanks x 1
  9. ghengis_khan

    ghengis_khan Regular Member
