Looking for SERP content scraper

Discussion in 'Black Hat SEO' started by ghengis_khan, Oct 24, 2013.

  1. ghengis_khan

    ghengis_khan Regular Member

    Joined:
    Aug 17, 2010
    Messages:
    348
    Likes Received:
    537
    Location:
    mongolian plains
    Hi guys,

    I'm looking for a tool that will scrape text from websites, either by scraping the SERPs or by taking URLs I enter manually (I can scrape the URLs myself).

    So basically the tool just needs to go to a URL, grab the main body text, and save it.

    I know this sounds simple, but I can't find anything that seems to be able to do this :/


    Please only reply if you know of an existing tool that can do this.
     
  2. stugz

    stugz Junior Member

    Joined:
    Apr 14, 2013
    Messages:
    154
    Likes Received:
    34
    Perl LWP is my favorite.

    But you can also use curl with PHP or C.

    I'm sure Python and Ruby will have simple solutions for this as well.
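
    For illustration, here is a minimal sketch of how the "go to a URL, grab the text, save" flow might look in Python with only the standard library. The names `TextExtractor`, `extract_text`, and `scrape_to_file` are made up for this example. It treats all visible text outside `script`, `style`, `noscript`, and `head` as body text, which is a crude assumption and not a real main-content detector.

```python
from html.parser import HTMLParser
from urllib.request import Request, urlopen


class TextExtractor(HTMLParser):
    """Collect visible text, ignoring anything inside the SKIP tags."""

    SKIP = {"script", "style", "noscript", "head"}  # assumption: never body text

    def __init__(self):
        super().__init__()
        self._skip_depth = 0  # >0 while inside a skipped element
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and not self._skip_depth:
            self.chunks.append(text)


def extract_text(html):
    """Return the visible text of an HTML string, one chunk per line."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()  # flush any text still buffered by the parser
    return "\n".join(parser.chunks)


def scrape_to_file(url, out_path, timeout=15):
    """Fetch url, extract its text, and write it to out_path (hypothetical helper)."""
    req = Request(url, headers={"User-Agent": "Mozilla/5.0"})
    with urlopen(req, timeout=timeout) as resp:
        charset = resp.headers.get_content_charset() or "utf-8"
        html = resp.read().decode(charset, errors="replace")
    with open(out_path, "w", encoding="utf-8") as fh:
        fh.write(extract_text(html))
```

    Calling `scrape_to_file("http://example.com/", "example.txt")` would save that page's text to `example.txt`. Pages that render their content with JavaScript would come back mostly empty with this approach.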
     
    • Thanks x 1
  3. dafoss

    dafoss Registered Member

    Joined:
    Aug 20, 2013
    Messages:
    76
    Likes Received:
    5
    Occupation:
    SEO Worker
    Location:
    New York, USA
    If you have a ZennoPoster or UBot license, either of them could do it for you.
     
    • Thanks x 1
  4. ghengis_khan

    ghengis_khan Regular Member

    Thanks for the suggestions but I'm looking for a ready-made tool to do this. Surely there must be something out there that can do this?
     
  5. ghengis_khan

    ghengis_khan Regular Member

    I have ZennoPoster, but I'm really looking for something that does this out of the box, with minimal fuss and without having to code anything.
     
  6. apoorv

    apoorv Regular Member

    Joined:
    Aug 31, 2011
    Messages:
    301
    Likes Received:
    62
    https://github.com/blackstonetech/pycurl/blob/master/examples/retriever-multi.py would work if you are okay with Python. It's a bit of overkill for something like this, but it's fast, it serves the purpose, and it's already written. :D

    The usage is something like this:

    You can add proxies, etc. to this script, too, if you are scraping on a large scale. It's a few lines of code; let me know if you want me to add it.
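
    As a rough idea of what proxy support plus concurrent fetching might involve, here is a minimal sketch. It is not the pycurl script linked above: it swaps in Python's standard `urllib` and a thread pool instead. The helper names `make_fetcher` and `fetch_all` and the proxy address in the usage note are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from urllib.request import ProxyHandler, build_opener


def make_fetcher(proxy=None, timeout=15):
    """Return fetch(url) -> bytes, optionally routed through an HTTP proxy."""
    handlers = []
    if proxy:
        # Send both http and https requests through the same proxy (assumption).
        handlers.append(ProxyHandler({"http": proxy, "https": proxy}))
    opener = build_opener(*handlers)
    opener.addheaders = [("User-Agent", "Mozilla/5.0")]

    def fetch(url):
        with opener.open(url, timeout=timeout) as resp:
            return resp.read()

    return fetch


def fetch_all(urls, fetch, workers=8):
    """Fetch URLs concurrently; map each URL to its bytes or the exception raised."""

    def safe(url):
        try:
            return url, fetch(url)
        except Exception as exc:  # keep going when a single URL fails
            return url, exc

    with ThreadPoolExecutor(max_workers=workers) as pool:
        return dict(pool.map(safe, urls))
```

    Something like `fetch_all(urls, make_fetcher("http://127.0.0.1:8080"))` would then return a dict keyed by URL. Rotating across several proxies would need one fetcher per proxy, which this sketch leaves out.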

    Also, doing something like this would be pretty simple in PHP, too. Something like this:

    This isn't very robust or very fast, but it will get the job done. I tested it out with a couple of URLs. You can also use cURL with PHP, which will be faster.
     
    • Thanks x 1
    Last edited: Oct 24, 2013
  7. ghengis_khan

    ghengis_khan Regular Member

    Thanks a lot for taking the time to help me with that. I'm still looking for an 'out of the box', point and click -> harvest content solution though.
     
  8. hadoken

    hadoken Regular Member

    Joined:
    Dec 4, 2012
    Messages:
    300
    Likes Received:
    529
    Location:
    Toronto
    • Thanks x 1
  9. ghengis_khan

    ghengis_khan Regular Member
