Please Recommend Good Scrapers (Name/Company/Email/Phone/etc)

Discussion in 'Black Hat SEO Tools' started by lostkitten, Mar 25, 2013.

  1. lostkitten

    lostkitten Regular Member

    Joined:
    Dec 7, 2008
    Messages:
    296
    Likes Received:
    84
    I currently have MLS but need to know what other tools are out there that can scrape name/company/email/phone#. MLS doesn't quite do what I need.

    Thanks
     
  2. security

    security Junior Member

    Joined:
    Sep 6, 2011
    Messages:
    128
    Likes Received:
    54
    Occupation:
    Marketing & Technological Consultant.
    Location:
    Miami, FL.
    Describe what you need scraped. Anything can be made.
     
  3. lostkitten

    lostkitten Regular Member

    Joined:
    Dec 7, 2008
    Messages:
    296
    Likes Received:
    84
    Name/Company/Email/Phone/etc
     
  4. CenTex Hosting

    CenTex Hosting Jr. VIP Premium Member

    Joined:
    Nov 8, 2009
    Messages:
    1,518
    Likes Received:
    576
    Gender:
    Male
    Occupation:
    Admin
    Location:
    Austin, TX
    I would be interested in this as well, if it would scrape any site that I give it to get that info.
     
  5. YouFeelMeDawg?

    YouFeelMeDawg? BANNED

    Joined:
    Aug 10, 2011
    Messages:
    266
    Likes Received:
    371
    I would suggest you use Scrapy. It can scrape pretty much anything; it is basically a spider.
    Now, if you just want to use a service that is based on Scrapy, go to scrapinghub.com. They have a nice service where you can make your own spiders and crawlers and scrape anything you want. The web panel is pretty self-explanatory: it shows you a video of how to set up your own scrapers, and it works even for non-techies since it has a point-and-click UI.
    It is the best custom scraper I know of.
    http://scrapinghub.com/pricing.html
     
  6. lostkitten

    lostkitten Regular Member

    Joined:
    Dec 7, 2008
    Messages:
    296
    Likes Received:
    84
    ^^ It looks like that service and Scrapy are for extracting data from a site.

    But what if I want to scrape THE INTERNETS? I want to scrape thousands upon thousands of sites online for

    -Names
    -Company
    -Email
    -Phone Numbers

    Can scrapy and scrapinghub do it?
     
  7. YouFeelMeDawg?

    YouFeelMeDawg? BANNED

    Joined:
    Aug 10, 2011
    Messages:
    266
    Likes Received:
    371
    Of course it can, that's exactly what it does. You can scrape millions upon millions upon millions upon... you get the point.

    It is a professional scraping service; you do, however, need to train the scraper to extract the data you want.

    When I said spider, I meant that it is not just a simple scraper but a spider engine. A spider, by definition, is a program that crawls the web. So Scrapy crawls the web and extracts data from all the links you tell it to, and you can specify pretty much any custom fields, data, or HTML elements to extract from the links it crawls.
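
To make the "spider" idea concrete, here is a minimal sketch of the crawl loop: start from one URL, pull the links out of each page, and queue any link not seen before. This is not Scrapy's implementation, only an illustration using the Python standard library. The names `LinkParser`, `crawl`, and `PAGES`, and the `example.test` URLs, are made up for the example, and the pages live in an in-memory dict so nothing is fetched over the network.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkParser(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(start_url, fetch, max_pages=100):
    """Breadth-first crawl. fetch(url) must return HTML text or None."""
    seen = {start_url}
    queue = deque([start_url])
    visited = []
    while queue and len(visited) < max_pages:
        url = queue.popleft()
        html = fetch(url)
        if html is None:
            continue  # page not available, skip it
        visited.append(url)
        parser = LinkParser()
        parser.feed(html)
        for href in parser.links:
            absolute = urljoin(url, href)  # resolve relative links
            if absolute not in seen:
                seen.add(absolute)
                queue.append(absolute)
    return visited


# Hypothetical in-memory "site" standing in for real HTTP requests.
PAGES = {
    "http://example.test/": '<a href="/a">A</a> <a href="/b">B</a>',
    "http://example.test/a": '<a href="/b">B</a>',
    "http://example.test/b": '<a href="/">home</a>',
}

order = crawl("http://example.test/", PAGES.get)
print(order)
```

A real spider would replace `PAGES.get` with an HTTP fetch and add politeness rules (robots.txt, rate limits), which a framework like Scrapy handles for you.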


    It is the most advanced open-source Python spider I know of.
    It will scrape way faster than UBot or ZennoPoster.
     
  8. lostkitten

    lostkitten Regular Member

    Joined:
    Dec 7, 2008
    Messages:
    296
    Likes Received:
    84
    But I have to feed it links/URLs, right? That's the very thing I want to get in the first place.
     
  9. YouFeelMeDawg?

    YouFeelMeDawg? BANNED

    Joined:
    Aug 10, 2011
    Messages:
    266
    Likes Received:
    371
    I don't think you completely understand what I am trying to say.
    Let's try a different way.

    Do the sites you want to scrape present the data you want (name, company, email, phone, etc.) in a consistent format?
    If so, then you need to build a scraper that recognizes what you want to scrape.

    So if you're only trying to pull text that matches something like "Email:xxxxx@xxxx.xxx", where the x's stand for arbitrary characters that get extracted when found next to "Email:", then you can do that pretty easily. And if you want to extract data only from the body, or from a specific HTML element, you can do that too.
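
As a rough illustration of that kind of label matching, the sketch below pulls out whatever address follows the literal text "Email:". The pattern is a deliberately simple assumption and will not cover every valid address format; `EMAIL_AFTER_LABEL`, `extract_labeled_emails`, and the sample text are hypothetical names and data made up for the example.

```python
import re

# Assumed pattern: the label "Email:", optional whitespace, then a
# simple local-part@domain.tld address captured as group 1.
EMAIL_AFTER_LABEL = re.compile(
    r"Email:\s*([A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,})"
)


def extract_labeled_emails(text):
    """Return every address that appears right after an 'Email:' label."""
    return EMAIL_AFTER_LABEL.findall(text)


sample = (
    "Contact\n"
    "Email:jane@example.com\n"
    "Sales Email: bob@example.org\n"
    "no label here: foo@bar\n"
)

found = extract_labeled_emails(sample)
print(found)
```

The last sample line is skipped because it has no "Email:" label, which is the point: the scraper only extracts what matches the format you describe to it.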

    That is what Scrapy does, and scrapinghub has an interface where you point and click to show where the data is, or how it would appear, in order to be extracted. You can't scrape data from links without telling the scraper how to extract it or where to extract it from.

    The links you feed it are your starting point. For example, you can give it one link from a directory or database site, point out the format of the data you want to extract, and it will do it for you. It will take time to do all that and customize the scraper; once you run it, you can see whether it's working for you.

    Or, if you have money, you can go with their professional service, where you specify what you need and they build you a custom scraper. It will be more expensive, but hey, time is money, and some people would rather outsource to a professional, right?
     
  10. lostkitten

    lostkitten Regular Member

    Joined:
    Dec 7, 2008
    Messages:
    296
    Likes Received:
    84
    Got it. So there is a way.

    At this time, it looks as though there is a learning curve to using Scrapy and the service. In the time I've spent reading and replying to this thread, I've already generated 2,000 contacts using MLS. It may not be perfect, but it's my only real option right now since I know nothing about Scrapy. I'll look into it closely over the next few weeks.
     
  11. security

    security Junior Member

    Joined:
    Sep 6, 2011
    Messages:
    128
    Likes Received:
    54
    Occupation:
    Marketing & Technological Consultant.
    Location:
    Miami, FL.
    Ubot baby..! it can do it a.