Freebie: One Instance of Data Scraping as a Service

Discussion in 'Freebies / Giveaways' started by g_bot, Oct 2, 2013.

  1. g_bot

    g_bot Newbie

    Oct 26, 2008
    Lately I have been working on optimizing my data mining and other automation scripts for both speed and scalability.

    If anyone here is interested, I'll code scrapers to crawl the website(s) of your choice and provide you with the data you want in the structure you want.

    How to get started?

    Step One: Send me the URL of the website you need scraped, a few URLs that host the content you need, and a sample of the output data you want.
    Step Two: After going over some details, I'll write the scrapers, crawl a few pages and provide you with a sample set of structured data.
    Step Three: After your approval, I will kick-start the scrapers to fetch all the required data.
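
    To give a rough idea of what "structured data" in Step Two could look like, here is a minimal sketch using only the Python standard library. It is an illustration only, not the actual scrapers: the sample HTML, the class names "listing", "name" and "phone", and the helper names scrape and to_csv are all made up for this example.

```python
import csv
import io
from html.parser import HTMLParser

# Hypothetical sample page; the markup and class names are assumptions
# invented for this sketch, not taken from any real website.
SAMPLE_HTML = """
<div class="listing"><span class="name">Acme Plumbing</span><span class="phone">555-0100</span></div>
<div class="listing"><span class="name">Bolt Electric</span><span class="phone">555-0199</span></div>
"""

FIELDS = ["name", "phone"]


class ListingParser(HTMLParser):
    """Collect one dict per <div class="listing"> block."""

    def __init__(self):
        super().__init__()
        self.records = []
        self._current = None  # record being filled while inside a listing
        self._field = None    # field whose text is currently being read

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if tag == "div" and "listing" in classes:
            self._current = {}
        elif tag == "span" and self._current is not None:
            for field in FIELDS:
                if field in classes:
                    self._field = field

    def handle_data(self, data):
        if self._current is not None and self._field:
            previous = self._current.get(self._field, "")
            self._current[self._field] = previous + data.strip()

    def handle_endtag(self, tag):
        if tag == "span":
            self._field = None
        elif tag == "div" and self._current is not None:
            # Assumes listings contain no nested <div>; fine for a sketch.
            self.records.append(self._current)
            self._current = None


def scrape(html):
    """Return a list of {"name": ..., "phone": ...} dicts found in html."""
    parser = ListingParser()
    parser.feed(html)
    parser.close()
    return parser.records


def to_csv(records):
    """Serialize the records as CSV text with a header row."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buf.getvalue()


if __name__ == "__main__":
    print(to_csv(scrape(SAMPLE_HTML)))
```

    The sample you send in Step One would play the role of FIELDS here: it tells the scraper which columns to produce before the full crawl starts.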

    There aren't any limits on output size or number of pages to scrape, but if your request seems excessive I'll notify you and limit the runtime/output size.

    This doesn't include any source code for the scrapers or other scripts; you get the data as a one-time download.

    At this stage I am not comfortable giving away the source code, and I don't have the time/resources to offer a hosted solution for recurring crawlers.

    I am well aware this cuts out people looking for live feeds of fresh data, but I am sure there are many people who'd still need free stacks of well-structured data.

    No payments will be accepted to provide source code or access to recurring scrapers.

    Delivery time will depend on the requirements and my availability. I'll try to deliver within a week for most requests, but there are no guarantees and no ETAs.

    What kind of data can be scraped?

    You can request tasks that require fetching data such as articles from a website/blog, or contact details of businesses matching your criteria from a business listing website.

    I'd like to be challenged with complicated websites with:

    • Content locked behind captchas / auth sessions.
    • Seemingly no patterns to iterate over for scraping different pages.
    • Limits on the number of hits from one IP address/range.

    I have beaten them before and I'll do it for you too. Whether your task seems simple or complicated, send me the details and I'll do my best.

    What can you do with the data?

    It's up to you to choose what to do with the gathered data. Republish spun versions / gather leads / give it away to your subscribers for all I care!
    Please refrain from asking me for support/help with building websites/apps from the data.

    Who owns the data/will you steal my billion dollar idea?

    Now some of you here might wonder: what if g_bot is setting up bait to steal your 'unique' / 'gazillion $$' idea?
    To answer that question: if you think your new business idea is worth a ton of money, don't spill the beans to a guy offering a free service on a marketing forum.
    For the paranoid parrots among us: I'll sign an NDA if you send me one, and pinky swear not to share the data with a third party.

    You own the data. Well, sort of: we of course collected it from public webpages, but whatever, it's yours now. I won't sell it or distribute it for any purpose.

    Obvious Disclaimer:

    If you get sued or ninja-assassinated for copyright violation, plagiarism, or planning world domination, it's on you!
    I reserve the right to dismiss a request if my hands are full or for any other reason.
  2. xShirase

    xShirase Newbie

    Jul 9, 2013
    Funny, I was thinking of posting the exact same thread and was browsing BHW to find where to do that :D
    Anyway, if anyone is interested, I offer the same service under the same conditions (although I accept money if you feel like giving, I do it for knowledge first!).
    I have a lot of experience: I code scrapers on Elance on a daily basis, so PM me only if your project has a twist or is particularly interesting, fun or difficult.
    I can also help aspiring scrapers and share my knowledge, so feel free to come and talk!

    g_bot, which language do you use? I was a big Ruby/Mechanize fan, but switched to PhantomJS for more awesomeness (cross-browser screenshots, easy headers...). I might have some stuff for you, even small paid tasks if you feel like it. PM?
  3. g_bot

    g_bot Newbie

    Oct 26, 2008
    Sorry, an ongoing project kept me too busy to respond here any sooner.

    I am a relatively new player; currently my favorite tools for quick and dirty scrapers are Mechanize & BeautifulSoup in Python.

    Sending you a PM.