Web scraping and data extraction

Discussion in 'General Programming Chat' started by confide, Oct 22, 2014.

  1. confide

    confide Newbie

    Joined:
    Jul 7, 2012
    Messages:
    5
    Likes Received:
    1
    What would be the best language(s) to do this in?

    Python? PHP? Java?
     
  2. MisterNick

    MisterNick Registered Member

    Joined:
    Oct 22, 2014
    Messages:
    80
    Likes Received:
    60
    Occupation:
    Programmer
    Location:
    Tbilisi, Georgia
    Any programming language can do this. I use C# .NET when I want to scrape data from the web.

    Good luck!
     
  3. Repulsor

    Repulsor Power Member

    Joined:
    Jun 11, 2013
    Messages:
    776
    Likes Received:
    280
    Location:
    PHP Scripting ;)
    I use PHP for web scraping. It's really handy. But PHP can't actually execute JavaScript, so content that a page generates with JavaScript would need some fairly advanced coding to extract.
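
    To illustrate the JavaScript limitation, here is a minimal sketch in Python rather than PHP, using only the standard library. The page content, the `LinkCollector` class, and the `extract_links` helper are made up for the example. A plain HTML parser only sees links that are present in the markup; the link the script would create in a browser never shows up.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects href values from <a> tags found in static HTML."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the tag
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


# Hypothetical page: one link is in the markup, the other would only
# exist after a browser runs the script.
PAGE = """
<html><body>
  <a href="/static-link">Static</a>
  <script>
    var a = document.createElement('a');
    a.href = '/js-link';
    document.body.appendChild(a);
  </script>
</body></html>
"""


def extract_links(html):
    """Return every href the parser can see without running any script."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return parser.links


if __name__ == "__main__":
    print(extract_links(PAGE))
```

    Getting at the script-generated link would take something that actually runs the JavaScript, such as a browser driven by code, which is the "advanced" part.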

    I'm also getting into VB.NET, to approach it from the bot side. Just learning, though.
     
  4. Chris22

    Chris22 Regular Member

    Joined:
    Sep 29, 2010
    Messages:
    400
    Likes Received:
    1,063
    If you don't have programming experience then don't bother. Check out import.io for scraping.
     
  5. rootjazz

    rootjazz Jr. VIP

    Joined:
    Dec 21, 2012
    Messages:
    922
    Likes Received:
    395
    Occupation:
    Developer
    Location:
    UK
    The best language is the one you know.

    The language really doesn't matter. They all do the same thing more or less.

    A couple of if statements, a few for loops, a class or two, and you're there :)
     
  6. member8200

    member8200 Regular Member

    Joined:
    Aug 9, 2014
    Messages:
    475
    Likes Received:
    33
    Python, PHP, and Java can all do the trick; there's no single best one for the job.
    It doesn't matter what language you choose.
     
  7. elhefe

    elhefe Newbie

    Joined:
    Jul 9, 2013
    Messages:
    41
    Likes Received:
    4
    If you're asking this question, you probably want the language you're most familiar with.

    However, if you're picking a language to learn and you really want to do serious scraping, don't pick one that will limit you in the future. You'll want a language that supports threading. While you could debate this, PHP and Python are kinda out. Ruby might be a good way to go. Java could be a good way to go. What about using Java and crawler4j? You'll never outgrow that.
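
    For a rough idea of what threaded scraping looks like, here is a small sketch. It is written in Python purely for brevity (this is the debatable part: its standard library does provide thread pools, which suit I/O-bound work like fetching pages). The `fetch` function is a hypothetical stand-in that sleeps instead of making a real request, and the URLs are placeholders.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fetch(url):
    """Hypothetical stand-in for a real HTTP request."""
    time.sleep(0.1)  # simulate network latency
    return url, len(url)  # pretend the "page size" is the URL length


def scrape_all(urls, workers=4):
    """Fetch all URLs using a pool of worker threads.

    pool.map returns results in the same order as the input URLs,
    even though the fetches overlap in time.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(fetch, urls))


if __name__ == "__main__":
    urls = ["http://example.com/page/%d" % i for i in range(8)]
    start = time.time()
    results = scrape_all(urls)
    print(len(results), "pages in", round(time.time() - start, 2), "seconds")
```

    The same pattern (a fixed pool of workers pulling URLs from a list) carries over to Java's executor services or to a crawler framework.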