
C# Vs Python For Advanced Web Scraping?

Discussion in 'C, C++, C#' started by black_hat_newbie, Nov 20, 2014.

  1. black_hat_newbie

    black_hat_newbie Newbie

    Joined:
    Jan 12, 2009
    Messages:
    28
    Likes Received:
    1
    Occupation:
    IM & SEO
    Location:
    Internet
    hi guys,
    I want to be able to scrape Google Maps, Google Places, Amazon, Facebook and other websites like Yellowpages, Yelp, realtor.com, etc. and save the data in an Excel sheet or a database.

    Which would be the better one for me to learn? Or are they about the same, so it won't matter?
     
  2. trance92071

    trance92071 Senior Member

    Joined:
    Nov 1, 2009
    Messages:
    950
    Likes Received:
    848
    Occupation:
    Internet Marketing
    Location:
    BoosterBots.com
    Home Page:
    • Thanks x 2
  3. PHPInjected

    PHPInjected Jr. VIP Premium Member

    Joined:
    Apr 25, 2014
    Messages:
    2,037
    Likes Received:
    1,690
    Occupation:
    100% Unique Content Writer
    Location:
    Overriding Methods
    Home Page:
    Good luck. I would point you in the direction of Python because it is more lightweight and flexible.
     
  4. black_hat_newbie

    black_hat_newbie Newbie

    Joined:
    Jan 12, 2009
    Messages:
    28
    Likes Received:
    1
    Occupation:
    IM & SEO
    Location:
    Internet
    Thanks, trance.

    Would C# be able to do the same things efficiently, and does C# also have good scraping libraries like Python does?
     
  5. mktanny

    mktanny Regular Member

    Joined:
    Oct 22, 2009
    Messages:
    225
    Likes Received:
    62
    Occupation:
    Blog editor and IM
    Like trance92071, I'd say Python: it is easier to learn, cuts production time a lot and, like most languages, has good community support as well.
     
    • Thanks Thanks x 1
  6. MrBlue

    MrBlue Senior Member

    Joined:
    Dec 18, 2009
    Messages:
    950
    Likes Received:
    662
    Occupation:
    Web/Bot Developer
    +1 for Python. There are already many libraries and code snippets available related to scraping and data collection.
     
    • Thanks Thanks x 1
  7. DarkPixel

    DarkPixel Jr. VIP Premium Member

    Joined:
    Oct 4, 2011
    Messages:
    1,328
    Likes Received:
    1,239
    Location:
    ↓↓↓↓
    Home Page:
    Python is great for what you want to do, although for GUI apps, C# has a clear advantage.
     
    • Thanks Thanks x 1
  8. jamie3000

    jamie3000 Senior Member Premium Member

    Joined:
    Jun 30, 2014
    Messages:
    1,072
    Likes Received:
    468
    Occupation:
    Finance coder looking for semi-retirement
    Location:
    uk
    I've been writing scrapers and small bots over the last few weeks. I've found you should decide which scraping tools you want to use first, then look at which languages you can use with them. C# has Html Agility Pack, which a few people use for Twitter bots etc. I'm currently using a combination of Selenium and PhantomJS with PHP/Python/CasperJS. Add me on Skype if you want to discuss, mate :) jmcostello21
     
    • Thanks Thanks x 1
  9. black_hat_newbie

    black_hat_newbie Newbie

    Joined:
    Jan 12, 2009
    Messages:
    28
    Likes Received:
    1
    Occupation:
    IM & SEO
    Location:
    Internet
    If the learning curve for C# is fine for me, does C# have libraries as good as Python's for advanced web scraping?
    Is Html Agility Pack as good as the ones that are available for Python?

    I was considering C# if it would let me do advanced scraping and at the same time build it into desktop apps easily, but only if it can do advanced web scraping well: if not as easily as Python, then at least with medium ease. If C# is going to be a pain and a lot more effort, then I feel Python would be better.

    I also wanted to learn a proper programming language that is systematic, supports graphical UIs and is powerful, provided it can be used for advanced web scraping well.

    What's your take?
     
  10. Diplomat

    Diplomat Jr. VIP Premium Member

    Joined:
    Oct 25, 2011
    Messages:
    872
    Likes Received:
    410
    Occupation:
    CEO
    • Thanks Thanks x 1
  11. jamie3000

    jamie3000 Senior Member Premium Member

    Joined:
    Jun 30, 2014
    Messages:
    1,072
    Likes Received:
    468
    Occupation:
    Finance coder looking for semi-retirement
    Location:
    uk
    Do you want to just scrape and parse or scrape and render pages? When I say render I mean execute JavaScript and display CSS.
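
To illustrate the parse-versus-render distinction, here is a minimal sketch using only Python's standard-library `html.parser`. The page markup, the `HeadingCollector` class and the `parse_headings` helper are all made up for this example. A plain parser only sees the HTML as delivered and never runs the script, so content a browser would add via JavaScript does not show up; that is the case where a rendering tool is needed.

```python
from html.parser import HTMLParser

# Hypothetical page: one heading is in the static HTML, the other would
# only exist after a browser executed the script.
PAGE = """
<html><body>
  <h1>Static title</h1>
  <div id="listing"></div>
  <script>
    document.getElementById("listing").innerHTML = "<h1>Added by JavaScript</h1>";
  </script>
</body></html>
"""


class HeadingCollector(HTMLParser):
    """Collects the text of every <h1> element found in the markup."""

    def __init__(self):
        super().__init__()
        self.in_h1 = False
        self.headings = []

    def handle_starttag(self, tag, attrs):
        if tag == "h1":
            self.in_h1 = True
            self.headings.append("")

    def handle_endtag(self, tag):
        if tag == "h1":
            self.in_h1 = False

    def handle_data(self, data):
        # Script bodies arrive here as plain text; they are never executed.
        if self.in_h1:
            self.headings[-1] += data


def parse_headings(html):
    parser = HeadingCollector()
    parser.feed(html)
    parser.close()
    return [h.strip() for h in parser.headings]


headings = parse_headings(PAGE)
print(headings)
```

Only the heading present in the raw HTML is collected; the one the script would inject is invisible to a parse-only scraper.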
     
  12. black_hat_newbie

    black_hat_newbie Newbie

    Joined:
    Jan 12, 2009
    Messages:
    28
    Likes Received:
    1
    Occupation:
    IM & SEO
    Location:
    Internet
    At the moment, I just need to store the data in Excel or a database, but in the future I might need to display it.

    Can you do advanced web scraping well in C#, without much pain, like you can in Python?
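
For the storage step mentioned above, a rough sketch in Python using only the standard library might look like the following. The records, field names and the `places` table are invented for illustration; CSV is used as the spreadsheet format because Excel opens it directly, and SQLite stands in for "a database".

```python
import csv
import io
import sqlite3

# Hypothetical scraped records; the field names are illustrative only.
rows = [
    {"name": "Example Cafe", "city": "Springfield", "rating": 4.5},
    {"name": "Sample Diner", "city": "Shelbyville", "rating": 3.8},
]


def to_csv_text(records):
    """Serialize the records to CSV text, which Excel can open."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["name", "city", "rating"])
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


def to_database(records, path=":memory:"):
    """Insert the records into a SQLite table and return the connection."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS places (name TEXT, city TEXT, rating REAL)"
    )
    conn.executemany(
        "INSERT INTO places VALUES (:name, :city, :rating)", records
    )
    conn.commit()
    return conn


csv_text = to_csv_text(rows)
conn = to_database(rows)
count = conn.execute("SELECT COUNT(*) FROM places").fetchone()[0]
print(csv_text)
print(count)
```

Passing a file path instead of `":memory:"` would keep the database on disk, and writing `csv_text` to a `.csv` file gives something a spreadsheet program can open.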
     
  13. macdonjo3

    macdonjo3 Jr. VIP Premium Member

    Joined:
    Nov 8, 2009
    Messages:
    5,560
    Likes Received:
    4,317
    Location:
    Toronto
    Home Page:
    You're comparing 2 languages with great support.

    Python is interpreted and C# is compiled.

    Python will be quicker to code, but C# is easier to protect.

    So, Python for personal jobs, C# for something you plan to distribute.

    Just a few comparisons. So if you're just scraping for your own good, Python is the clear winner.
     
    • Thanks Thanks x 1
  14. Archemike

    Archemike Regular Member

    Joined:
    Jan 12, 2012
    Messages:
    221
    Likes Received:
    42
    I myself use Node.js and Ruby. It depends on your goals. With a site like import.io you can scrape without really knowing how to code. If you're learning in order to develop the skill, then unless the site is advanced at thwarting scraping you can use a high-level language. That would of course be a scripting language, so yes, Python with Beautiful Soup or Scrapy would be a great way to quickly write something where you can buy a proxy on a server, run your script on a schedule with cron, and get regular data with minimal maintenance. I like CSV/JSON. Have fun!
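
As a small sketch of the "quick script that emits JSON" idea, the example below pulls link text and URLs out of a fragment of HTML. Note the swap: it uses the standard library's `html.parser` rather than Beautiful Soup or Scrapy so that it runs without installing anything, and the sample markup, `LinkCollector` and `scrape_links` are hypothetical names made up for this example. A real script would fetch the page first and would need to respect the target site's terms.

```python
import json
from html.parser import HTMLParser

# Hypothetical listing markup standing in for a downloaded page.
SAMPLE_HTML = """
<ul>
  <li class="biz"><a href="/biz/1">Example Cafe</a></li>
  <li class="biz"><a href="/biz/2">Sample Diner</a></li>
</ul>
"""


class LinkCollector(HTMLParser):
    """Records the text and href of every <a> element."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None  # href of the <a> currently open, if any

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")

    def handle_data(self, data):
        # Only keep non-empty text that appears inside an <a> with an href.
        if self._href is not None and data.strip():
            self.links.append({"name": data.strip(), "url": self._href})

    def handle_endtag(self, tag):
        if tag == "a":
            self._href = None


def scrape_links(html):
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return parser.links


links = scrape_links(SAMPLE_HTML)
print(json.dumps(links, indent=2))
```

The resulting list of dictionaries can be dumped to a `.json` file as-is, or passed to `csv.DictWriter` if CSV is preferred.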
     
    • Thanks Thanks x 1
  15. pjnlsn

    pjnlsn Newbie

    Joined:
    Jul 13, 2011
    Messages:
    11
    Likes Received:
    1
    C# is compiled and thus highly unnecessary for parsing/scraping. Learn Python: it will be faster to learn, and well-written Python actually performs pretty well if at some point down the line you need speed. (You won't, unless and until you need to scrape thousands of sites every minute or so.)
     
  16. satyawrat

    satyawrat Jr. VIP

    Joined:
    Jul 8, 2009
    Messages:
    922
    Likes Received:
    1,181
    Occupation:
    Hustler
    Location:
    Gurgaon
    Home Page:
  17. Heinous

    Heinous Newbie

    Joined:
    Feb 20, 2015
    Messages:
    6
    Likes Received:
    0
    Having pretty good knowledge of both Python and C#, I'd definitely go for C#. Python will be great for small projects, but if you want a language you can develop further with, then C# is the way to go. I also found C# easier to learn than Python, but maybe that's just me.
     
  18. Elderio

    Elderio Newbie

    Joined:
    Feb 27, 2015
    Messages:
    7
    Likes Received:
    2
    I've used both C# and Python for scraping projects (and for exporting the results to Excel). Both languages are more than capable of the advanced scraping you describe, so you may need to weigh other factors to decide between them. Python, in general, cuts production time quite a lot compared to C#. Each language has its own scraping libraries, so consider which library is easier for you to learn. You may also want to think long-term: perhaps you want to get hired as a programmer one day. Currently both languages are in demand, though this mostly depends on your local area. Both C# and Python have great communities. I'm sure there are other factors you may wish to take into consideration too. I hope that helps somewhat.

    I'd research and explore both languages before making a concrete choice.

    P.S.: If you're wondering which language I personally prefer, it's mostly C# for bigger, stable projects and Python for quicker work and prototyping. However, bear in mind that's just my personal usage and opinion.
     
  19. sohom

    sohom Senior Member

    Joined:
    May 26, 2013
    Messages:
    981
    Likes Received:
    175
    Location:
    not in Past
    Python
    Selenium, Mechanize and Beautiful Soup under Python will help you manage everything.
     
  20. opahopa

    opahopa Jr. VIP Premium Member

    Joined:
    Sep 23, 2012
    Messages:
    142
    Likes Received:
    9
    As for me, Python is better for scraping tasks, because you'll spend less time building scraping software in Python.