Suggestions for a programming framework

Discussion in 'General Programming Chat' started by SamShady, Jan 12, 2010.

  1. SamShady

    SamShady Newbie

    Joined:
    Nov 16, 2009
    Messages:
    28
    Likes Received:
    94
    After many years away from programming I'm looking to start again.

    And I have no clue about the state of current IDEs.

    So I'm looking for a Windows-based framework where it's not a complete bitch to program web-facing applications such as scrapers, generators, uploaders etc. and something where handling proxies within the finished application is relatively easy as well.

    I've programmed in plenty of different programming languages over the years, so it doesn't really matter which language it is - and if I have to learn the syntax of a completely new language then that's no bother either.
     
  2. dynander

    dynander Regular Member

    Joined:
    Nov 16, 2008
    Messages:
    279
    Likes Received:
    53
    C# and Visual Studio is what I recommend.
     
    • Thanks Thanks x 1
  3. SamShady

    SamShady Newbie

    Joined:
    Nov 16, 2009
    Messages:
    28
    Likes Received:
    94
    Thanks for the tip m8 - will go see what I can find in the C# and Visual Studio area.
     
  4. radi2k

    radi2k Junior Member

    Joined:
    Nov 29, 2009
    Messages:
    117
    Likes Received:
    34
    Location:
    Germany
    Try Java with the NetBeans IDE or the Eclipse IDE :)
     
    • Thanks Thanks x 1
  5. minute80

    minute80 Regular Member

    Joined:
    Dec 3, 2008
    Messages:
    310
    Likes Received:
    81
    I have personally started doing Windows development again after a long pause. Since I am a 100% Python guy, the decision was whether to start with .NET or Java. After a short investigation, I decided to give .NET a try, since Java is a universe of its own and has quirks on every OS, while .NET at least works with Windows. My recommendation is: choose Visual Studio.
     
    • Thanks Thanks x 1
  6. divinci

    divinci Junior Member

    Joined:
    Sep 25, 2007
    Messages:
    111
    Likes Received:
    15
    Grab a copy of Visual Studio Express C# here:
    www.microsoft.com/express/vcsharp/

    And here is a list of beautiful classes from Bill Gates that allow you to create some class A scrapers/bots.

    1. The Sockets namespace (System.Net.Sockets)
      Especially read up on the async methods, BeginSend and BeginReceive. Using them lets you scrape concurrently, and fast!
    2. The System.Net namespace contains a hell of a lot of HTTP enums that will save you tons of time in the long run. For example
    3. And my favorite!!! The CookieContainer
      This is a great little object when working with HTTP responses/requests.
      Just check out:
    4. And for page parsing, you can't beat good old Regex
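
    A minimal sketch of items 2 to 4 in the list above, runnable without any network access. The class name ScrapeBasics, the ExtractLinks helper, the example.com address and the cookie values are all made up for illustration. The href pattern is only good enough for simple pages; it is not a full HTML parser.

```csharp
using System;
using System.Collections.Generic;
using System.Net;
using System.Text.RegularExpressions;

// Hypothetical demo class; none of these names come from the thread.
static class ScrapeBasics
{
    // Matches <a ... href="..."> and captures the double-quoted URL.
    static readonly Regex HrefPattern = new Regex(
        @"<a\s+[^>]*href\s*=\s*""([^""]*)""", RegexOptions.IgnoreCase);

    // Returns every href value found in the given HTML text.
    public static List<string> ExtractLinks(string html)
    {
        var links = new List<string>();
        foreach (Match m in HrefPattern.Matches(html))
            links.Add(m.Groups[1].Value);
        return links;
    }

    static void Main()
    {
        var site = new Uri("http://example.com/"); // placeholder site
        var jar = new CookieContainer();

        // Store a cookie the way a server would send it in a Set-Cookie header.
        jar.SetCookies(site, "session=abc123; path=/");
        // Or add one by hand.
        jar.Add(site, new Cookie("theme", "dark"));

        // The container works out which cookies apply to a given URL.
        Console.WriteLine(jar.GetCookieHeader(new Uri(site, "/inbox")));

        // HttpStatusCode is one of the System.Net enums: no magic numbers needed.
        Console.WriteLine("{0} = {1}", HttpStatusCode.Found, (int)HttpStatusCode.Found);

        string html = "<a href=\"/page1\">one</a> <A class=\"x\" HREF=\"/page2\">two</A>";
        foreach (string link in ExtractLinks(html))
            Console.WriteLine(link);
    }
}
```

    The same CookieContainer instance can be kept for a whole session, so every later request to the site picks up whatever cookies earlier responses set.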

    Developing OOP scrapers and bots in C# is the way to go.

    My only advice when using .NET is to stay away from the likes of the HttpWebRequest and WebRequest objects. They are great when a quick and dirty scrape is needed, but too restrictive when developing fully fledged bots that need to recreate the workings of a user's browser.

    Spend a couple of months making your own Browser class, with the following methods:
    List<HttpTransaction> httpTransactions = Browser.Navigate(Uri uri, CookieContainer cookieSession, bool allowRedirection);
    HttpTransaction httpTransaction = Browser.GetUri(Uri uri, CookieContainer cookieSession);
    HttpTransaction httpTransaction = Browser.Post(Uri uri, CookieContainer cookieSession, List<IPostableItem> postItems, PostType postType);
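
    To give a rough idea of what a class like this has to do once HttpWebRequest is out of the picture, here is a small sketch that builds the text of a GET request and reads the head of a raw response. It is not the Browser class described above: RawHttp, BuildGet and ReadHead are hypothetical names, the response text is hard-coded instead of read from a socket, and the Host header assumes the default port.

```csharp
using System;
using System.Net;
using System.Text;

// Hypothetical helper for illustration only.
static class RawHttp
{
    // Builds the text of an HTTP/1.1 GET request, adding a Cookie header
    // when the container holds cookies for the target URI.
    public static string BuildGet(Uri uri, CookieContainer cookies)
    {
        var sb = new StringBuilder();
        sb.Append("GET ").Append(uri.PathAndQuery).Append(" HTTP/1.1\r\n");
        sb.Append("Host: ").Append(uri.Host).Append("\r\n"); // assumes default port
        string cookieHeader = cookies.GetCookieHeader(uri);
        if (cookieHeader.Length > 0)
            sb.Append("Cookie: ").Append(cookieHeader).Append("\r\n");
        sb.Append("Connection: close\r\n\r\n");
        return sb.ToString();
    }

    // Reads the status line and headers of a raw response, stores any
    // Set-Cookie values in the container and returns the status code.
    // The Location header, if present, comes back through 'location'.
    public static int ReadHead(string head, Uri uri, CookieContainer cookies, out string location)
    {
        location = null;
        string[] lines = head.Split(new[] { "\r\n" }, StringSplitOptions.RemoveEmptyEntries);
        // The status line looks like: HTTP/1.1 302 Found
        int status = int.Parse(lines[0].Split(' ')[1]);
        for (int i = 1; i < lines.Length; i++)
        {
            int colon = lines[i].IndexOf(':');
            if (colon < 0) continue; // not a header line
            string name = lines[i].Substring(0, colon).Trim();
            string value = lines[i].Substring(colon + 1).Trim();
            if (name.Equals("Set-Cookie", StringComparison.OrdinalIgnoreCase))
                cookies.SetCookies(uri, value);
            else if (name.Equals("Location", StringComparison.OrdinalIgnoreCase))
                location = value;
        }
        return status;
    }
}

static class Demo
{
    static void Main()
    {
        var uri = new Uri("http://example.com/login"); // placeholder address
        var jar = new CookieContainer();
        // Stand-in for bytes that would normally arrive over a socket.
        string head = "HTTP/1.1 302 Found\r\nSet-Cookie: sid=42; path=/\r\nLocation: /home\r\n\r\n";

        string location;
        int status = RawHttp.ReadHead(head, uri, jar, out location);
        Console.WriteLine("{0} -> {1}", status, location);

        // Follow the redirect by hand; the cookie from the first response tags along.
        Console.Write(RawHttp.BuildGet(new Uri(uri, location), jar));
    }
}
```

    Handling redirects and cookies yourself like this is what makes it possible to record every hop as its own transaction, which is presumably why Navigate returns a list.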

    Once that is done!! you can code for example a Yahoo account creator in 3 hours.

    :cool2:

    ~~~~~~~~~~~~~~~~~~~~~~~~~~ ~ ~ ~ _ ~ _~~~~~~~~~~~~
    anyone working on an \/ D | | |\/ | |_ D ... alternative?
    ~~~~~~~~~~~~~~~~~~~~~/\ |\ \/ | | | |_ |\
     
    • Thanks Thanks x 4
  7. radi2k

    radi2k Junior Member

    Joined:
    Nov 29, 2009
    Messages:
    117
    Likes Received:
    34
    Location:
    Germany
    divinci, good idea, but what do you do when you need to interpret JavaScript code? Is there any C# solution out there?

    I thought about programming a tool similar to xr*m*r but dropped that idea because it's too complex and needs more than one programmer ;) It's not a one-man show, of course :cool:
     
    • Thanks Thanks x 1
  8. josef schwartz

    josef schwartz Newbie

    Joined:
    Sep 1, 2007
    Messages:
    33
    Likes Received:
    1
    First you should learn how to use iMacros - this is a MUST.
    Then, if you need to add conditional logic to iMacros etc., learn VB.NET or even VB6.

    If you want to scrape large sites, then like divinci said above, C# is the way to go.
     
  9. divinci

    divinci Junior Member

    Joined:
    Sep 25, 2007
    Messages:
    111
    Likes Received:
    15
    Yeah, intercepting JavaScript code is one of the problems my scraping library encounters. At the moment my lib can handle cookie-based HTTP sessions automatically, but I need to improve it WITHOUT going the whole WebBrowser control route.....

    As you have hit upon - and for the masses:
    The two situations that could 'unearth' your bot as not a browser are when JavaScript:
    1. Writes/Reads a browser cookie
    2. Invokes XMLHTTP to communicate.

    I have been looking into the various ways of combating this. At the moment, whenever I set up a new scrape profile, all I do is thoroughly debug a site so that I am aware of:
    • If server-side communication is made, what is passed / received? (XMLHTTP)
    • If a JavaScript cookie is set, where does it get the value from?
    And then I make these requests, or set these cookies, manually from my scraper.

    There is also another - Flash.
    The latest Flash runtimes have full socket ability, with security models similar to JavaScript XMLHTTP, AND can also store their own cookies.

    The thing with HTTP and scraping: all you gotta remember is that EVERYthing is done client-side. So what if Flash can read/write cookies on your hard drive?

    It only matters if it then communicates the details of those cookies to the server. If this happens, then you gotta 'recreate' that communication.

    .... and as for rumr I think it would have to be done one forum/social script at a time. Start on some crappy one - I know of a med volume teen chat site that has a very crappy buggy script (think easy XSS) if you wanna start playing with it?
     
    • Thanks Thanks x 1
  10. divinci

    divinci Junior Member

    Joined:
    Sep 25, 2007
    Messages:
    111
    Likes Received:
    15
  11. PriorityMarketers

    PriorityMarketers Newbie

    Joined:
    Jan 7, 2010
    Messages:
    13
    Likes Received:
    46
    For seeing what your browser is doing, there is nothing better than HttpW@tch Pr0fessional. It records 100% of the browser traffic as you perform an action manually.

    So basically:
    Start recording
    perform your action
    stop recording

    and then you get info on everything that happened in a very easy to parse format.

    And for your convenience:
    Code:
    http://rapidshare.com/files/335803882/httptool.rar.html
    MD5: A81F32E8F75EB83348CD67C778FA9936
    
    Virus scan here:
    File httptool.rar received on 2010.01.15 18:09:59 (UTC)
    Current status: finished
    Result: 0/41 (0%)
    Code:
    http://www.virustotal.com/analisis/780bcf2177d976cc7d8e84c55c9481af8bdcab6618249750e0b2f137912bbe76-1263578999
    
    As for your HTTP transactions, I prefer to use the HTTP module made by Chilkat (google them).
     
    • Thanks Thanks x 1
  12. SamShady

    SamShady Newbie

    Joined:
    Nov 16, 2009
    Messages:
    28
    Likes Received:
    94
    Thanks for all the great pointers.

    Downloading Visual Studio Express C# atm.

    Feels almost like the first time I dove into a programming language.

    Just went to the attic and dug out the last programming book I used when I studied - a C++ primer that we used for programming neural networks OOP-style - it's from 1996 ;-)

    Oh, and time to refamiliarize myself with Regex
     
  13. minute80

    minute80 Regular Member

    Joined:
    Dec 3, 2008
    Messages:
    310
    Likes Received:
    81
    From one book:

    It is possible to interpret and execute JavaScript in C#. This can be done by using JavaScript DotNet.

    To use JavaScript DotNet you should use the JScript.NET Expression Evaluator. First, you should create a JScript.NET "package" that includes a publicly-callable method that calls the JScript.NET eval function:

    Code:
    package JScript
    {
        class Eval
        {
            public function DoEval(expr : String) : String
            {
                return eval(expr);
            }
        }
    }
    
    Then add a reference to the package's assembly to your C# application. Finally, use the
    above class to evaluate JScript.NET expressions:

    Code:
    JScript.Eval E = new JScript.Eval();
    String Expression = ExpressionTextBox.Text;
    try
    {
        ResultTextBox.Text = E.DoEval(Expression);
    }
    catch(Microsoft.JScript.JScriptException jse)
    {
    // Handle exceptions.
    }
    
    This technique will work for simple JavaScript expressions.

     
  14. divinci

    divinci Junior Member

    Joined:
    Sep 25, 2007
    Messages:
    111
    Likes Received:
    15
    Great post minute80 - I will have a fiddle with it and see what the JScript.NET is actually doing.

    I think the simple JavaScript language features like .length, .substring(a, b) etc. will be handled by the engine with no problem.

    What happens if we throw in an
    Code:
    alert('window.location');
    
    I presume an error will be caught... let me try it out...

    Code:
    using System;
    using Microsoft.JScript;
    
    namespace ConsoleApplication1
    {
        class Program
        {
            static void Eval(string javascript)
            {
                try{
                    object returnValue = Microsoft.JScript.Eval.JScriptEvaluate(javascript, Microsoft.JScript.Vsa.VsaEngine.CreateEngine());
                    Console.WriteLine("_________________________________");
                    Console.WriteLine("Code : {0}", javascript);
                    Console.WriteLine("returnValue.GetType().ToString() : {0}",returnValue.GetType().ToString());
                    Console.WriteLine("returnValue.ToString() : {0}", returnValue.ToString());
                }
                catch (Exception e){
                    Console.WriteLine("_________________________________");
                    Console.WriteLine("Code : {0}", javascript);
                    Console.WriteLine("Exception.ToString() : {0}",e.ToString());
                }
            }
    
            static void Main(string[] args)
            {
                Eval(@"1");
                Eval(@"""1""");
                Eval(@"var i = 1;");
    
                Eval(@"hello world");
                Eval(@"hello world;");
                Eval(@"""hello world""");
                Eval(@"var s = ""hello world"";");
    
                Eval(@"function(){}");
                Eval(@"function foo(){}");
                Eval(@"function foo(){return 1;}");
                Eval(@"function foo(){return 1;};foo();");
                Eval(@"function foo(){return 1;};foo() + foo();");
                Eval(@"function foo(){return 1;};foo();foo();");
    
                Eval(@"alert('hello');");
    
                Console.ReadLine();
            }
        }
    }
    
    will look into it further - might be able to catch any javascript events that are wired up to the browser.... BUT would have to recreate all browser functions available to javascript??

    hard.

    but anyhoo will have another look..
     
  15. divinci

    divinci Junior Member

    Joined:
    Sep 25, 2007
    Messages:
    111
    Likes Received:
    15
  16. migcosta

    migcosta Registered Member

    Joined:
    Jan 6, 2009
    Messages:
    55
    Likes Received:
    8
    Do you know AutoIt?

    take a look :D

    site : link
    forum: link
     
  17. gregstereo

    gregstereo Elite Member

    Joined:
    Oct 5, 2009
    Messages:
    1,833
    Likes Received:
    1,027
    Occupation:
    I'm known to locate certain things from time to ti
    Location:
    Moose Factory, ON
    You might also want to grab a copy of Jeff Heaton's HTTP Programming Recipes for C# Bots. Very educational.

    And it's a personal preference, but I prefer Wireshark to HTTPWatch for packet sniffing.
     
  18. nixnash

    nixnash Power Member

    Joined:
    Oct 26, 2009
    Messages:
    581
    Likes Received:
    204
    Occupation:
    Student
    Location:
    BHW
    I am not sure about C and C#, but using Visual Studio takes a bit of time if you don't have any prior experience developing Win32 applications.

    As far as Python is concerned,
    Python and the Django web framework under Linux are user-friendly, and it will not take much time to grasp and implement the concepts.

    Hope it helps.
     
  19. migcosta

    migcosta Registered Member

    Joined:
    Jan 6, 2009
    Messages:
    55
    Likes Received:
    8
    ignore this post..
    problem with bhw...
    :)
     
    Last edited: Jan 18, 2010
  20. jazzc

    jazzc Moderator Staff Member Moderator Jr. VIP

    Joined:
    Jan 27, 2009
    Messages:
    2,468
    Likes Received:
    10,148
    One word for HTTP bots in .NET: WatiN.

    Happy programming. :D
     
    • Thanks Thanks x 1