Web Scraping and Automated Web Crawling

Discussion in 'Black Hat SEO' started by bubbubber, Oct 5, 2014.

    If you take a look at my profile, you can see a link to a YouTube video that shows an alternative method of web scraping and automated web crawling.

    Ask me any questions you have about the technology used.

    I believe most folks who do web scraping and web crawling use PHP over plain HTTP. In my method I use a web browser control embedded in a Windows Forms application. I went this way because I found that certain sites needed to think that an actual mouse cursor was moving around, and that their buttons were pressed by an actual mouse click, or else they would not respond. I found this to be true for some online betting sites, at the very least. Doing it my way, I am able to make Win32 API calls to move the mouse cursor and actuate mouse clicks and keystrokes.
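
    To give a rough idea of what the Win32 side can look like, here is a minimal sketch. It is not the code from my application: it uses Python with ctypes instead of a Windows Forms app, and the helper names cursor_path and move_and_click, the step count, and the delay are all made up for illustration. The only real API calls are SetCursorPos and mouse_event from user32.dll. It moves the cursor in small steps to a target point and then sends a left click.

```python
import sys
import time

# Flag values for the Win32 mouse_event function (from winuser.h).
MOUSEEVENTF_LEFTDOWN = 0x0002
MOUSEEVENTF_LEFTUP = 0x0004


def cursor_path(start, end, steps=20):
    """Return `steps` evenly spaced integer points ending exactly at `end`.

    Hypothetical helper: a straight line is the simplest possible path.
    """
    (x0, y0), (x1, y1) = start, end
    return [
        (round(x0 + (x1 - x0) * i / steps), round(y0 + (y1 - y0) * i / steps))
        for i in range(1, steps + 1)
    ]


def move_and_click(start, end, steps=20, delay=0.01):
    """Walk the cursor from `start` to `end`, then left-click (Windows only)."""
    if sys.platform != "win32":
        raise OSError("move_and_click needs the Win32 API (Windows only)")
    import ctypes  # imported here so the module still loads on other systems

    user32 = ctypes.windll.user32
    for x, y in cursor_path(start, end, steps):
        user32.SetCursorPos(x, y)  # screen coordinates, in pixels
        time.sleep(delay)          # short pause so the motion is not instant
    # A click is a button-down event followed by a button-up event.
    user32.mouse_event(MOUSEEVENTF_LEFTDOWN, 0, 0, 0, 0)
    user32.mouse_event(MOUSEEVENTF_LEFTUP, 0, 0, 0, 0)
```

    On a Windows machine, something like move_and_click((100, 100), (400, 300)) would walk the cursor across the screen and click at the end point. The coordinates are screen coordinates, so you would first have to work out where the button sits inside the browser control.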

    I know a lot of people will call BS on this and say that this is not necessary, but, based on my experiences with certain sites, it is a viable solution to the problem.

    This method is not meant to replace the more commonly used PHP web scraping/web crawling methods. My method is comparatively slow because it actually uses a web browser to bring up each web page, whereas I assume PHP over HTTP can navigate to many more web pages in the same amount of time. But, in certain situations, my method will let you log into some sites that the PHP method just can't handle.

    Your first (second if you count your introduction, which you also advertised in) post and you're already advertising your YouTube video... :banghead: