[technical question] How do existing tools access websites without browsers?

Discussion in 'Social Networking Sites' started by keryanJames, Jul 23, 2014.

  1. keryanJames

    keryanJames Registered Member

    Joined:
    Jul 24, 2013
    Messages:
    93
    Likes Received:
    18
    Hi guys!

    I was wondering how tools like Scrapebox can gather information from websites like Google. I would normally expect these kinds of tools to behave like browsers. I know that in Java you can use HtmlUnit (which is not perfect for scraping or accessing websites if we think in terms of BlackHat), but I do not know what C (or C++, C#) programs use.

    In conclusion, could you please give some hints concerning the current architecture needed to build scraping / social network software?
     
  2. Tanckom

    Tanckom Power Member

    Joined:
    May 4, 2014
    Messages:
    569
    Likes Received:
    172
    Location:
    ☯ Karma ☯
    Browsers are software built around visual presentation: they give the user a window in which to perform actions and see what those actions have done. Programs like Scrapebox could also show you a visual browser, but that would take a lot of memory and CPU and would slow things down, which is why there is no window. You can add a window to any bot to show the web actions the bot performs.
     
  3. keryanJames

    keryanJames Registered Member

    Joined:
    Jul 24, 2013
    Messages:
    93
    Likes Received:
    18
    I understand, but then they use some specific API to access websites, don't they? I am curious to know which ones are the most effective depending on the programming language.
     
  4. DarkPixel

    DarkPixel Jr. VIP Jr. VIP Premium Member

    Joined:
    Oct 4, 2011
    Messages:
    1,328
    Likes Received:
    1,239
    Location:
    ↓↓↓↓
    Bots like Scrapebox use sockets or web requests.


    Sockets: direct communication with the website. Sockets send info, receive info. Done.

    Browsers: direct communication with the website, plus rendering the result, executing JavaScript, showing images, and doing more things in the background.
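
    To make the socket side concrete, here is a minimal Python sketch of "send info, receive info". It is only an illustration and not how Scrapebox itself is written. The function names (build_request, parse_response, fetch), the "example-bot/0.1" user agent, and the example.com host are all made up for this example. It assumes plain HTTP on port 80 and uses HTTP/1.0 so the server closes the connection after replying; HTTPS would additionally need the socket wrapped with the ssl module.

```python
import socket


def build_request(host, path="/", user_agent="example-bot/0.1"):
    """Build the raw bytes of a simple HTTP/1.0 GET request."""
    lines = [
        f"GET {path} HTTP/1.0",
        f"Host: {host}",
        f"User-Agent: {user_agent}",  # hypothetical user agent string
        "Accept: */*",
        "",  # blank line marks the end of the headers
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def parse_response(raw):
    """Split a raw HTTP response into (status code, headers dict, body bytes)."""
    head, _, body = raw.partition(b"\r\n\r\n")
    header_lines = head.decode("iso-8859-1").split("\r\n")
    # Status line looks like: "HTTP/1.0 200 OK"
    status = int(header_lines[0].split(" ", 2)[1])
    headers = {}
    for line in header_lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return status, headers, body


def fetch(host, path="/", port=80, timeout=10.0):
    """Open a TCP socket, send the request, and read until the server closes."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(build_request(host, path))
        chunks = []
        while True:
            chunk = sock.recv(4096)
            if not chunk:  # empty read means the server closed the connection
                break
            chunks.append(chunk)
    return parse_response(b"".join(chunks))


if __name__ == "__main__":
    # Placeholder host for illustration; needs network access to run.
    status, headers, body = fetch("example.com")
    print(status, headers.get("content-type"), len(body))
```

    What comes back is just the status, the headers, and the raw HTML bytes. Nothing is rendered, no JavaScript runs, and no images are downloaded, which is the difference from a browser described above.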