Scrapebox - Internal Java Browser

Discussion in 'Black Hat SEO Tools' started by jb2008, Jan 2, 2011.

  1. jb2008

    jb2008 Senior Member

    Joined:
    Jul 15, 2010
    Messages:
    1,158
    Likes Received:
    972
    Occupation:
    Scraping, Harvesting in the Corn Fields
    Location:
    On my VPS servers
    I've read in the forums that Scrapebox operates with its own internal Java-based browser. I'm not a programmer, so forgive me if this is a silly question, but why doesn't it use some kind of HTTP mimic instead of making its own browser? Of course I know it would be stupid to use IE, though it has its uses in the slow poster, but what's wrong with using some kind of HTTP mimic when querying?
     
  2. JesusBack

    JesusBack Executive VIP Premium Member

    Joined:
    Sep 15, 2010
    Messages:
    1,159
    Likes Received:
    1,284
    Occupation:
    Almost done :D
    Location:
    {calm|cool|collected}
    Because it's extremely close to a real browser, so it'll have a higher success rate against JS defenses, among others.
     
  3. pyronaut

    pyronaut Executive VIP

    Joined:
    Dec 9, 2008
    Messages:
    1,229
    Likes Received:
    1,422
    I am really not sure if it is a true custom browser or just an IE window in a new skin. But I can speak from my own experience with posting WordPress comments.

    There is a plugin called "Cookies For Comments" that gives you a cookie based on running a piece of JavaScript. Plain HTTP requests or sockets do not execute JavaScript on their own. Thus they will never get the cookie, and can never post a comment.

    Using a browser-based solution will execute the JavaScript and post the comment. Of course it is slower, but it is a more thorough process.
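
To illustrate the point about cookies, here is a minimal sketch of why a plain HTTP client never picks up a cookie that is only set by a script. It does not touch any real site or plugin: the headers, the page body, and the cookie names `session` and `js_check` are all invented for illustration, and the helper `cookies_seen_by_http_client` is hypothetical.

```python
from http.cookies import SimpleCookie

# Hypothetical response, invented for illustration: one cookie arrives in a
# header, the other would only exist if the inline script were executed.
RESPONSE_HEADERS = [("Set-Cookie", "session=abc123; Path=/")]
RESPONSE_BODY = """<html><body>
<script>document.cookie = "js_check=passed; path=/";</script>
</body></html>"""


def cookies_seen_by_http_client(headers):
    """Collect cookies the way a plain HTTP client does: from headers only."""
    jar = SimpleCookie()
    for name, value in headers:
        if name.lower() == "set-cookie":
            jar.load(value)
    return {key: morsel.value for key, morsel in jar.items()}


cookies = cookies_seen_by_http_client(RESPONSE_HEADERS)
print(cookies)                # only the header cookie is present
print("js_check" in cookies)  # the script-set cookie never shows up
```

The script text is downloaded as part of the body, but nothing runs it, so the `js_check` cookie is never created. A browser or a headless browser with JavaScript enabled would execute the script and end up with both cookies.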
     
  4. jb2008

    jb2008 Senior Member

    That's interesting, pyro. It sounds plausible that a browser would be required to post comments if cookies are required.

    I am more concerned with the harvesting part of SB, however. Why is a JavaScript browser required for queries to Google? Of course using the API is not recommended for many reasons, but an HTTP mimic of some sort sounds like it could work. Or does G have some sort of defense against this? I am fascinated by Scrapebox. Having seen many URL harvesters on the internet and how much they struggle at scraping, the seamless ease with which Scrapebox harvests is quite amazing. Given the difficulty all these other pieces of software have, I am left wondering how on earth SB does it!
     
  5. kalrudra

    kalrudra BANNED BANNED

    Joined:
    Oct 29, 2010
    Messages:
    271
    Likes Received:
    300
    I have been a C# programmer for 8 years...

    You don't need a web browser to send comments!!

    You can do everything with HTTP request/response classes.

    But it's a little complicated with JavaScript execution (you have to work out manually what the script does...).

    I don't think comments require JavaScript... If there is any JavaScript, it's just for input validation, nothing more.

    my 2 cents.. :)
     
    Last edited: Jan 3, 2011
  6. cooooookies

    cooooookies Senior Member

    Joined:
    Oct 6, 2008
    Messages:
    1,008
    Likes Received:
    216
    Yep. I use HtmlUnit, a Java library for a headless browser. Maybe this is also used by Scrapebox. Blog commenting does not require JS execution, therefore I disable it :)

    Even faster than relying on a headless browser is working directly with HTTP requests, or to be more specific, blindly sending POST requests. But in some cases you want to parse the DOM tree of HTML tags, to name the most important one. I'd say there is a trade-off between traffic and CPU usage in doing so.
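
The difference between a blind POST and parsing the page first can be sketched with Python's standard `html.parser`. This is only an illustration under stated assumptions: the markup in `SAMPLE_HTML`, the field names, and the `FormFieldParser` class are invented here and do not come from Scrapebox or HtmlUnit.

```python
from html.parser import HTMLParser

# Hypothetical form markup, invented for illustration.
SAMPLE_HTML = """
<form action="/submit" method="post">
  <input type="text" name="author">
  <input type="hidden" name="token" value="xyz">
  <textarea name="message"></textarea>
</form>
"""


class FormFieldParser(HTMLParser):
    """Collect the form action and the names of its fields."""

    def __init__(self):
        super().__init__()
        self.action = None
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form":
            self.action = attrs.get("action")
        elif tag in ("input", "textarea") and "name" in attrs:
            # Keep any preset value, such as the one on a hidden field.
            self.fields[attrs["name"]] = attrs.get("value") or ""


parser = FormFieldParser()
parser.feed(SAMPLE_HTML)
print(parser.action)
print(parser.fields)
```

Parsing costs an extra download of the page plus some CPU time, but it reveals fields such as the hidden `token` that a blind POST built from a fixed template would not know about.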
     
  7. jb2008

    jb2008 Senior Member

    That's what I was thinking. So why does SB use a Java-based browser rather than some kind of HTTP request?