
Requests vs Selenium

Discussion in 'Black Hat SEO Tools' started by JackTheRooster, Feb 8, 2019.

Thread Status:
Not open for further replies.
  1. JackTheRooster

    JackTheRooster Newbie

    Joined:
    Apr 19, 2018
    Messages:
    49
    Likes Received:
    9
    Gender:
    Male
    Hello all!

    I've been scripting for about a year, and have been doing it only via Python requests. It works, but I can't help but wonder if a headless browser is where it's at. The problem is that it uses a ton more resources, so it is more expensive to host.

    Is this a valid concern? Should I stick with requests? How do you guys do it?
     
  2. handmadebots

    handmadebots Senior Member

    Joined:
    Nov 8, 2012
    Messages:
    1,036
    Likes Received:
    235
    Always stick with requests when you can.
    Faster, less resource intensive, multithreading won't be an issue, etc.
    One downside of requests is that JS won't be executed, whereas a real browser (Selenium) does run it.
     
    • Thanks Thanks x 2
  3. boldrack

    boldrack Junior Member

    Joined:
    Apr 12, 2016
    Messages:
    136
    Likes Received:
    37
    Gender:
    Male
    Huh!? You can't scrape a heavily JavaScript-dependent site with requests .. something like a casino site that does real-time updates.
    You need a headless browser driven by something like Selenium.

    But in my case .. "requests" first .. selenium is a last resort.
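
    A minimal sketch of the "requests first, Selenium as a last resort" idea, under stated assumptions: the function names (fetch_static, fetch_rendered, fetch_page) and the marker check are hypothetical, and "is the text I want in the raw HTML?" is only one possible way to decide that a page needs rendering.

```python
def fetch_static(url):
    """Plain HTTP fetch; no JavaScript is executed."""
    import requests  # third-party: pip install requests

    response = requests.get(url, timeout=10)
    response.raise_for_status()
    return response.text


def fetch_rendered(url):
    """Load the page in headless Chrome so its JavaScript runs."""
    from selenium import webdriver  # third-party: pip install selenium

    options = webdriver.ChromeOptions()
    options.add_argument("--headless")
    driver = webdriver.Chrome(options=options)
    try:
        driver.get(url)
        return driver.page_source
    finally:
        driver.quit()  # always release the browser process


def fetch_page(url, marker, static=fetch_static, rendered=fetch_rendered):
    """Try the cheap fetch first; render only if `marker` is missing.

    `marker` is an assumed piece of text that only appears once the
    content you care about is present in the HTML.
    """
    html = static(url)
    if marker in html:
        return html, "requests"
    return rendered(url), "selenium"
```

    The static and rendered fetchers are passed in as parameters so the decision logic can be tried with stand-in functions before pointing it at a real site.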
     
    • Thanks Thanks x 1
  4. jamie3000

    jamie3000 Jr. VIP Jr. VIP Premium Member

    Joined:
    Jun 30, 2014
    Messages:
    3,150
    Likes Received:
    1,653
    Location:
    uk
    I agree with the dudes above. Requests first, render page only if you have to.
     
    • Thanks Thanks x 1
  5. JackTheRooster

    JackTheRooster Newbie

    Joined:
    Apr 19, 2018
    Messages:
    49
    Likes Received:
    9
    Gender:
    Male
    Thank you! How do you deal with all the HTTP calls that the Javascript fires off? Often these pull in info for the user experience. Sure, many of them are useless in terms of your end goal, but is it important to emulate them? I.e., will the site give a damn if you don't make all those useless (from your perspective) calls?

    I've noticed that quite a few sites have you send a post request with fingerprint info. Is it necessary to emulate this, or will the site simply assume you have Javascript turned off?
     
  6. handmadebots

    handmadebots Senior Member

    Joined:
    Nov 8, 2012
    Messages:
    1,036
    Likes Received:
    235
    It really depends on what your goal is.

    True, usually the auxiliary stuff is rendered, but not necessarily.
    The site may or may not give a damn about those useless requests. In most cases it doesn't, but I could build you a prototype website that checks for them
    and gets you blocked if they aren't made.

    For the fingerprinting, same as above: some care, some don't. These days, if JS is turned off in the browser, most site functionality won't work, and devs don't bother much with that case
    anymore since most devices can run a browser with JS.
     
    • Thanks Thanks x 1
  7. graham25s

    graham25s Regular Member

    Joined:
    May 8, 2010
    Messages:
    457
    Likes Received:
    84
    Gender:
    Male
    Location:
    Scotland
    Home Page:
    I used to only do requests. Having to use Fiddler to see what was getting passed was a pain when it was fairly elaborate, as was having to emulate what was sent. Then I thought, why not use a headless browser (a browser is what sites want to see coming)? It's not as fast as requests, but you can see what is going on better, and it takes care of all the JavaScript loading issues. I use:

    1) - https://www.katalon.com/resources-center/blog/katalon-automation-recorder/

    This addon records your actions on any site, then you can export in C#/Java/Python your recorded actions which is pretty neat and saves a bit of time.
     
    • Thanks Thanks x 1
  8. JackTheRooster

    JackTheRooster Newbie

    Joined:
    Apr 19, 2018
    Messages:
    49
    Likes Received:
    9
    Gender:
    Male
    Yes, that's about what I figured. It seems that you don't have to be perfect, just close enough.
     
  9. JackTheRooster

    JackTheRooster Newbie

    Joined:
    Apr 19, 2018
    Messages:
    49
    Likes Received:
    9
    Gender:
    Male
    Thank you for the Katalon recommendation. Will def check it out as exporting exact actions is awesome.
     
    • Thanks Thanks x 1
  10. 夢市片

    夢市片 Registered Member

    Joined:
    Aug 18, 2016
    Messages:
    52
    Likes Received:
    11
    Gender:
    Male
    Something I'm wondering, as I've been only using Selenium for now.
    To automate anything on YouTube, as I don't see the elements I want to play with rendered, does it mean that I won't be able to use requests?
     
  12. graham25s

    graham25s Regular Member

    Joined:
    May 8, 2010
    Messages:
    457
    Likes Received:
    84
    Gender:
    Male
    Location:
    Scotland
    Home Page:
    I have not automated YouTube yet, but I would say requests are definitely out: too much JavaScript and too many hidden frames going on. Selenium would be the best way, I would say :)

    regards
     
  13. Gogol

    Gogol Jr. VIP Jr. VIP

    Joined:
    Sep 10, 2010
    Messages:
    5,290
    Likes Received:
    4,979
    Gender:
    Male
    Occupation:
    Programmer
    Location:
    Pale Blue Dot
    Home Page:
    I don't think the script taking up "lots of resources" is a valid concern. Hardware is generally cheaper than it ever was. If the program uses too much memory, it might be a problem with the settings or the code: either there's a memory leak or there are too many threads.
     
  14. JackTheRooster

    JackTheRooster Newbie

    Joined:
    Apr 19, 2018
    Messages:
    49
    Likes Received:
    9
    Gender:
    Male
    "Lots of resources" is relative. I agree, memory leaks should be plugged before throwing your hands up and getting more hardware.

    Emulating a browser is always going to take far more resources than simply doing a requests call. With requests, you can run hundreds of threads on a Raspberry Pi. With selenium, you're lucky to get 10.
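
    A rough sketch of what running many requests threads can look like, using the standard library's thread pool. The names fetch_status and fetch_all are hypothetical, and max_threads=100 is an illustrative assumption rather than a measured limit for any particular machine.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_status(url):
    """Fetch one URL with requests and return its HTTP status code."""
    import requests  # third-party: pip install requests

    return requests.get(url, timeout=10).status_code


def fetch_all(urls, worker=fetch_status, max_threads=100):
    """Run `worker` over `urls` concurrently; results keep input order."""
    with ThreadPoolExecutor(max_workers=max_threads) as pool:
        return list(pool.map(worker, urls))
```

    Each thread here mostly waits on the network, which is why plain HTTP calls scale to large thread counts far more cheaply than one browser instance per thread.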
     
  15. Gogol

    Gogol Jr. VIP Jr. VIP

    Joined:
    Sep 10, 2010
    Messages:
    5,290
    Likes Received:
    4,979
    Gender:
    Male
    Occupation:
    Programmer
    Location:
    Pale Blue Dot
    Home Page:
    Yupp agreed. This is why it is important to choose the right tool for the job. A hammer can break eggs but that's not always necessary. ;)
     
    • Thanks Thanks x 1
  16. JackTheRooster

    JackTheRooster Newbie

    Joined:
    Apr 19, 2018
    Messages:
    49
    Likes Received:
    9
    Gender:
    Male
    I'm convinced. I'll go with a mixed approach because I've wasted enough time deciphering Javascript that's been minified trying to figure out wtf is going on.
     
    • Thanks Thanks x 1
  17. showmaker

    showmaker Newbie

    Joined:
    Mar 7, 2014
    Messages:
    31
    Likes Received:
    3
    Gender:
    Male
    Occupation:
    Developer, Internet Marketer
    Another possible solution is to write a browser extension. Works like a charm.
     
  18. 夢市片

    夢市片 Registered Member

    Joined:
    Aug 18, 2016
    Messages:
    52
    Likes Received:
    11
    Gender:
    Male
    Browser extensions would just take the same amount of resources as Selenium, right?
    Why add an additional layer when Selenium could do the job?
     
  19. Gogol

    Gogol Jr. VIP Jr. VIP

    Joined:
    Sep 10, 2010
    Messages:
    5,290
    Likes Received:
    4,979
    Gender:
    Male
    Occupation:
    Programmer
    Location:
    Pale Blue Dot
    Home Page:
    Exactly. Browser extension might be an easy solution, but it's not an effective one imho.
     
  20. JackTheRooster

    JackTheRooster Newbie

    Joined:
    Apr 19, 2018
    Messages:
    49
    Likes Received:
    9
    Gender:
    Male
    You can't run extensions when Chrome is in headless mode. Also, you can't use authed proxies easily either.
     