Evading Selenium Detection: The Ultimate Guide

thebotmaker




previous guide
https://www.blackhatworld.com/seo/t...lenium-for-python-browser-automation.1462476/


Evading Selenium detection involves techniques that make automated browsing sessions look like those of a genuine user. Selenium, a popular tool for automating web browsers, is often detected by websites that want to block bots or automated scripts, for reasons such as preventing web scraping, automated testing, or automated purchasing.


Selenium is rather old, but it still has its uses for certain scraping tasks. You can do things with its API that you can't easily do with Puppeteer or Playwright, because they use a different protocol (Selenium speaks the WebDriver protocol, while Puppeteer drives the browser over the Chrome DevTools Protocol).


Understanding Detection Methods


Before diving into evasion techniques, it's crucial to understand how websites detect Selenium-driven browsers. Detection methods may include:


  1. JavaScript Tests: Websites can run JavaScript code to check for properties or behaviors typical of Selenium-controlled browsers.
  2. WebDriver Property: Browsers controlled by Selenium might have a specific navigator.webdriver property set to true.
  3. Browser Fingerprinting: Advanced techniques that analyze browser attributes like fonts, plugins, and even rendering behaviors.
  4. Unusual Interaction Patterns: Rapid, repetitive, or non-humanlike interactions with the website.
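To make the list above concrete, here is a rough sketch of how you could inspect your own session for a few of these signals. The chosen signals, the `automation_hints` and `collect_signals` names, and the thresholds are all assumptions for illustration; real detection scripts check far more and differ from site to site.

```python
# JavaScript that gathers a few properties a detection script might inspect.
# Selenium wraps the snippet in a function, so a top-level `return` is valid.
COLLECT_SIGNALS_JS = """
return {
  webdriver: navigator.webdriver === true,
  pluginCount: navigator.plugins.length,
  languages: navigator.languages || [],
  userAgent: navigator.userAgent
};
"""


def automation_hints(signals):
    """Return human-readable reasons why these signals look automated.

    `signals` is a dict shaped like the object built by COLLECT_SIGNALS_JS.
    The checks are illustrative guesses, not any site's real rules.
    """
    hints = []
    if signals.get("webdriver"):
        hints.append("navigator.webdriver is true")
    if "HeadlessChrome" in signals.get("userAgent", ""):
        hints.append("user agent advertises headless Chrome")
    if signals.get("pluginCount", 0) == 0:
        hints.append("no browser plugins reported")
    if not signals.get("languages"):
        hints.append("navigator.languages is empty")
    return hints


def collect_signals(driver):
    """Run the snippet in a live Selenium session (any WebDriver instance)."""
    return driver.execute_script(COLLECT_SIGNALS_JS)
```

With a running driver you would call `automation_hints(collect_signals(driver))` and see which checks your setup currently trips.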

Evasion Techniques


1. Modifying WebDriver Properties - Simple and well known
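A minimal sketch of the usual approach for Chrome, assuming Selenium 4. The flag and the two experimental options are the ones commonly passed for this purpose; whether they are still enough depends on the Chrome version and on the site. The helper names (`stealth_settings`, `apply_settings`, `make_driver`) are made up for this example.

```python
def stealth_settings():
    """Chrome settings commonly used to hide the automation markers."""
    return {
        # Commonly reported to stop Chrome from setting navigator.webdriver to true.
        "arguments": ["--disable-blink-features=AutomationControlled"],
        "experimental": {
            # Drops the "enable-automation" switch Chrome is normally started with.
            "excludeSwitches": ["enable-automation"],
            "useAutomationExtension": False,
        },
    }


def apply_settings(options, settings):
    """Apply the settings to any object exposing the ChromeOptions methods."""
    for argument in settings["arguments"]:
        options.add_argument(argument)
    for name, value in settings["experimental"].items():
        options.add_experimental_option(name, value)
    return options


def make_driver():
    """Start Chrome with the settings applied (requires Selenium and Chrome)."""
    from selenium import webdriver  # imported here so the helpers work without Selenium

    options = apply_settings(webdriver.ChromeOptions(), stealth_settings())
    return webdriver.Chrome(options=options)
```

After `make_driver()`, running `driver.execute_script("return navigator.webdriver")` is a quick way to check whether the property is still exposed.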



2. User-Agent Spoofing


  • Changing the User-Agent String: Mimic a real browser session by changing the browser's User-Agent string to avoid detection based on known automation tool signatures.
  • Thoughts on this - this is rather basic, but make sure you are using a recent user agent. https://www.useragents.me/
Code:
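from selenium import webdriver
from selenium.webdriver.firefox.options import Options

options = Options()
# Override the User-Agent string Firefox sends; use a recent, real browser value.
options.set_preference("general.useragent.override", "whatever you want")
# Selenium 4 takes an Options object instead of a FirefoxProfile argument.
driver = webdriver.Firefox(options=options)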

3. Handling JavaScript Tests


  • Custom Scripts: Inject scripts that alter or mask JavaScript properties and methods typically used to detect automation.

Replacing CDC string

Code:
“Basically, the way the Selenium detection works, is that they test for predefined JavaScript variables which appear when running with Selenium. The bot detection scripts usually look anything containing word "selenium" / "webdriver" in any of the variables (on window object), and also document variables called $cdc_ and $wdc_. Of course, all of this depends on which browser you are on. All the different browsers expose different things.”


https://stackoverflow.com/questions...-selenium-with-chromedriver/41220267#41220267


My thoughts


I don’t think this is all that useful, as websites have other ways of detecting Selenium. Nevertheless, I have included it.
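For completeness, the replacement described in the linked answer boils down to swapping the `cdc_` prefix inside the chromedriver binary for another string of the same length. The sketch below only shows that byte-level idea; the function names and paths are hypothetical, and whether a given chromedriver build still contains the marker is an assumption you would need to check yourself.

```python
import random
import string

MARKER = b"cdc_"  # prefix the quoted answer says detection scripts look for


def random_marker(length=len(MARKER), rng=random):
    """Return `length` random lowercase letters as bytes."""
    letters = (rng.choice(string.ascii_lowercase) for _ in range(length))
    return "".join(letters).encode("ascii")


def patch_marker(data, replacement):
    """Replace every MARKER in `data`, keeping the total size unchanged."""
    if len(replacement) != len(MARKER):
        # A different length would shift every offset in the binary.
        raise ValueError("replacement must be the same length as the marker")
    return data.replace(MARKER, replacement)


def patch_file(src_path, dst_path):
    """Write a patched copy of the binary at src_path to dst_path."""
    with open(src_path, "rb") as source:
        data = source.read()
    with open(dst_path, "wb") as target:
        target.write(patch_marker(data, random_marker()))
```

Working on a copy rather than in place keeps the original driver around in case the patched one refuses to start.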


4. Randomizing Interaction Patterns


  • Simulating Human Behavior: Use random delays, mouse movements, and scroll actions to mimic human interactions.

Sleep function

Code:
import random
import time



def random_sleep(min_seconds, max_seconds):
    """Sleep for a random duration between min_seconds and max_seconds."""
    sleep_duration = random.uniform(min_seconds, max_seconds)
    print(f"Sleeping for {sleep_duration:.2f} seconds...")
    time.sleep(sleep_duration)

My thoughts


Definitely useful: websites have rate limiting in place. You want to blend in as much as possible, and doing too much too quickly is suspicious.
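The same idea applies to typing. Here is a small sketch that sends text one character at a time with an uneven pause between keystrokes. The timing values (`base`, `jitter`, the 0.02 second floor) are guesses rather than measured human data, and `type_like_human` is a made-up helper name; `element` is any Selenium WebElement.

```python
import random
import time


def keystroke_delays(text, base=0.12, jitter=0.08, rng=random):
    """Return one pause (in seconds) per character of `text`.

    Pauses are drawn from a normal distribution around `base` and clamped
    so they never drop below 0.02 seconds.
    """
    return [max(0.02, rng.gauss(base, jitter)) for _ in text]


def type_like_human(element, text, sleep=time.sleep):
    """Send `text` to `element` character by character with random pauses."""
    for char, delay in zip(text, keystroke_delays(text)):
        element.send_keys(char)
        sleep(delay)
```

Passing `sleep` as a parameter makes the helper easy to test without actually waiting.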


5. Using Headless Browsers Wisely


  • Avoid or Customize Headless Mode: Headless browsers are easily detected; if used, customize their options to mimic a non-headless browser, or better yet do not use a headless browser at all.

My thoughts


It’s well known how easy it is to detect headless browsers. It is better to use a headful browser, or to automate the website's requests directly.


6. Proxy Rotation and IP Management


  • Rotating Proxies: Use multiple IP addresses to avoid rate limits and IP bans that suggest automated access.

https://www.blackhatworld.com/seo/f...automated-in-10-seconds-with-1-click.1252694/


https://www.blackhatworld.com/seo/quick-guide-proxy-basics-101-for-noobs.1124640/


Thoughts - there are various types of proxies. If I’m honest, residential proxies are enough for social media botting.
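A rough sketch of round-robin rotation, assuming you already have a list of proxy URLs. The addresses below are documentation placeholders, not working proxies, and the helper names are invented for the example. `--proxy-server` is a standard Chrome command-line switch; proxies that need a username and password generally require extra handling that is not shown here.

```python
from itertools import cycle


def proxy_rotation(proxies):
    """Return an endless iterator that walks the proxy list in order."""
    if not proxies:
        raise ValueError("at least one proxy is required")
    return cycle(proxies)


def chrome_proxy_argument(proxy):
    """Build the Chrome switch that routes traffic through `proxy`."""
    return f"--proxy-server={proxy}"


# Placeholder addresses from a range reserved for documentation.
PROXIES = ["http://203.0.113.10:8080", "http://203.0.113.11:8080"]
rotation = proxy_rotation(PROXIES)
```

Each new browser session would then take `chrome_proxy_argument(next(rotation))` and pass it to `options.add_argument(...)` before the driver starts.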


7. WebGL and Canvas Fingerprinting



8. Employing Browser Automation Frameworks


  • Tools like Puppeteer with stealth plugins or Playwright offer more sophisticated methods for mimicking genuine browser behavior.

https://pptr.dev/
https://playwright.dev/

Undetected chromedriver


https://github.com/ultrafunkamsterdam/undetected-chromedriver

It's not perfect, and I wouldn't use it in production for big tasks.
But for the time being, it's a fantastic tool; all you have to do is:

pip install undetected-chromedriver


Then work with it just as you would with regular Selenium:
Code:
import undetected_chromedriver as uc  # in current releases the former "v2" API is the main module

options = uc.ChromeOptions()

# Setting the profile directory
options.user_data_dir = "c:\\temp\\profile"

# Another way to set the profile (this one takes precedence if both variants are used)
options.add_argument("--user-data-dir=c:\\temp\\profile2")

# Options to skip annoying popups, each passed as its own argument
options.add_argument("--no-first-run")
options.add_argument("--no-service-autorun")
options.add_argument("--password-store=basic")

driver = uc.Chrome(options=options)

with driver:
    driver.get("https://nowsecure.nl")  # known URL using Cloudflare's "under attack mode"

Pro Tip


This chromedriver isn't a magic bullet for hiding all your bots!


You'll still need to follow the other tips in this article, such as a realistic page flow, changing your IP, not using a headless browser, cookies, and so on.
However, if you are using Python, this is a great place to start in order to keep your bot undetected, at least on the technical chromedriver side.


Anti detect browsers

Anti-detect browsers are specialized web browsers designed to minimize or entirely avoid detection by websites, web servers, and online services. These browsers are primarily used for privacy, security, and anonymity purposes. Here's how they work and why they're used:


  1. Evasion of Tracking and Profiling: Anti-detect browsers employ various techniques to avoid being tracked or profiled by websites and online services. This includes blocking cookies, preventing browser fingerprinting, and disabling features that can be used to uniquely identify the user, such as JavaScript or WebGL.
  2. Anonymity and Privacy: These browsers often route web traffic through proxy servers or VPNs to mask the user's IP address and location. By doing so, they help users maintain their anonymity and privacy while browsing the internet.
  3. Circumventing Restrictions: Anti-detect browsers can be used to bypass geo-blocks, censorship, or access restrictions imposed by websites or governments. By hiding the user's true location and identity, they enable access to content that may be restricted in certain regions.
  4. Security: Some anti-detect browsers incorporate security features to protect users from online threats such as malware, phishing attacks, and tracking scripts. They may include built-in ad blockers, anti-tracking tools, and encryption features to enhance security while browsing.

I’ll be real, use one of these in order to save time.


Dolphin Browser

Incogniton

Linken Sphere



There are many more; these are all that come to mind at the moment.


Phone based automation


For social media, not worth it at scale unless you plan to employ a phone farm
May have a guide on this at some point, Android only ;)


Implementation Tips


  • Stay Updated: Detection methods and evasion techniques evolve rapidly. Regularly update your methods and tools.
  • Test Regularly: Regularly test your setup against bot-detection services or CAPTCHA challenges. In practice this means writing tests (see https://realpython.com/python-testing/) that send your bot to a certain page and pass only if something you expect appears on that page. A passing test suggests the website is not blocking you.
  • I never used to write tests. If I had, they would have accelerated my workflow, but I didn't see the point!
  • Real-world example: take a look at https://www.seleniumeasy.com/python/selenium-webdriver-unittest-example
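A minimal sketch of such a test using Python's built-in `unittest`. The phrases in `BLOCK_MARKERS` are assumptions about what a block or challenge page might contain, so adjust them to whatever your target site actually shows. The sample HTML strings stand in for `driver.page_source`.

```python
import unittest

# Assumed phrases that suggest a block or challenge page; tune per site.
BLOCK_MARKERS = ("captcha", "access denied", "unusual traffic")


def looks_blocked(page_source):
    """Return True if the page text contains any of the block markers."""
    text = page_source.lower()
    return any(marker in text for marker in BLOCK_MARKERS)


class BlockCheckTest(unittest.TestCase):
    def test_normal_page_passes(self):
        # In a real run this string would be driver.page_source after driver.get(...).
        self.assertFalse(looks_blocked("<html><h1>Welcome back</h1></html>"))

    def test_challenge_page_is_flagged(self):
        self.assertTrue(looks_blocked("<html>Please complete the CAPTCHA</html>"))
```

Run it with `python -m unittest` from the folder containing the file, and schedule it regularly so you notice when a site starts blocking you.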
Code:
For a modern bot to have any chance of avoiding detection, it needs to be written using Puppeteer or Playwright and use evasion libraries. You'll need to handle browser fingerprinting with a service such as Multilogin and you'll need to simulate realistic human behavior. All of this is a huge endeavor, but doable if you're willing to put in the time. Learn in this order:



    JavaScript
    Node.js
    Puppeteer


If you need to hide the fact that you're a bot, then you'll need to continue with:



    Evasion libraries, such as puppeteer-extra-plugin-stealth
    Human behavior libraries, such as ghost-cursor


If you need to appear as though you are many different users, then also learn about:



    Browser fingerprinting


    Python (general purpose language)
    Django (web framework)
    MySQL (database language)
    Selenium / PhantomJS (web driver for headless automation)
    CSS / HTML / JS (front end)
    Linux

https://www.blackhatworld.com/seo/h...arn-browser-automation.1311931/#post-14251266


Resources


  • StackOverflow: A treasure trove for specific coding questions and solutions. Search for Selenium detection evasion techniques.
  • GitHub: Look for repositories related to Selenium, Puppeteer, and Playwright for community-developed evasion plugins or scripts.
  • Browser Automation Framework Documentation: Refer to official documentation of Selenium, Puppeteer, and Playwright for updates on evasion techniques.

Conclusion


Evading Selenium detection requires a multi-faceted approach. The landscape of web automation and detection is constantly evolving, necessitating ongoing research and adaptation.


It's not worth the time to make Selenium undetectable. Buy a bot that is already out there, use an anti-detect browser, use the private API of the site you are targeting, or, better yet, use manual labor.


Radical Resource list - take what you will
Code:
Bulk Downloader for Reddit



https://github.com/aliparlakci/bulk-downloader-for-reddit



AutoX



https://github.com/kkevsekk1/AutoX



YouTube Viewer



https://github.com/MShawon/YouTube-Viewer



Public APIs



https://github.com/public-apis/public-apis#health



GoRod - Web Automation Library



https://go-rod.github.io/#/get-started/README



Awesome Remote Job



https://github.com/lukasz-madon/awesome-remote-job#others



Book: "Hands-On Enterprise Automation with Python"



https://www.amazon.com/gp/product/1491985046/ref=as_li_ss_tl?ie=UTF8&linkCode=sl1&tag=piprogramming-20&linkId=f570c87f11328e8906142cdd1db1acaf



Node Worker Farm



https://github.com/rvagg/node-worker-farm



FakeBrowser


https://github.com/kkoooqq/fakebrowser



Scrapy Cloudflare Middleware



https://github.com/clemfromspace/scrapy-cloudflare-middleware


Puppeteer Extra - Beginner's Guide



https://github.com/berstend/puppeteer-extra/wiki/Beginner:-I'm-new-to-scraping-and-being-blocked#youre-using-selenium



TikTok Puzzle Captcha Solver



https://github.com/toandev95/tiktok-puzzle-captcha-solver/blob/main/example.js



Python Readability

https://github.com/buriy/python-readability


Reddit Video Maker



https://github.com/elebumm/RedditVideoMakerBot



Captcha Harvester



https://github.com/NoahCardoza/CaptchaHarvester



Async/Await Talk by Wes Bos



https://wesbos.github.io/Async-Await-Talk/#1



Reverse Engineering APIs from Android Apps



https://medium.com/@thomas_shone/reverse-engineering-apis-from-android-apps-part-1-ea3d07b2a6c



Telegram Bot Deployment with AWS Lambda



https://hackernoon.com/how-to-easily-deploy-telegram-bot-using-aws-lambda-kj6e35c4



Telegram Bot Video Sales System



https://www.blackhatworld.com/seo/method-my-video-bot-100k-per-month-global-post-pandemic-strategy-email-seo-video-sales-system-newbie-friendly-advanced-tutorial-2021.1297690/



Medium Article: "Using Node.js, Puppeteer, and ElectronJS to Create a Web Scraping App with a Desktop Interface"



https://medium.com/@alexanderruma/using-node-js-puppeteer-and-electronjs-to-create-a-web-scraping-app-with-a-desktop-interface-668493ced47d



Is It a Real Email



https://isitarealemail.com/



Medium Article: "Scraping Images and Text from YouTube Videos with Node.js, Puppeteer, and Google Cloud Vision API"



https://medium.com/@djung31/scraping-images-and-text-from-youtube-videos-with-node-js-puppeteer-and-google-cloud-vision-api-3096863f486e



Onextrapixel Article: "10 Principles for Keeping Your Programming Code Clean"



https://onextrapixel.com/10-principles-for-keeping-your-programming-code-clean/



Public APIs (Crypto Wallets)



https://github.com/n0shake/Public-APIs#cryptocurrencycrypto-wallets



BlackHatWorld Forum Post: "Scrapper for an API"



https://www.blackhatworld.com/seo/scrapper-for-an-api.1347423/#post-14594845



Towards Data Science Article: "The Right Way to Build an API with Python"



https://towardsdatascience.com/the-right-way-to-build-an-api-with-python-cd08ab285f8f



BlackHatWorld Forum Post: "100 Free Hourly Fresh HTTP(s) Proxies - Updated TXT File Every Hour - Get Per Request! No API Needed"



https://www.blackhatworld.com/seo/100-free-hourly-fresh-http-s-proxies-updated-txt-file-every-hour-get-per-request-no-api-needed.1376224/



Chain.so - Litecoin Transaction



https://chain.so/tx/LTC/3d7c1a0ed0dbc34433541d179849e31d5f3288b43360eb9a3931a1e8db90a7ac



RapidAPI - TikTok Downloader Pricing



https://rapidapi.com/maatootz/api/tiktok-downloader-download-tiktok-videos-without-watermark/pricing



RapidAPI - Twitter32 Pricing



https://rapidapi.com/socialminer/api/twitter32/pricing



RapidTables - CSS Color Blue



https://www.rapidtables.com/web/css/css-color.html#blue



BlackHatWorld Forum Post: "Help: Cloaking for Googlebot"



https://www.blackhatworld.com/seo/help-cloaking-for-googlebot.1312334/#post-14427014



Reddit Post: "Offer: I Can Make Any Bot or Data Analysis Tool"

[MEDIA=reddit]slavelabour/comments/r520ov[/MEDIA]



Stack Overflow Question: "When I Catch an Exception, How Do I Get the Type, File, and Line Number?"



https://stackoverflow.com/questions/1278705/when-i-catch-an-exception-how-do-i-get-the-type-file-and-line-number


BlackHatWorld Forum Post: "Server Recommendations for Instagram Webapp"


https://www.blackhatworld.com/seo/server-recommendations-for-instagram-webapp.1056113/#post-11373031


Community Post on BabloSoft: "Optimization of OS for Multithreading"



https://community.bablosoft.com/topic/17458/оптимизация-ос-под-многопоточность/2



Community Post on BabloSoft: "Development Methodologies for Newcomers in the Context of BAS"


https://community.bablosoft.com/topic/3540/методологии-разработки-для-новичков-в-контексте-bas/2


DebugJS Website


https://debugjs.net/


Community Post on BabloSoft: "Life Hacks in BAS"

https://community.bablosoft.com/topic/3521/лайфхаки-bas/102


Playwright Documentation - Open Playwright Inspector

https://playwright.dev/python/docs/inspector#open-playwright-inspector


Selenium Wire on PyPI


https://pypi.org/project/selenium-wire/#installation



BlackHatWorld Forum Post: "How to Code Right Now: Advice to a New Programmer"



https://www.blackhatworld.com/seo/how-to-code-right-now-advice-to-a-new-programmer.923056/


Chance.js Website


https://chancejs.com/index.html



BlackHatWorld Forum Post: "Virtual Environment for Bots"



https://www.blackhatworld.com/seo/virtual-enviroment-for-bots.1329452/



BlackHatWorld Forum Post: "Journey to $10,000 Passive Income Using Free Traffic"



https://www.blackhatworld.com/seo/journey-to-10-000-passive-income-using-free-traffic.1316912/page-8



BlackHatWorld Forum Post: "Generating Half a Million Pages and Making Money From It - Experiment with Spintax and Local Lead Generation"



https://www.blackhatworld.com/seo/generating-half-a-million-pages-and-making-money-from-it-experiment-with-spintax-and-local-lead-generation.1356936/



CodePen Post: "Cracking CAPTCHAs with Neural Networks"


https://codepen.io/birjolaxew/post/cracking-captchas-with-neural-networks


https://www.selenium.dev/documentation/en/



https://www.stackoverflow.com



https://www.geeksforgeeks.org/selenium-python-tutorial/



https://selenium-python.readthedocs.io/



https://www.chris-wells.net/articles/2017/09/01/consistent-selenium-testing/



https://www.trickster.dev/



https://botwiki.org/



https://www.blackhatworld.com/seo/how-to-make-killer-bots-for-fun-and-profit.760914/



https://www.blackhatworld.com/seo/h...if-i-want-to-learn-browser-automation.1311931



https://www.blackhatworld.com/seo/is-python-good-for-coding-a-bot.1434214/


Upcoming guides


Sniffing website requests - how to make bots using the private APIs of websites

Multi threading - how to run lots of browser instances from one computer

Phone automation - automating Tiktok, Instagram via Android

Browser automation studio - the ultimate tool for automation and guides
 
Great guide, also waiting for Multi threading - how to run lots of browser instances from one computer.
 
Great read, thx for the share.
No problem.
Wonderful post, I learned a lot from the links you shared. currently I am using Playwright and Puppeteer.
Great, let me know how it goes!
Great guide, also waiting for Multi threading - how to run lots of browser instances from one computer.
Stay tuned!
Let me know what other guides you guys want to see.
Great guide. Thanks
Thanks mate.
 
Thank you for sharing!
Great guide and great collection of resources, saving and voting for the next.
 
Great insights and details from what seems a solid experience in the field of "botting". Thanks for the thread, as well as for the first one on the topic of selenium.

Needless to say, you know that selenium is not the best tool for many botting tasks, and the next guide on " how to make bots using the private APIs of websites" is HIGHLY anticipated (at least for me). I'm a newbie in this area and I am having lots of issues with some of my login requests. Would love to hear your thoughts about this and some tips and tricks also.
 
No doubt the best ever guide I read on browser automation for botting.
Would love to see more guides from you!
Thanks a lot for compiling all these useful resources and information together.
 
I'm learning JS and hope to make my first puppeteer bot soon.
Thanks for the tips.
 
I use puppeteer with stealth plugin and have added args to run it in the background. Works like headless.
 
I use puppeteer with stealth plugin and have added args to run it in the background. Works like headless.
Does that works on a Linux server minimized?
Also, which programming language do you use to write code in?
 
the best way to stay updated and aware of the latest technologies isn't StackOverflow GitHub etc...

personally, I read white papers and the latest technologies being discovered

there is a search engine google dedicated to this

here you will gain a deep and solid understanding of detection techniques used by any social media
 
Does that works on a Linux server minimized?
Also, which programming language do you use to write code in?
No idea what is linux server exactly. If it has UI, it will probably work. However, I think it won't have Windows or Mac useragent which are common. I would recommend Mac instead of Linux. Windows good too.

Node.js puppeteer

the best way to stay updated and aware of the latest technologies isn't StackOverflow GitHub etc...

personally, I read white papers and the latest technologies being discovered

there is a search engine google dedicated to this

here you will gain a deep and solid understanding of detection techniques used by any social media
Stackoverflow is quora and reddit of programming. That doesn't sound good.

But there are some code snippets that you wouldn't find elsewhere.

I have learned more from chatgpt though.
 
We were scraping Bet365 website with great protection against selenium/undetected-chromedriver. We found out that only less known browser automation called playwright was working. Hope it will help you as well.
 
Anyone tried pyppeteer python? Is that good?
I'm learning Python and I thinking moving from Selenium to pyppeteer or playwright

Also I have question: that I'm using AdsPower, how to edit CDC string on webdriver because I think they used their modified webdriver.
 
Anyone tried pyppeteer python? Is that good?
I'm learning Python and I thinking moving from Selenium to pyppeteer or playwright

Also I have question: that I'm using AdsPower, how to edit CDC string on webdriver because I think they used their modified webdriver.
I recently moved from Selenium to Playwright and all i can say is that i will never look back. (Way) better documentation, no need to wait for elements (as it does that already under the hood), more readable code in general and more stable. You should definitely try that out!
 