Selenium fingerprint spoofing

Kizianap · Apr 19, 2021

Hello guys,

I'm currently working on a scraper (I'm a newbie). My stack is Python + Selenium with Firefox, running on Ubuntu 18 on an AWS server.
A website blocks me from time to time, roughly 3 times out of 10, and sometimes the reCAPTCHA doesn't load. The site seems to be using Incapsula.

So far I have done the following to my bot:
- human-like behaviors
- rotating user agent
- spoofed HTTP requests
- proxies from ProxyMesh
- random delay timers
- dom.webdriver.enabled set to false
- disabled WebRTC
- other webdriver config tweaks
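
A minimal sketch of how the Firefox-side items in the list above might be applied with Selenium in Python. The user-agent strings, the delay bounds, and the helper names (build_firefox_prefs, random_delay, make_driver) are assumptions for illustration, not the poster's actual setup. Whether Firefox still honours dom.webdriver.enabled depends on the browser version, so treat that line as something to verify.

```python
import random
import time

# Assumed example values; the thread does not give the real user-agent pool.
USER_AGENTS = [
    "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:87.0) Gecko/20100101 Firefox/87.0",
    "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:88.0) Gecko/20100101 Firefox/88.0",
]


def build_firefox_prefs(user_agent):
    """Return the about:config preferences that correspond to the list above."""
    return {
        "dom.webdriver.enabled": False,         # may be ignored by newer Firefox builds
        "media.peerconnection.enabled": False,  # turns WebRTC off in Firefox
        "general.useragent.override": user_agent,
    }


def random_delay(min_s=1.0, max_s=4.0):
    """Sleep for a random interval (assumed bounds) and return the time slept."""
    delay = random.uniform(min_s, max_s)
    time.sleep(delay)
    return delay


def make_driver():
    """Create a Firefox driver with the preferences above applied."""
    # Imported lazily so the helpers above can be used without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.firefox.options import Options

    options = Options()
    prefs = build_firefox_prefs(random.choice(USER_AGENTS))
    for name, value in prefs.items():
        options.set_preference(name, value)
    return webdriver.Firefox(options=options)
```

Keeping the user-agent pool to one operating system avoids a mismatch between the claimed platform and what the browser otherwise reports.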

Multilogin is not an option for me since it is expensive.
My current guesses are that I'm being blocked because:
- my proxy has a bad IP pool
- the site detects that I'm using Linux x86_64 (as shown on deviceinfo.me)
- WebGL and canvas are not spoofed

I'm working on a team right now. From reading some posts here, I got the impression that we shouldn't be using Linux against a site that has a bot detector, or am I wrong? All our scrapers are built on this stack. Should I invest my time in Puppeteer instead? What do you think?

be9hop · Apr 27, 2021

Have you tried other proxies? I've noticed I hit a lot more captchas when I'm using subpar proxies. Nothing against proxymesh as I have no experience with them.

itsbrnbs · Jun 22, 2021

What about Kameleo?

b2btradeworld · Jun 22, 2021

Do you use chrome?

tirycm · Jun 28, 2021

Try undetected-chromedriver.
It is a selenium.webdriver.Chrome replacement with compatibility for Brave and other Chromium-based browsers. It is not triggered by Cloudflare, Imperva, hCaptcha and such. NOTE: results may vary due to many factors. No guarantees are given, except for ongoing efforts in understanding detection algorithms.
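
A minimal sketch of the drop-in usage, assuming the package is installed with pip install undetected-chromedriver and a local Chrome is available. The open_page helper is a hypothetical name for illustration.

```python
def open_page(url):
    """Load a URL with undetected-chromedriver and return the page title."""
    # Imported lazily; needs undetected-chromedriver and a local Chrome install.
    import undetected_chromedriver as uc

    options = uc.ChromeOptions()
    driver = uc.Chrome(options=options)  # used like selenium.webdriver.Chrome
    try:
        driver.get(url)
        return driver.title
    finally:
        driver.quit()
```

Calling open_page("https://example.com") should return that page's title if the driver starts. As the note above says, results may vary from site to site.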

devgrizzly · Jun 28, 2021

Use Multilogin.

Or use Puppeteer. Selenium is old.

The John Bull · Jun 28, 2021

There was a post on Hacker News today about this. The comments are well worth reading:

https://news.ycombinator.com/item?id=27648719
The article they're referring to is a blog post on Multilogin.

The first thing I'd do is work out which anti-bot technology the site you're working on is using, and then start googling how to mitigate it.
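
One rough way to do that first step is to look at the cookie and response header names the site sets. The sketch below assumes the marker names commonly reported for Incapsula/Imperva and Cloudflare. The table is illustrative rather than exhaustive, and guess_vendors is a hypothetical helper.

```python
# Commonly reported markers; treat these as assumptions to verify per site.
VENDOR_MARKERS = {
    "Incapsula/Imperva": {
        "cookies": ("visid_incap_", "incap_ses_"),
        "headers": ("x-iinfo",),
    },
    "Cloudflare": {
        "cookies": ("__cf_bm", "cf_clearance"),
        "headers": ("cf-ray",),
    },
}


def guess_vendors(header_names, cookie_names):
    """Return vendors whose known header or cookie markers appear in the input."""
    headers = {h.lower() for h in header_names}
    cookies = [c.lower() for c in cookie_names]
    found = []
    for vendor, markers in VENDOR_MARKERS.items():
        header_hit = any(h in headers for h in markers["headers"])
        cookie_hit = any(
            c.startswith(prefix) for c in cookies for prefix in markers["cookies"]
        )
        if header_hit or cookie_hit:
            found.append(vendor)
    return found
```

With Selenium, the cookie names can be collected with [c["name"] for c in driver.get_cookies()]. Response header names have to come from elsewhere, for example the network tab of the browser's developer tools.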

Code Docta · Jul 22, 2021

I second undetected-chromedriver. I'm not sure how to stop WebRTC leaks with it yet, though; I may be missing something. Please advise if anyone knows. Thanks.

daibu · Jul 27, 2021

I don't know all the specifics of why, but it's pretty common knowledge that the best way to be stealthy is to use Puppeteer or Playwright with a detection evasion library. There are apparently some trivial ways to detect whether a user is running Selenium. Puppeteer uses the Chrome DevTools Protocol (CDP), which basically sends input events to the Chrome browser in much the same way your mouse and keyboard do.
