
[GUIDE] How Ad Networks (and everyone else) Knows You're A Bot

Discussion in 'Making Money' started by AdvertisingGuy, May 9, 2019.

  1. redarrow

    redarrow Elite Member

    Joined:
    Apr 1, 2013
    Messages:
    10,753
    Likes Received:
    3,143
    Surely good bot designers have added this already, and more...
     
  2. AdvertisingGuy

    AdvertisingGuy Junior Member

    Joined:
    May 8, 2019
    Messages:
    143
    Likes Received:
    145
    Maybe they have after this guide!! :D But there are still more ways to catch things.

    For real though, this is not common knowledge. The companies that detect bots online have sold for billions of dollars combined. This isn't easy stuff.
     
  3. redarrow

    redarrow Elite Member

    Joined:
    Apr 1, 2013
    Messages:
    10,753
    Likes Received:
    3,143
    Will it help to have it added for making accounts as well?

    I use portable Firefox.
     
  4. naskootbg

    naskootbg Senior Member

    Joined:
    Nov 8, 2010
    Messages:
    824
    Likes Received:
    266
    Hm, I'm not sure that an actual click is required. Possibly navigating to the page and then setting a fake referrer will not be caught.

    Selenium can click anywhere, because it can also execute JavaScript. The links you call unclickable are in iframes, and that is why Selenium is not clicking them: the iframe must have focus first (as with the "I'm not a robot" checkbox). That only makes the links harder to click. Even in the "I'm not a robot" example, where the checkbox is placed in a random number of iframes, it can still be clicked with a "for loop" + "try/catch" or with "if" + "goto".
    Another case where Selenium does not click a link is when the link/button has a layer on top of it and the programmer sets the XPath/selector on the layer instead of on the actual link.
     
  5. AdvertisingGuy

    AdvertisingGuy Junior Member

    Joined:
    May 8, 2019
    Messages:
    143
    Likes Received:
    145
    I guess you can set a fake referrer but again, the referrer has to be legitimate. If you put in a fake Outbrain referrer URL, it is really easy to spot.

    You sound like you know a decent amount about programming!
     
  6. SWExplorer

    SWExplorer Newbie

    Joined:
    Aug 20, 2018
    Messages:
    30
    Likes Received:
    6
    Very interesting, bookmarked, and thanks for your contribution!!
     
  7. naskootbg

    naskootbg Senior Member

    Joined:
    Nov 8, 2010
    Messages:
    824
    Likes Received:
    266
    No, I learned from BHW threads, did a few tests, and read a few Stack Overflow Q&As.

    Actually, I'm not even sure Selenium does a real click on the links, but Selenium can send Ctrl+Shift+J/K and then write JavaScript in the console.
    Selenium is still not made for cheating, and questions like how to set browser cookies may get downvoted.

    Finding cheaters is much easier than that; the problem is that ad networks profit from them and do not really care. Even AdWords, which should be the most protected, allows cheaters: I'm losing money, the cheaters get banned, and my money goes to Google. Fact!

    Edit: I don't mean bot clicks on AdWords. I mean competitors with a few employees who click on ads every day using a VPN.

    Edit: if you really want to explain how engines find out you are a bot, you need to start from the HTTP requests. Like server-->browser-->request-->server

    1. The server sends a response to the browser: HTML and JavaScript are most common (also XML, JSON, images, ASP)
    2. The browser parses and displays the server's response
    3. The user sends requests back to the server via the browser
    4. The server responds to those requests, again in the same common formats

    Step 3 is what @AdvertisingGuy writes about and does not cover much :).

    Now ignore the browser, like bots do. It becomes:
    1. The server sends a response to the bot
    2. The bot parses it and sends a request back to the server, built from the user's settings
    3. The server returns a new response
    4. The bot parses the response and displays it somehow
    Each HTTP exchange carries a few major fields: cookies, referrer, user agent, response code, and method (POST/GET).
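
    To make those fields concrete, here is a minimal Python sketch of how a server might pull them out of a raw request and notice which ones are absent. The function names and the rule of "flag whatever is missing" are assumptions for illustration, not how any particular ad network works. Note that the HTTP header itself is spelled "Referer".

```python
# Hypothetical sketch: extract the fields listed above from a raw HTTP request
# and report which commonly sent headers are missing. The choice of headers to
# expect is an assumption, not a documented rule of any ad network.

def parse_request(raw: str) -> dict:
    """Split a raw HTTP/1.1 request into method, path, and a header dict."""
    head = raw.split("\r\n\r\n", 1)[0]          # drop the body, if any
    lines = head.split("\r\n")
    method, path, _version = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return {"method": method, "path": path, "headers": headers}


def missing_fields(request: dict) -> list:
    """Return the expected header names that are absent or empty."""
    expected = ("user-agent", "referer", "cookie")
    return [name for name in expected if not request["headers"].get(name)]


browser_like = (
    "GET /landing HTTP/1.1\r\n"
    "Host: example.com\r\n"
    "User-Agent: Mozilla/5.0 (X11; Linux x86_64)\r\n"
    "Referer: https://example.com/article\r\n"
    "Cookie: session=abc123\r\n"
    "\r\n"
)
bare_script = "GET /landing HTTP/1.1\r\nHost: example.com\r\n\r\n"

print(missing_fields(parse_request(browser_like)))  # []
print(missing_fields(parse_request(bare_script)))   # ['user-agent', 'referer', 'cookie']
```

    A missing header on its own proves little; a real system would presumably combine it with many other signals.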

    Bots act in step 2: this is where the cheater hits the server. The bot needs to send the server a lot of data to look like a real human when the site is well protected. For example, when a pixel is loaded through JavaScript on the browser side, the bot must handle that request and the many related ones, and return to the server with the correct tokens. That is not hard if you have good programs and enough practice. I know how, but I can't and have no interest in doing this.
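
    The "correct tokens" idea can be sketched from the server's side. This is only one plausible design, assuming an HMAC-signed token tied to a session; the names SECRET_KEY, issue_token, and verify_token are hypothetical, and a real system would likely add expiry and key rotation.

```python
import hashlib
import hmac
import secrets

# Assumption: a single in-memory secret key, generated at startup.
SECRET_KEY = secrets.token_bytes(32)


def issue_token(session_id: str) -> str:
    """Create the token the page's JavaScript must attach to the pixel request."""
    nonce = secrets.token_hex(8)
    message = f"{session_id}:{nonce}".encode()
    signature = hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()
    return f"{nonce}.{signature}"


def verify_token(session_id: str, token: str) -> bool:
    """Check that a pixel request carries a token issued for this session."""
    nonce, sep, signature = token.partition(".")
    if not sep:
        return False  # not in the nonce.signature shape at all
    message = f"{session_id}:{nonce}".encode()
    expected = hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()
    # Constant-time comparison on bytes
    return hmac.compare_digest(expected.encode(), signature.encode())


token = issue_token("session-1")
print(verify_token("session-1", token))      # True
print(verify_token("session-2", token))      # False
print(verify_token("session-1", "made-up"))  # False
```

    A client that never ran the page's JavaScript has no valid token to send back, so its pixel request fails verification.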

    Hope this point helps.
    Questions?
     
    Last edited: May 30, 2019
  8. AdvertisingGuy

    AdvertisingGuy Junior Member

    Joined:
    May 8, 2019
    Messages:
    143
    Likes Received:
    145
    It's really interesting. I don't think you contradicted anything that I wrote in my posts. Ultimately, the ad networks will identify cookies, referrers, user agents, etc., and if you're just running a bot without any of those details, you'll get ID'ed. The graphics card is key here as well: if you're not loading anything on the screen, the graphics card will not be identified.

    Thanks for the thoughtful response too.
     
  9. vitacplus1

    vitacplus1 Newbie

    Joined:
    Aug 11, 2015
    Messages:
    14
    Likes Received:
    1
    Has anyone thought of building a real machine that automates clicking around with a real mouse?
     
  10. AdvertisingGuy

    AdvertisingGuy Junior Member

    Joined:
    May 8, 2019
    Messages:
    143
    Likes Received:
    145
    Even if you did that, you would/could still get flagged with all of the other bot flags I mentioned.
     
  11. Jibri Wright

    Jibri Wright Junior Member

    Joined:
    Jul 17, 2019
    Messages:
    114
    Likes Received:
    37
    Gender:
    Male
    Hey thanks for all of this, you've been a huge help! I was wondering if you know this for a fact? Do they send back empty arrays like this '[ ]' ?
     
  12. Fupim643

    Fupim643 Newbie

    Joined:
    Aug 17, 2019
    Messages:
    4
    Likes Received:
    0
    You have my recognition because of this post
     
  13. AdvertisingGuy

    AdvertisingGuy Junior Member

    Joined:
    May 8, 2019
    Messages:
    143
    Likes Received:
    145
    Really good question. In my experience, iOS mobile devices using Mobile Safari don't allow JavaScript tracking at all. So the average response should be Null / blank.

    For Android phones using Chrome, navigator.plugins is blank (or just an empty array, basically as you described: []), but webrtc.mediadevices (the list that navigator.mediaDevices.enumerateDevices() returns) will have something like the below:

    Code:
    [{"deviceId":"default","groupId":"bc3c746706f4d88fa2896ee6510d4dc80997db51f9c1b65b87c3c30d73649216","kind":"audioinput","label":null},{"deviceId":"a5efbe00de506b87c9f68e8ef8dd163a7bf663f191a12eaa89d68e1869d3dbd6","groupId":"ef215003f6ed2f411b4ed5eb5891837b303038fd02acaa0e732a3ac65e4065d5","kind":"audioinput","label":null},{"deviceId":"97c229f3643c14fa0198135c7d1e262b22133590681910f84b7287917ad99e96","groupId":"db11fdc36eae35daba5ae8601ff08e973d3c93a45f76f4393c948f8e1166e363","kind":"audioinput","label":null},{"deviceId":"356249f81ae40f812edcbd40af078eef29648574eac6627dccaf125bfdae20db","groupId":"36a75c81ed8cef07a3f359cb852d8caffeda215ffe51ba024175614bdd645a40","kind":"videoinput","label":null},{"deviceId":"a5efbe00de506b87c9f68e8ef8dd163a7bf663f191a12eaa89d68e1869d3dbd6","groupId":"1fd6f8324360db6bbc830e4e780b5f4c69cea81a008df4e06d3340606737b6f3","kind":"videoinput","label":null},{"deviceId":"default","groupId":"default","kind":"audiooutput","label":null}]
    
    Mobile Android Chrome users will also pass on things like fonts, screen size, etc. So you're pretty clear there.
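
    As a rough illustration of what can be read from a device list like the one above, here is a small Python sketch that counts entries by "kind". The sample is a shortened, made-up list in the same shape, and count_device_kinds is a hypothetical helper; what a tracker actually does with these counts is an assumption on my part.

```python
import json
from collections import Counter

# Shortened, made-up sample in the same shape as the list above.
sample = '''[
  {"deviceId": "default", "groupId": "g1", "kind": "audioinput", "label": null},
  {"deviceId": "d2", "groupId": "g2", "kind": "videoinput", "label": null},
  {"deviceId": "default", "groupId": "default", "kind": "audiooutput", "label": null}
]'''


def count_device_kinds(devices_json: str) -> dict:
    """Count reported media devices per kind (audioinput, videoinput, ...)."""
    devices = json.loads(devices_json)
    counts = Counter(device.get("kind", "unknown") for device in devices)
    return dict(counts)


print(count_device_kinds(sample))  # {'audioinput': 1, 'videoinput': 1, 'audiooutput': 1}
print(count_device_kinds("[]"))    # {}
```

    An empty result is only one signal and is not conclusive by itself, since some real browsers also report nothing.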
     
  14. Andrew2019

    Andrew2019 Jr. VIP Jr. VIP

    Joined:
    Jan 26, 2019
    Messages:
    109
    Likes Received:
    28
    Gender:
    Male
    Location:
    Ukraine
    You got my recognition because of this post.
     
  15. Chatoyant

    Chatoyant Newbie

    Joined:
    Aug 27, 2019
    Messages:
    2
    Likes Received:
    1
    Occupation:
    Loser
    Location:
    NA
    This is all very interesting, and I really appreciate the work you've put into making these posts. You have listed detection methods here that I had never heard of.

    I actually just made a post in the scripting forums about how I'd been working on a Reddit account generation bot whose accounts have been getting hammered with shadow bans. I've been trying to find out why it has been happening to my accounts.

    I built it using Puppeteer, with modifications from the community that were supposed to allow me to perform the functions unnoticed. I got the account creator working and was cranking out about 400 accounts a day, but all accounts were being shadow banned after 24-48 hours. I haven't got a clue how they are identifying my account groups. Well, I do have some ideas, but I don't know how plausible they are.

    A question though. Do you think it is possible that my account creator could be getting identified from the pattern of accounts being created? They are all being made with different IP addresses, but they are all cellular IPs from the same state in the US. I'm using my cell phone as a rotating cellular proxy.

    After reading your posts though, I'm seeing that there are quite a few other things that it could be as well :(

    Thanks for your contributions though! Forgive me if you already answered this elsewhere in the thread, but is there a framework that you suggest using to build bots? I saw a couple of suggestions from other people, but I would like to hear your suggestion if you have one.

    Another possible pitfall that I don't know if you planned to mention: I reckon that a lot of sites are also analyzing a visitor's cookies to ascertain whether they seem to be a normal user, based on the cookie types, ages, and contents. I assume that they can profile visitors this way as well, no? I was looking for a package on GitHub that allows you to spoof/fabricate cookies randomly so that the browser instance appears to have a history. I haven't really found anything, though I could be looking in the wrong place. Puppeteer already has the functionality to dump the browser cookies it obtains, but I need something that creates random 'identities' for each browser. Keeping cookies from bot activities is interesting though, and I think I'll be using it in my next project as well.

    I'm through screwing around with this Reddit bot for now; I've been messing with it for the last 2 weeks. The next step I would probably take is to use rotating private proxies to make sure that I'm not getting regionally recognized. I want to beat this thing though. I thought reCAPTCHA was the hard part for me, but this is quite a bit more complicated than I could have imagined.