Unknown robot eating up bandwidth

Discussion in 'Web Hosting' started by gangsta1, Oct 27, 2012.

  1. gangsta1

    gangsta1 Regular Member

    Joined:
    Oct 12, 2009
    Messages:
    229
    Likes Received:
    20
    In my AWStats report, almost 1 TB of bandwidth is being eaten up by:

    Unknown robot (identified by empty user agent string)

    How can I block this?
     
  2. kvmcable

    kvmcable Supreme Member

    Joined:
    Dec 28, 2010
    Messages:
    1,355
    Likes Received:
    2,815
    Occupation:
    24 year business owner - old school dude
    Location:
    KFC - BW3
    If the bot isn't identifiable or ignores your robots.txt file, block its IP or IP range.
     
  3. gangsta1

    gangsta1 Regular Member

    Joined:
    Oct 12, 2009
    Messages:
    229
    Likes Received:
    20
    The bot is unknown, so I cannot find its IP. I cannot ignore bots, as there are valid bots crawling the site. Is there a way to find the IP of that "unknown" bot?
     
  4. alternatesword

    alternatesword Jr. VIP

    Joined:
    Aug 25, 2012
    Messages:
    2,323
    Likes Received:
    484
    Location:
    scabbard
  5. gangsta1

    gangsta1 Regular Member

    Joined:
    Oct 12, 2009
    Messages:
    229
    Likes Received:
    20
    Unknown robot (identified by empty user agent string): 73,913+96 hits, 73.38 GB bandwidth
     
  6. TheViceroy

    TheViceroy BANNED

    Joined:
    Jul 3, 2012
    Messages:
    201
    Likes Received:
    39
    I have this issue on one of my proxy sites, and I've activated Cloudflare to stop it. So far, it's been surprisingly effective. One day I had 20,000 bogus robot requests, and almost all of those have been eliminated.
     
  7. gangsta1

    gangsta1 Regular Member

    Joined:
    Oct 12, 2009
    Messages:
    229
    Likes Received:
    20
    Which Cloudflare package do you use? Did it stop all the bogus bots?
     
  8. TheViceroy

    TheViceroy BANNED

    Joined:
    Jul 3, 2012
    Messages:
    201
    Likes Received:
    39
    I believe I'm just using the free package. It didn't cost me anything.
     
  9. gangsta1

    gangsta1 Regular Member

    Joined:
    Oct 12, 2009
    Messages:
    229
    Likes Received:
    20
    Cloudflare requires a DNS change, and I am not sure I want to do that.
    I was thinking it would be an .htaccess or robots.txt edit...

    Anyone?!
     
  10. cgimaster

    cgimaster Power Member

    Joined:
    Jun 30, 2012
    Messages:
    525
    Likes Received:
    311
    Gender:
    Male
    Use .htaccess and add a rule to block clients with an empty user agent string ;)


    Something like this (untested, may need some tweaks):
    Code:
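    RewriteEngine on
    # Custom bots to block by name (placeholder names, replace with your own)
    RewriteCond %{HTTP_USER_AGENT} ^ScumBot [OR]
    RewriteCond %{HTTP_USER_AGENT} ^AnotherIntruder [OR]
    # User agent made up of whitespace only
    RewriteCond %{HTTP_USER_AGENT} ^\s+$ [OR]
    # Empty or missing user agent
    RewriteCond %{HTTP_USER_AGENT} ^$
    # Answer matching requests with 403 Forbidden
    RewriteRule ^.* - [F]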
    The F flag stands for forbidden, so requests from those 4 different user agents get a 403 Forbidden response instead of the page (it is not a redirect).

    The 3rd condition matches user agents that consist only of spaces with no identification, and the last one matches empty user agents. The first 2 are for custom bots, if you want to block any.

    I'd also recommend checking your Apache access log to grab some of the IPs the spammer is using and see which network they come from. If they are dedicated datacenter IPs, you can block that datacenter's range to keep this from being scaled up further.
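
The access-log check suggested above can be sketched in Python. This is only a rough sketch under stated assumptions: it assumes the log uses Apache's combined format (user agent as the last quoted field), and it treats "-", an empty string, or whitespace as "no user agent". The function names empty_agent_totals and top_offenders are made up for this example, not part of any tool mentioned in the thread.

```python
import re
from collections import defaultdict

# Assumed layout (Apache combined log format):
# host ident user [time] "request" status bytes "referer" "user-agent"
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]*\] "(?:[^"\\]|\\.)*" '
    r'(?P<status>\d{3}) (?P<size>\d+|-) '
    r'"(?:[^"\\]|\\.)*" "(?P<agent>(?:[^"\\]|\\.)*)"\s*$'
)


def empty_agent_totals(lines):
    """Return {ip: [request_count, bytes_sent]} for requests without a user agent."""
    totals = defaultdict(lambda: [0, 0])
    for line in lines:
        match = LOG_LINE.match(line)
        if match is None:
            continue  # skip lines that do not look like combined format
        agent = match.group("agent").strip()
        if agent not in ("", "-"):
            continue  # this request identified itself, so ignore it
        size = match.group("size")
        entry = totals[match.group("ip")]
        entry[0] += 1
        entry[1] += 0 if size == "-" else int(size)
    return dict(totals)


def top_offenders(totals, limit=10):
    """Sort the per-IP totals by bytes sent, largest first."""
    return sorted(totals.items(), key=lambda item: item[1][1], reverse=True)[:limit]
```

To try it, pass any iterable of log lines, for example an open file handle on the access log, to empty_agent_totals and feed the result to top_offenders. The IPs at the top of that list are the candidates for a whois lookup and a block.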
     
    Last edited: Oct 27, 2012
  11. Scripteen

    Scripteen Elite Member

    Joined:
    Sep 19, 2009
    Messages:
    1,811
    Likes Received:
    1,918
    AWStats has a section that lists the top IPs by bandwidth consumed. Run a whois lookup on the top 3 and see which ones you want to block.

    You can also block them via .htaccess or PHP. Or better, redirect them somewhere they don't expect.
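
Before running whois on individual addresses, it can help to see whether the heavy hitters cluster in one network. Here is a small Python sketch, assuming IPv4 addresses and using a /24 purely as a guess at the grouping size (the real allocation comes from the whois record). The name group_by_network is hypothetical.

```python
import ipaddress
from collections import Counter


def group_by_network(ips, prefix=24):
    """Count how many of the given IPs fall into each /prefix network."""
    counts = Counter()
    for ip in ips:
        # strict=False lets us pass a host address and get its enclosing network
        network = ipaddress.ip_network(f"{ip}/{prefix}", strict=False)
        counts[str(network)] += 1
    return counts.most_common()
```

If many offending IPs land in the same network, that points to a single datacenter range, which is easier to block than a list of single addresses. If they are scattered across residential ISPs, a range block is likely to hit real visitors too.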
     
  12. gangsta1

    gangsta1 Regular Member

    Joined:
    Oct 12, 2009
    Messages:
    229
    Likes Received:
    20
    The unknown robot does not show an IP, but the top IPs belong to Comcast and other big cable companies?!