Up-to-date .htaccess code for blocking crawling websites

Discussion in 'Black Hat SEO' started by radichone, Sep 20, 2016.

  1. radichone

    radichone Registered Member

    Joined:
    Nov 8, 2015
    Messages:
    85
    Likes Received:
    2
    Gender:
    Male
    Location:
    Lebanon
    Hello guys,
    Does anyone have up-to-date .htaccess code for blocking the major crawling websites: Ahrefs, SEMrush, Moz...?
    Thank you
     
  2. CarlSagan

    CarlSagan Junior Member

    Joined:
    Jun 3, 2016
    Messages:
    117
    Likes Received:
    30
    Gender:
    Male
    This is the most recent one I've found; it should be good to use.


    # Block SEO crawlers by User-Agent. These rules sit outside the WordPress
    # markers because WordPress rewrites everything between them.
    # Patterns that contain spaces must be quoted, or Apache misparses the line.
    SetEnvIfNoCase User-Agent .*semrushbot.* bad_bot
    SetEnvIfNoCase User-Agent .*Semrush.* bad_bot
    SetEnvIfNoCase User-Agent .*rogerbot.* bad_bot
    SetEnvIfNoCase User-Agent .*exabot.* bad_bot
    SetEnvIfNoCase User-Agent .*mj12bot.* bad_bot
    SetEnvIfNoCase User-Agent .*dotbot.* bad_bot
    SetEnvIfNoCase User-Agent .*gigabot.* bad_bot
    SetEnvIfNoCase User-Agent .*ia_archiver.* bad_bot
    SetEnvIfNoCase User-Agent .*searchmetricsbot.* bad_bot
    SetEnvIfNoCase User-Agent .*seokicks-robot.* bad_bot
    SetEnvIfNoCase User-Agent .*sistrix.* bad_bot
    SetEnvIfNoCase User-Agent ".*lipperhey spider.*" bad_bot
    SetEnvIfNoCase User-Agent .*ncbot.* bad_bot
    SetEnvIfNoCase User-Agent .*backlinkcrawler.* bad_bot
    SetEnvIfNoCase User-Agent .*ahrefsbot.* bad_bot
    SetEnvIfNoCase User-Agent .*sitebot.* bad_bot
    SetEnvIfNoCase User-Agent .*aboundexbot.* bad_bot
    SetEnvIfNoCase User-Agent .*spbot.* bad_bot
    SetEnvIfNoCase User-Agent .*linkdexbot.* bad_bot
    SetEnvIfNoCase User-Agent .*nutch.* bad_bot
    SetEnvIfNoCase User-Agent .*majestic-12.* bad_bot
    SetEnvIfNoCase User-Agent .*majestic-seo.* bad_bot
    SetEnvIfNoCase User-Agent .*dsearch.* bad_bot
    SetEnvIfNoCase User-Agent .*archive.org_bot.* bad_bot
    SetEnvIfNoCase User-Agent .*meanpathbot.* bad_bot
    SetEnvIfNoCase User-Agent .*pagesinventory.* bad_bot
    SetEnvIfNoCase User-Agent .*BLEXbot.* bad_bot
    SetEnvIfNoCase User-Agent .*ezooms.* bad_bot
    SetEnvIfNoCase User-Agent .*scoutjet.* bad_bot
    SetEnvIfNoCase User-Agent .*blekkobo.* bad_bot
    SetEnvIfNoCase User-Agent ".*screaming frog seo spider/*.*" bad_bot
    # Apache 2.2 access syntax; Apache 2.4 needs mod_access_compat for it.
    <Limit GET POST HEAD>
    Order Allow,Deny
    Allow from all
    Deny from env=bad_bot
    </Limit>

    # BEGIN WordPress
    <IfModule mod_rewrite.c>
    RewriteEngine On
    RewriteBase /
    RewriteRule ^index\.php$ - [L]
    RewriteCond %{REQUEST_FILENAME} !-f
    RewriteCond %{REQUEST_FILENAME} !-d
    RewriteRule . /index.php [L]
    </IfModule>


    # END WordPress
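
    On Apache 2.4 the Order/Allow/Deny directives are deprecated, so a shorter variant of the same idea may be easier to maintain. The following is only a sketch, assuming mod_setenvif and mod_authz_core are loaded and that the host permits these directives in .htaccess (AllowOverride FileInfo AuthConfig). The three bot names are a sample taken from the list above, not a complete set.

    ```apache
    # Sketch for Apache 2.4+; keep it outside the "# BEGIN WordPress" /
    # "# END WordPress" markers so WordPress does not overwrite it.
    # The regex is unanchored, so no leading or trailing .* is needed.
    SetEnvIfNoCase User-Agent "(ahrefsbot|semrushbot|mj12bot)" bad_bot

    # Allow everyone except requests flagged as bad_bot (they get a 403).
    <RequireAll>
        Require all granted
        Require not env bad_bot
    </RequireAll>
    ```

    More names can be added to the alternation, separated by |. Like the longer list, this only catches crawlers that send their real User-Agent.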
     
  3. radichone

    radichone Registered Member

    Thank you!
     
  4. andrew1978

    andrew1978 Regular Member

    Joined:
    May 13, 2012
    Messages:
    441
    Likes Received:
    28
    Gender:
    Male
    Occupation:
    Internet Marketer
    Don't use .htaccess code; bots change their User-Agent all the time.
    If you are using WordPress, you should use Spider Spanker instead.