Up to date .htaccess code for blocking crawling websites

radichone

Hello guys,
Does anyone have up-to-date .htaccess code for blocking the major SEO crawlers: Ahrefs, Semrush, Moz...?
Thank you
 
This is the most recent one I've found; it should be good to use.


# Block SEO crawlers by User-Agent (case-insensitive match).
# These rules sit above the WordPress markers because WordPress may
# rewrite anything placed between "# BEGIN WordPress" and "# END WordPress".
SetEnvIfNoCase User-Agent .*semrushbot.* bad_bot
SetEnvIfNoCase User-Agent .*Semrush.* bad_bot
SetEnvIfNoCase User-Agent .*rogerbot.* bad_bot
SetEnvIfNoCase User-Agent .*exabot.* bad_bot
SetEnvIfNoCase User-Agent .*mj12bot.* bad_bot
SetEnvIfNoCase User-Agent .*dotbot.* bad_bot
SetEnvIfNoCase User-Agent .*gigabot.* bad_bot
SetEnvIfNoCase User-Agent .*ia_archiver.* bad_bot
SetEnvIfNoCase User-Agent .*searchmetricsbot.* bad_bot
SetEnvIfNoCase User-Agent .*seokicks-robot.* bad_bot
SetEnvIfNoCase User-Agent .*sistrix.* bad_bot
# Patterns containing spaces must be quoted, otherwise Apache splits them into separate arguments.
SetEnvIfNoCase User-Agent ".*lipperhey spider.*" bad_bot
SetEnvIfNoCase User-Agent .*ncbot.* bad_bot
SetEnvIfNoCase User-Agent .*backlinkcrawler.* bad_bot
SetEnvIfNoCase User-Agent .*ahrefsbot.* bad_bot
SetEnvIfNoCase User-Agent .*sitebot.* bad_bot
SetEnvIfNoCase User-Agent .*aboundexbot.* bad_bot
SetEnvIfNoCase User-Agent .*spbot.* bad_bot
SetEnvIfNoCase User-Agent .*linkdexbot.* bad_bot
SetEnvIfNoCase User-Agent .*nutch.* bad_bot
SetEnvIfNoCase User-Agent .*majestic-12.* bad_bot
SetEnvIfNoCase User-Agent .*majestic-seo.* bad_bot
SetEnvIfNoCase User-Agent .*dsearch.* bad_bot
SetEnvIfNoCase User-Agent .*archive.org_bot.* bad_bot
SetEnvIfNoCase User-Agent .*meanpathbot.* bad_bot
SetEnvIfNoCase User-Agent .*pagesinventory.* bad_bot
SetEnvIfNoCase User-Agent .*BLEXbot.* bad_bot
SetEnvIfNoCase User-Agent .*ezooms.* bad_bot
SetEnvIfNoCase User-Agent .*scoutjet.* bad_bot
SetEnvIfNoCase User-Agent .*blekkobo.* bad_bot
SetEnvIfNoCase User-Agent ".*screaming frog seo spider.*" bad_bot
# Deny any request flagged as bad_bot (Apache 2.2 syntax; on 2.4 it needs mod_access_compat).
<Limit GET POST HEAD>
Order Allow,Deny
Allow from all
Deny from env=bad_bot
</Limit>

# BEGIN WordPress
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteBase /
RewriteRule ^index\.php$ - [L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /index.php [L]
</IfModule>


# END WordPress
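
One note on the code above: Order/Allow/Deny is the older Apache 2.2 access syntax. If your host runs Apache 2.4 without mod_access_compat, something like the following should do the same job. This is only a sketch, and it assumes you keep all the SetEnvIfNoCase lines that set bad_bot and that your host allows authorization directives in .htaccess (AllowOverride AuthConfig).

```apache
# Apache 2.4+ (mod_authz_core): use this instead of the <Limit> ... </Limit> block.
# Allow everyone, except requests where SetEnvIfNoCase has set bad_bot.
<RequireAll>
    Require all granted
    Require not env bad_bot
</RequireAll>
```

Use one style or the other, not both; mixing the old and new access directives in the same file tends to give confusing results.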
 
Don't use .htaccess code for this.
Bots change their User-Agent all the time.
If you are using WP,
then you should use
Spider Spanker.
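
Since the User-Agent header can be changed at will, a User-Agent list only stops crawlers that identify themselves honestly. If you happen to know the address range a crawler comes from, blocking by IP does not depend on the header at all. A rough sketch for Apache 2.4 follows; the range shown is a documentation placeholder, not any real crawler's range, so swap in addresses you have verified yourself.

```apache
# Apache 2.4+: deny by source address instead of User-Agent.
# 203.0.113.0/24 is a reserved example range (placeholder, not a real bot).
<RequireAll>
    Require all granted
    Require not ip 203.0.113.0/24
</RequireAll>
```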
 