
[HELP] How to use htaccess the right way?

Discussion in 'Black Hat SEO' started by francis1017, Apr 8, 2015.

  1. francis1017

    francis1017 Elite Member

    Joined:
    Feb 26, 2013
    Messages:
    1,503
    Likes Received:
    347
    Hello,

    I have some code that I want to use in the .htaccess for my PBN.
    I got it from one of the threads here for blocking spiders.

    CODE 1:
    Code:
    SetEnvIfNoCase User-Agent .*rogerbot.* bad_bot
    SetEnvIfNoCase User-Agent .*exabot.* bad_bot
    SetEnvIfNoCase User-Agent .*mj12bot.* bad_bot
    SetEnvIfNoCase User-Agent .*dotbot.* bad_bot
    SetEnvIfNoCase User-Agent .*gigabot.* bad_bot
    SetEnvIfNoCase User-Agent .*ahrefsbot.* bad_bot
    SetEnvIfNoCase User-Agent .*sitebot.* bad_bot
    SetEnvIfNoCase User-Agent .*semrushbot.* bad_bot
    SetEnvIfNoCase User-Agent .*ia_archiver.* bad_bot
    SetEnvIfNoCase User-Agent .*searchmetricsbot.* bad_bot
    SetEnvIfNoCase User-Agent .*seokicks-robot.* bad_bot
    SetEnvIfNoCase User-Agent .*sistrix.* bad_bot
    SetEnvIfNoCase User-Agent ".*lipperhey spider.*" bad_bot
    SetEnvIfNoCase User-Agent .*ncbot.* bad_bot
    SetEnvIfNoCase User-Agent .*backlinkcrawler.* bad_bot
    SetEnvIfNoCase User-Agent .*archive\.org_bot.* bad_bot
    SetEnvIfNoCase User-Agent .*meanpathbot.* bad_bot
    SetEnvIfNoCase User-Agent .*pagesinventory.* bad_bot
    SetEnvIfNoCase User-Agent .*aboundexbot.* bad_bot
    SetEnvIfNoCase User-Agent .*spbot.* bad_bot
    SetEnvIfNoCase User-Agent .*linkdexbot.* bad_bot
    SetEnvIfNoCase User-Agent .*nutch.* bad_bot
    SetEnvIfNoCase User-Agent .*blexbot.* bad_bot
    SetEnvIfNoCase User-Agent .*ezooms.* bad_bot
    SetEnvIfNoCase User-Agent .*scoutjet.* bad_bot
    SetEnvIfNoCase User-Agent .*majestic-12.* bad_bot
    SetEnvIfNoCase User-Agent .*majestic-seo.* bad_bot
    SetEnvIfNoCase User-Agent .*dsearch.* bad_bot
    SetEnvIfNoCase User-Agent .*blekkobo.* bad_bot
    <Limit GET POST HEAD>
    Order Allow,Deny
    Allow from all
    Deny from env=bad_bot
    </Limit>
    This is the default code that I have in my .htaccess:

    CODE 2:
    Code:
    # BEGIN WordPress
    ErrorDocument 404 /index.php?error=404
    # END WordPress
    Do I have to copy CODE 1 and replace CODE 2 with it?
    I already tried this: some of my PBN sites return an Internal Server Error, but others don't.
    Please help. Thanks!
     
  2. xrfanatic

    xrfanatic Jr. VIP

    Joined:
    Aug 28, 2010
    Messages:
    419
    Likes Received:
    176
    Location:
    http://bit.ly/slb64
    Home Page:
    CODE 1 should go above CODE 2, imo.
     
  3. francis1017

    francis1017 Elite Member

    Joined:
    Feb 26, 2013
    Messages:
    1,503
    Likes Received:
    347
    Thanks! I am not getting internal server errors now. But how do I know if it is working?
     
  4. snarky

    snarky Junior Member

    Joined:
    Nov 21, 2009
    Messages:
    104
    Likes Received:
    58
    Monitor your logs. Time will tell if these bots are crawling your site.
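
One way to monitor the logs is to count, per bot, which status codes the server returned. The sketch below is only a rough illustration: it assumes the access log uses Apache's combined log format, and `BAD_BOTS`, `LINE_RE` and `bot_status_counts` are names made up for this example, not part of Apache or of the code in the first post.

```python
import re
from collections import Counter

# A few of the user-agent substrings from CODE 1 (extend as needed).
BAD_BOTS = ["rogerbot", "mj12bot", "dotbot", "ahrefsbot", "semrushbot", "blexbot"]

# Assumption: combined log format, where each line ends with
#   "request" status size "referer" "user-agent"
LINE_RE = re.compile(r'" (\d{3}) \S+ "[^"]*" "([^"]*)"\s*$')


def bot_status_counts(lines):
    """Count (bot, status) pairs for log lines whose user agent names a listed bot."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.search(line)
        if not match:
            continue  # line is not in the assumed format
        status, user_agent = match.group(1), match.group(2).lower()
        for bot in BAD_BOTS:
            if bot in user_agent:
                counts[(bot, status)] += 1
                break  # count each request once
    return counts
```

To use it, pass the open log file, for example `bot_status_counts(open("access.log"))`, where the path is whatever your host uses. If the block is working, the listed bots should show up with status 403 rather than 200.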
     
  5. xrfanatic

    xrfanatic Jr. VIP

    Joined:
    Aug 28, 2010
    Messages:
    419
    Likes Received:
    176
    Location:
    http://bit.ly/slb64
    Home Page:
    If you have software that can fake the referer, like XRumer, you can check whether the logs show the visits from specific domains.
     
  6. rogerke

    rogerke Regular Member

    Joined:
    Oct 5, 2014
    Messages:
    264
    Likes Received:
    145
    You can spoof user agents with Screaming Frog or a web based application like this site: http://wannabrowser.com/

    Just fill in a UA like ahrefsbot and see whether it shows a 403 status code. That code is solid though.
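
The same check can be scripted instead of using a web tool. This is a minimal sketch using only Python's standard library; `status_for_user_agent` is a name invented for this example, and the URL in the usage note is a placeholder for one of your own sites.

```python
import urllib.error
import urllib.request


def status_for_user_agent(url, user_agent, timeout=10):
    """Send a GET request with a spoofed User-Agent and return the HTTP status code."""
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # urllib raises on 4xx/5xx responses; the code is what we want to see.
        return error.code
```

For example, `status_for_user_agent("http://example.com/", "AhrefsBot")` should return 403 if the block is active on that site, while the same call with a normal browser user agent should return 200.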
     