Which do bots read first - the 301 redirect, or robots.txt? I use 301s a lot, and it would be rather convenient if I could stop Ahrefs et al. from picking up the 301 at the source by blocking them from the site. My #1 rankings with zero backlinks would then be a total mystery to my competitors.
Google first checks robots.txt and then spiders the page. At least that's what I see in my Apache logs.
Yeah, Google accesses robots.txt first, so if you want to block Ahrefs then go ahead and block them. Let me know if you have any success with it.
How about the Ahrefs and Majestic bots - presumably they behave the same way, checking robots.txt first? Does anyone have insight on this? I don't actually want to block Google, since that would destroy the 301; I just want to block the tools that my competitors might use...
LOL, no one is accessing your .htaccess. It is read automatically by your Apache server when a request is made to your website.
robots.txt comes first, then the 301. The robots.txt file defines the rules for the spider, whereas the 301 is the response (pointing to the new URL) that your server sends to Google when it requests a URL. @tmamedov .htaccess is a per-directory configuration file read by the Apache server itself, so it won't be accessed from anywhere else.
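To make the ordering concrete, here is a rough sketch of how a compliant crawler would decide whether to request a redirecting URL at all. It uses Python's standard `urllib.robotparser`. The robots.txt contents, the `example.com` URL, and the helper name `allowed_crawlers` are made up for illustration; `AhrefsBot` and `MJ12bot` are the user-agent tokens usually associated with Ahrefs and Majestic, but check each vendor's documentation before relying on them.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: disallow two SEO-tool crawlers, allow everyone else.
ROBOTS_TXT = """\
User-agent: AhrefsBot
Disallow: /

User-agent: MJ12bot
Disallow: /

User-agent: *
Disallow:
"""


def allowed_crawlers(robots_txt, url, agents):
    """Return {agent: True/False} for whether each agent may request url.

    This models only crawlers that honor robots.txt; it does not
    contact any server.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in agents}


if __name__ == "__main__":
    # Hypothetical URL that would answer with a 301 if it were requested.
    url = "https://example.com/old-page"
    result = allowed_crawlers(ROBOTS_TXT, url, ["Googlebot", "AhrefsBot", "MJ12bot"])
    for agent, ok in result.items():
        print(agent, "may request the URL" if ok else "is disallowed")
```

A crawler that gets `False` here never sends the request for the page, so it never receives the 301 response. Keep in mind that robots.txt is advisory: a bot that ignores it will still be served the redirect unless the server blocks it some other way.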