.htaccess issue

Discussion in 'White Hat SEO' started by the_demon, Jan 13, 2015.

  1. the_demon

    the_demon Jr. Executive VIP

    Joined:
    Nov 23, 2008
    Messages:
    3,225
    Likes Received:
    1,594
    Occupation:
    Search Engine Marketing
    Location:
    The Internet
    Hi guys,

    I have a WordPress site where Google's attempt to fetch robots.txt fails: Google reports a 302 redirect to the home page. Oddly enough, it works fine with the www version, but not with the bare http://example.com.

    I tried adding an exclusion to the .htaccess file, but my change made the URL fail with this error: [an error occurred while processing this directive]. The line I added is marked with a comment in the code below. I also tried 301-redirecting the robots.txt file to the www version of the site, and that didn't work either.

    If someone could help me find a working solution, that would be much appreciated.

    Code:
    RewriteEngine On
    # Line I added:
    RewriteRule ^robots.txt$ - [L]
    RewriteBase /
    RewriteRule ^index\.php$ - [L]
    
    # add a trailing slash to /wp-admin
    RewriteRule ^([_0-9a-zA-Z-]+/)?wp-admin$ $1wp-admin/ [R=301,L]
    
    RewriteCond %{REQUEST_FILENAME} -f [OR]
    RewriteCond %{REQUEST_FILENAME} -d
    RewriteRule ^ - [L]
    RewriteRule ^([_0-9a-zA-Z-]+/)?(wp-(content|admin|includes).*) $2 [L]
    RewriteRule ^([_0-9a-zA-Z-]+/)?(.*\.php)$ $2 [L]
    RewriteRule . index.php [L]
    
     
  2. Pornguy

    Pornguy Regular Member

    Joined:
    Nov 29, 2012
    Messages:
    320
    Likes Received:
    107
    Home Page:
    Do you see the robots.txt file in your root directory?
     
  3. the_demon

    Yes, the robots.txt is in the root directory.
     
  4. nikiobicata

    nikiobicata Regular Member

    Joined:
    Mar 4, 2011
    Messages:
    453
    Likes Received:
    376
    Occupation:
    IT Director
    Location:
    New York
    Home Page:
    So you want a separate robots.txt file for each version of the domain, one for www and one for the bare domain?

    Try this code:

    RewriteCond %{HTTP_HOST} ^.*example\.(.*)$
    RewriteRule ^robots\.txt$ /robots-%1\.txt [L]

    Check Google Webmaster Tools for the 302 redirect. Sometimes we forget things :)
     
    Last edited: Jan 13, 2015
  5. the_demon

    No, I don't want two separate robots.txt files. The problem is that when robots.txt is requested without the www in the URL, the server answers with a 302 redirect to the home page. So http://example.com/robots.txt redirects to http://www.example.com ... I didn't set up any 302 redirects (that I know of), so I don't know why this is happening.
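
    For reference, here is a minimal sketch of one rule shape that would produce exactly this symptom. It is an assumption for illustration only, not something confirmed to be in this site's configuration: an R flag with no status code issues a 302, and a fixed target with no back-reference discards the requested path.

```apache
# Hypothetical rule, NOT taken from the poster's site: shown only to
# illustrate how a host redirect can drop the path and return a 302.
RewriteEngine On
RewriteCond %{HTTP_HOST} ^example\.com [NC]
# "R" without a status code defaults to 302, and the fixed target URL
# ignores the requested path, so /robots.txt lands on the home page.
RewriteRule ^ http://www.example.com/ [R,L]
```

    A redirect like this could also live in the hosting control panel or the server-level config rather than in .htaccess, which would be one explanation for not finding any such rule in the file posted above.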
     
  6. nikiobicata

    You should hide your robots.txt file from everyone except Google, Yahoo and Bing. This is the best security practice. Here is a guide on how to do it:

    1. Put this code into your .htaccess file

    Code:
    RewriteEngine On
    
    RewriteCond %{http_user_agent} !(googlebot|Msnbot|Slurp) [NC]
    RewriteRule ^robots\.txt$ http://www.yoursait.com/  [R,NE,L]
    AddHandler application/x-httpd-php .txt
    2. Create a file named smartass.php (or whatever name you like) and paste this code into it

    Code:
    <?php

    $ua = $_SERVER['HTTP_USER_AGENT'];
    if (stristr($ua, 'msnbot') || stristr($ua, 'Googlebot') || stristr($ua, 'Yahoo Slurp')) {
        $ip = $_SERVER['REMOTE_ADDR'];
        $hostname = gethostbyaddr($ip);
        if (!preg_match("/\.googlebot\.com$/", $hostname) && !preg_match("/search\.live\.com$/", $hostname) && !preg_match("/crawl\.yahoo\.net$/", $hostname)) {
            $block = TRUE;
            $URL = "/";
            header("Location: $URL");
            exit;
        } else {
            $real_ip = gethostbyname($hostname);
            if ($ip != $real_ip) {
                $block = TRUE;
                $URL = "/";
                header("Location: $URL");
                exit;
            } else {
                $block = FALSE;
            }
        }
    }
    ?>
    
    
    Upload the file to your root directory (the same location as the .htaccess file).

    3. Open robots.txt and paste this line of code

    Code:
    <?php include("smartass.php"); ?>
    
     
    
    You are done! Now if anyone other than Google, Yahoo or Bing requests your robots.txt file, they will be redirected to your homepage :)

    Always hide your robots.txt file! You don't want some smart-ass kids messing with your website and trying to hack you :)
     
    Last edited: Jan 14, 2015
  7. ttrox

    ttrox Regular Member

    Joined:
    Jun 28, 2013
    Messages:
    217
    Likes Received:
    76
    Does your website always redirect from the non-www version to the www one? That is, does http://example.com/whatever redirect you to http://www.example.com?

    If it does, it is clearly redirecting even before getting to those directives. What I mean is, when you had the "[an error occurred while processing this directive]" what was your URL? http://example.com or http://www.example.com?

    If it was http://www.example.com then you might be pointing the www subdomain somewhere else instead of having it as an alias for the domain, as an error on directives should be the first error you see on the screen before anything else.

    I can't tell much more with this amount of info, but try out what I said to understand what's causing the problem.
     
  8. the_demon

    Yes, my website redirects the non-www version to the www version. However, I've noticed that when you visit any page and then remove the www, it redirects to the home page rather than to that page. So this problem extends even past the robots.txt issue.

    *UPDATE*
    I was able to solve my problem. Apparently I needed to add this code to the top of my .htaccess file. I still don't know why the robots.txt file was 302ing to the home page, but regardless, this fix seems to have worked.

    Code:
    RewriteCond %{HTTP_HOST} ^mydomain\.com [NC]
    RewriteRule ^(.*)$ http://www.mydomain.com/$1 [L,R=301]
    
    @MODS: Please close this thread as the issue has been resolved.
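
    Putting the pieces together, here is a minimal sketch of how the top of the .htaccess file might look with the fix in place. The ordering is the point: mod_rewrite evaluates rules in the order they appear, so the host redirect has to come before the WordPress rules. mydomain.com is the placeholder from the post; the escaped dot in the host pattern and the trimmed-down WordPress block are assumptions for illustration, not the poster's exact file.

```apache
RewriteEngine On
RewriteBase /

# Canonical host: send bare-domain requests to the www host.
# $1 carries the requested path, so /robots.txt stays /robots.txt.
RewriteCond %{HTTP_HOST} ^mydomain\.com [NC]
RewriteRule ^(.*)$ http://www.mydomain.com/$1 [L,R=301]

# WordPress rules follow (abbreviated here; assumed unchanged).
RewriteRule ^index\.php$ - [L]
# Serve real files and directories directly, robots.txt included.
RewriteCond %{REQUEST_FILENAME} -f [OR]
RewriteCond %{REQUEST_FILENAME} -d
RewriteRule ^ - [L]
# Everything else goes to the WordPress front controller.
RewriteRule . index.php [L]
```

    With rules in this order, a request for http://mydomain.com/robots.txt should receive a 301 to http://www.mydomain.com/robots.txt rather than to the home page, provided no server-level redirect fires before .htaccess is read.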
     
    Last edited: Jan 14, 2015