problems with robots.txt

Discussion in 'White Hat SEO' started by loginname, Nov 5, 2009.

  1. loginname

    loginname Regular Member

    Joined:
    Oct 1, 2008
    Messages:
    405
    Likes Received:
    14
    Hey

    I uploaded a robots.txt file which contains these settings:
    User-agent: *
    Disallow: /css/

    Google Webmaster Tools translates that into:
    ?User-agent: *
    Disallow: /css/

    PROBLEMS (as reported from Google Webmaster Tools):
    ?User-agent: * - unknown syntax
    Disallow: /css/ - no user agent specified

    Is there something wrong with my syntax? I thought the syntax was okay, but I'm not sure.
     
  2. n2zen

    n2zen Regular Member

    Joined:
    Sep 27, 2009
    Messages:
    269
    Likes Received:
    70
    I know this isn't an answer to the question, but it may be a solution. Why are you disallowing your CSS?

    If it's to prevent users from browsing the css folder, then why don't you just turn off directory listings in Apache:
    Options -Indexes

    I see nothing wrong with the two lines you've presented. They should work.

    However, the symptom you mention (a ? inserted at the beginning) generally relates to ANSI-formatted text being served from a UTF-8 server. I've seen a few posts about this issue, but not the solution. Note: I only create robots.txt with UltraEdit and haven't experienced any such issue.
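
A stray character in front of the first line is often a byte order mark (BOM) written by the editor. The following is a minimal sketch, not anything Google provides, for checking the raw bytes of a file for one. The function name `describe_leading_bytes` and the sample data are made up for illustration.

```python
import codecs


def describe_leading_bytes(data: bytes) -> str:
    """Report whether raw file bytes start with a byte order mark."""
    if data.startswith(codecs.BOM_UTF8):
        return "UTF-8 BOM"
    if data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return "UTF-16 BOM"
    return "no BOM"


# Illustrative inputs: the same two rules, with and without a UTF-8 BOM.
rules = b"User-agent: *\nDisallow: /css/\n"
with_bom = codecs.BOM_UTF8 + rules

print(describe_leading_bytes(rules))
print(describe_leading_bytes(with_bom))
```

To check a real file, read it in binary mode, for example `open("robots.txt", "rb").read()`, and pass the bytes to the function. Reading in text mode may hide or mangle the first bytes.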
     
  3. loginname

    <meta http-equiv="Content-Type" content="text/html;charset=utf-8" />

    So yes, it's the utf-8 charset. I've uploaded a new robots.txt file with utf-8 formatting.

    But Google Webmaster Tools still reports the old errors. It also reports that the robots.txt was downloaded 5 hours ago, although I uploaded the new robots.txt only 5 minutes ago. I don't see any button or link in Webmaster Tools that says "reload robots.txt", so I don't know. I hope Google will go on indexing my site and use my new robots.txt file.
     
  4. dheer

    dheer Jr. VIP Premium Member

    Joined:
    Jul 24, 2009
    Messages:
    2,566
    Likes Received:
    1,040
    Well, if your site doesn't have any private content, then you don't need a robots.txt at all. Just leave it out and let Google crawl and index all your pages.
     
    • Thanks x 1
  5. origin

    origin Regular Member

    Joined:
    Nov 11, 2008
    Messages:
    333
    Likes Received:
    91
    Google will reread robots.txt every time it crawls your site. Make sure that your robots.txt is a plain text file, not in utf-8 format. Your bot syntax is correct.
     
  6. loginname

    If not utf-8, which format should I choose then? I think I've tried ANSI, and that didn't help either.
     
  7. origin

    If you are using Windows, use Notepad.
     
  8. youngguy

    youngguy BANNED

    Joined:
    Apr 11, 2009
    Messages:
    1,055
    Likes Received:
    1,560
    @origin: exactly the answer! Delete your current robots.txt and make it again with Notepad. Some editors add a UTF BOM header to the file; that's why Google translates it to ?user-agent:...
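
To see why a BOM could produce both reported errors at once, here is a rough sketch using Python's standard `urllib.robotparser` as a stand-in. It is not Google's parser, so it only illustrates the mechanism. The helper name `css_blocked` and the path `/css/style.css` are made up for the example.

```python
from urllib.robotparser import RobotFileParser


def css_blocked(robots_text: str) -> bool:
    """Return True if the given rules block a file under /css/ for any crawler."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return not parser.can_fetch("*", "/css/style.css")


clean = "User-agent: *\nDisallow: /css/\n"
# What a parser sees when the BOM is decoded but not stripped.
with_bom = "\ufeff" + clean

print(css_blocked(clean))
print(css_blocked(with_bom))
```

With the clean text the folder is reported as blocked. With the BOM in front, the first line no longer reads as `User-agent`, so the `Disallow` line has no user agent to attach to and is ignored. That matches the pair of messages in the first post.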
     
  9. loginname

    I suppose I have to create the file from scratch; copy and paste will not do. I tried to copy and paste into Notepad, but that didn't help (maybe the text format is transferred as well).
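
A BOM is normally written when the editor saves the file, not carried along by copy and paste, so retyping the rules may not be needed. As a sketch under that assumption, the bytes can be removed from an existing file directly. The function name `strip_utf8_bom` and the temporary demo file are illustrative only.

```python
import codecs
import os
import tempfile


def strip_utf8_bom(path: str) -> bool:
    """Remove a leading UTF-8 BOM from the file. Return True if one was removed."""
    with open(path, "rb") as f:
        data = f.read()
    if not data.startswith(codecs.BOM_UTF8):
        return False
    with open(path, "wb") as f:
        f.write(data[len(codecs.BOM_UTF8):])
    return True


# Demo on a throwaway file; "utf-8-sig" makes Python write a BOM on purpose.
demo_path = os.path.join(tempfile.mkdtemp(), "robots.txt")
with open(demo_path, "w", encoding="utf-8-sig", newline="\n") as f:
    f.write("User-agent: *\nDisallow: /css/\n")

removed = strip_utf8_bom(demo_path)
with open(demo_path, "rb") as f:
    cleaned = f.read()
print(removed, cleaned)
```

Run against a local copy of robots.txt, then upload the result. The two rules here are pure ASCII, so the cleaned file is identical whether it is treated as ANSI or as UTF-8 without a BOM.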

    By the way, about 4 hours ago I deleted the robots.txt file from the site, but Google Webmaster Tools still complains about the ?user-agent syntax :(

    cheers