problems with robots.txt

Discussion in 'White Hat SEO' started by loginname, Nov 5, 2009.

  1. loginname

    loginname Regular Member

    Joined:
    Oct 1, 2008
    Messages:
    371
    Likes Received:
    13
    Hey

    I uploaded a robots.txt file which contains these settings:
    User-agent: *
    Disallow: /css/

    Google Webmaster Tools translates that into:
    ?User-agent: *
    Disallow: /css/

    PROBLEMS (as reported from Google Webmaster Tools):
    ?User-agent: * - unknown syntax
    Disallow: /css/ - no user agent specified

    Is there something wrong with my syntax? I thought it was okay, but I'm not sure.
     
  2. n2zen

    n2zen Regular Member

    Joined:
    Sep 27, 2009
    Messages:
    269
    Likes Received:
    70
    I know this isn't an answer to the question, but it may be a solution. Why are you disallowing your CSS?

    If it's to prevent users from browsing the css folder, then why don't you just use this Apache directive (for example in .htaccess):
    Options All -Indexes

    I see nothing wrong with the two lines you've presented. They should work.

    However, the symptom you mention (where a ? is inserted at the beginning) generally relates to ANSI-formatted text being served from a UTF-8 server. I've seen a few posts about this issue but not the solution. Note: I only create robots.txt with UltraEdit and haven't experienced any such issue.
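    A leading "?" like this is often an invisible marker at the very start of the file, so one way to check is to look at the raw bytes. The sketch below is only an illustration: the helper name `starts_with_utf8_bom` and the sample byte strings are made up for this example, and it assumes the marker in question is the UTF-8 byte order mark (bytes EF BB BF).

```python
import codecs

def starts_with_utf8_bom(data: bytes) -> bool:
    """Return True if the raw bytes begin with the UTF-8 byte order mark."""
    return data.startswith(codecs.BOM_UTF8)  # codecs.BOM_UTF8 == b'\xef\xbb\xbf'

# Hypothetical file contents: the same two rules, saved with and without a BOM.
without_bom = b"User-agent: *\nDisallow: /css/\n"
with_bom = codecs.BOM_UTF8 + without_bom

print(starts_with_utf8_bom(with_bom))     # True
print(starts_with_utf8_bom(without_bom))  # False

# Decoded as UTF-8, the BOM turns into the invisible character U+FEFF,
# which a tool may display as "?" in front of "User-agent".
first_line = with_bom.decode("utf-8").splitlines()[0]
print(repr(first_line))
```

    To test a real file, read it in binary mode (`open(path, "rb").read()`) and pass the bytes to the helper.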
     
  3. loginname

    loginname Regular Member

    Joined:
    Oct 1, 2008
    Messages:
    371
    Likes Received:
    13
    <meta http-equiv="Content-Type" content="text/html;charset=utf-8" />

    So yes, it's the UTF-8 charset. I've uploaded a new robots.txt file saved as UTF-8.

    But Google Webmaster Tools still reports the old errors. It also says the robots.txt was downloaded 5 hours ago, but I uploaded the new file only 5 minutes ago. I don't see any button or link in Webmaster Tools that says "reload robots.txt", so I don't know. I hope Google will get around to crawling my site again and using my new robots.txt file.
     
  4. dheer

    dheer Jr. VIP Premium Member

    Joined:
    Jul 24, 2009
    Messages:
    2,443
    Likes Received:
    1,029
    Well, if your site doesn't have any private content, then you don't need a robots.txt at all. Just skip it and let Google crawl and index all your pages.
     
  5. origin

    origin Regular Member

    Joined:
    Nov 11, 2008
    Messages:
    334
    Likes Received:
    90
    Google rereads robots.txt when it crawls your site, although it may cache the file for up to a day. Make sure that your robots.txt is a plain text file, not in UTF-8 format. Your bot syntax is correct.
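    The point that the two lines themselves are valid can be checked offline. The sketch below uses Python's standard `urllib.robotparser`, which is not Google's parser, so it only suggests how a parser might react. The helper `css_allowed`, the path `/css/style.css`, and the idea that the uploaded file begins with an invisible U+FEFF character are all assumptions for illustration.

```python
from urllib.robotparser import RobotFileParser

def css_allowed(robots_text: str) -> bool:
    """Parse robots.txt text and report whether /css/style.css may be fetched."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return parser.can_fetch("*", "/css/style.css")

rules = "User-agent: *\nDisallow: /css/\n"

# Clean text: the rule is understood, so the CSS path is blocked.
print(css_allowed(rules))             # False

# Same text with a leading U+FEFF: the first line is no longer recognised as
# "User-agent", the Disallow line has no group to attach to, and nothing is blocked.
print(css_allowed("\ufeff" + rules))  # True
```

    That outcome matches the two reported problems: an unrecognised first line, then a Disallow line with no user agent in front of it.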
     
  6. loginname

    loginname Regular Member

    Joined:
    Oct 1, 2008
    Messages:
    371
    Likes Received:
    13
    If not UTF-8, which format should I choose then? I think I've tried ANSI and that didn't help either.
     
  7. origin

    origin Regular Member

    Joined:
    Nov 11, 2008
    Messages:
    334
    Likes Received:
    90
    If you are using Windows, use Notepad.
     
  8. youngguy

    youngguy Senior Member

    Joined:
    Apr 11, 2009
    Messages:
    1,053
    Likes Received:
    1,560
    Location:
    Hell
    @origin: exactly the right answer! Delete your current robots.txt and create it again with Notepad. Some editors add a UTF-8 BOM (byte order mark) at the start of your file, and that's why Google translates it to ?User-agent:...
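    As an alternative to recreating the file by hand, the BOM can be removed from the existing file. This is a minimal sketch, assuming the only problem is a leading UTF-8 BOM. The function name `strip_utf8_bom` is made up for this example, and the demo writes to a temporary directory rather than a real site.

```python
import codecs
import os
import tempfile
from pathlib import Path

def strip_utf8_bom(path: str) -> bool:
    """Remove a leading UTF-8 BOM from the file at `path`.

    Returns True if a BOM was found and removed, False if the file was left alone.
    """
    p = Path(path)
    data = p.read_bytes()
    if not data.startswith(codecs.BOM_UTF8):
        return False
    p.write_bytes(data[len(codecs.BOM_UTF8):])
    return True

# Demo on a throwaway file that imitates a robots.txt saved with a BOM.
with tempfile.TemporaryDirectory() as tmp:
    demo = os.path.join(tmp, "robots.txt")
    Path(demo).write_bytes(codecs.BOM_UTF8 + b"User-agent: *\nDisallow: /css/\n")
    removed = strip_utf8_bom(demo)
    cleaned = Path(demo).read_bytes()

print(removed)  # True
print(cleaned)  # b'User-agent: *\nDisallow: /css/\n'
```

    After cleaning, upload the file again. The old errors may stay visible until the file is fetched again.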
     
  9. loginname

    loginname Regular Member

    Joined:
    Oct 1, 2008
    Messages:
    371
    Likes Received:
    13
    I suppose I have to create the file from scratch; copy and paste will not do. I tried to copy and paste into Notepad but that didn't help (maybe the text format gets transferred too)...

    By the way, about 4 hours ago I deleted the robots.txt file from the site, but Google Webmaster Tools still complains about the ?User-agent syntax :(

    cheers