
robots txt question

Discussion in 'Blogging' started by crysis1, Apr 26, 2009.

  1. crysis1

    crysis1 Junior Member

    Joined:
    Apr 3, 2009
    Messages:
    100
    Likes Received:
    44
    How does this look for a robots.txt?


    Sitemap: /sitemap.xml

    User-agent: *
    Disallow: /wp-content/cache/
    Disallow: /wp-content/themes/
    Disallow: /wp-content/plugins/
    Disallow: /wp-admin/
    Disallow: /wp-includes/
    Disallow: /wp-login.php


    # disallow all files in these directories
    Disallow: /cgi-bin/
    Disallow: /z/j/
    Disallow: /z/c/
    Disallow: /stats/
    Disallow: /dh_
    Disallow: /wp-admin/
    Disallow: /wp-includes/
    Disallow: /contact/
    Disallow: /tag/
    Disallow: /wp-content/b
    Disallow: /wp-content/p
    Disallow: /wp-content/themes/askapache/4
    Disallow: /wp-content/themes/askapache/c
    Disallow: /wp-content/themes/askapache/d
    Disallow: /wp-content/themes/askapache/f
    Disallow: /wp-content/themes/askapache/h
    Disallow: /wp-content/themes/askapache/in
    Disallow: /wp-content/themes/askapache/p
    Disallow: /wp-content/themes/askapache/s
    Disallow: /trackback/
    Disallow: /*?*
    Disallow: */trackback/

    User-agent: Googlebot
    # disallow all files ending with these extensions
    Disallow: /*.php$
    Disallow: /*.js$
    Disallow: /*.inc$
    Disallow: /*.css$
    Disallow: /*.gz$
    Disallow: /*.cgi$
    Disallow: /*.wmv$
    Disallow: /*.png$
    Disallow: /*.gif$
    Disallow: /*.jpg$
    Disallow: /*.cgi$
    Disallow: /*.xhtml$
    Disallow: /*.php*
    Disallow: */trackback*
    Disallow: /*?*
    Disallow: /z/
    Disallow: /wp-*
    Allow: /wp-content/uploads/

    # allow google image bot to search all images
    User-agent: Googlebot-Image
    Allow: /*

    # allow adsense bot on entire site
    User-agent: Mediapartners-Google*
    Disallow: /*?*
    Allow: /z/
    Allow: /about/
    Allow: /contact/
    Allow: /wp-content/
    Allow: /tag/
    Allow: /manual/*
    Allow: /docs/*
    Allow: /*.php$
    Allow: /*.js$
    Allow: /*.inc$
    Allow: /*.css$
    Allow: /*.gz$
    Allow: /*.cgi$
    Allow: /*.wmv$
    Allow: /*.cgi$
    Allow: /*.xhtml$
    Allow: /*.php*
    Allow: /*.gif$
    Allow: /*.jpg$
    Allow: /*.png$

    # disallow archiving site
    User-agent: ia_archiver
    Disallow: /

    # disable duggmirror
    User-agent: duggmirror
    Disallow: /
     
  2. crysis1

    crysis1 Junior Member

    Joined:
    Apr 3, 2009
    Messages:
    100
    Likes Received:
    44
    Bump, lol. Anybody?
     
  3. neo

    neo Power Member

    Joined:
    May 5, 2007
    Messages:
    500
    Likes Received:
    365
    If you put Disallow: /wp-content/themes/ then it will stop the spiders from crawling the entire themes directory. That means you don't have to enter
    Disallow: /wp-content/themes/askapache/4 etc. again.
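
    To illustrate the prefix matching described here, a minimal sketch using Python's standard urllib.robotparser follows. It assumes a crawler that matches Disallow paths as plain prefixes, which is what this parser does (it does not interpret the * and $ wildcards used elsewhere in the posted file). The agent name "ExampleBot" and the uploads path are hypothetical, chosen only for the demonstration.

```python
from urllib.robotparser import RobotFileParser

# A single directory-level rule, taken from the robots.txt posted above.
RULES = """\
User-agent: *
Disallow: /wp-content/themes/
"""


def can_crawl(path: str, agent: str = "ExampleBot") -> bool:
    """Return True if `agent` may fetch `path` under RULES.

    "ExampleBot" is a made-up agent name; it falls under the `*` group.
    """
    parser = RobotFileParser()
    parser.parse(RULES.splitlines())
    return parser.can_fetch(agent, path)


if __name__ == "__main__":
    # The first two paths come from the posted file; the last is hypothetical.
    for path in (
        "/wp-content/themes/askapache/4",
        "/wp-content/themes/askapache/in",
        "/wp-content/uploads/pic.png",
    ):
        print(path, "->", "allowed" if can_crawl(path) else "blocked")
```

    Under that assumption, the two askapache paths are already blocked by the directory rule alone, so the longer per-path lines add nothing for a prefix-matching crawler.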
     
  4. keinehabe

    keinehabe Supreme Member

    Joined:
    Nov 4, 2008
    Messages:
    1,207
    Likes Received:
    472
    Gender:
    Male
    Occupation:
    -= CEO =-
    Location:
    Heaven
    Since when do robots crawl the CSS and JS files? And what's the point of disallowing pages by extension, when the WordPress .htaccess file is so nicely made and produces clean URLs for the spiders?