
robots.txt

Discussion in 'White Hat SEO' started by ashee1, Jun 16, 2013.

  1. ashee1

    ashee1 Newbie

    Joined:
    Jun 11, 2012
    Messages:
    8
    Likes Received:
    0
    What is meant by robots.txt optimization? Can anyone explain it in simple words?
    Thanks

     
  2. scorpion king

    scorpion king Senior Member

    Joined:
    May 2, 2010
    Messages:
    1,157
    Likes Received:
    2,393
    Occupation:
    Entrepreneur
    Location:
    irc.blackhatworld.com
    robots.txt is used to tell search engine spiders not to crawl certain pages. If a spider crawls a page, it can end up displayed in search results for the general public, so people use robots.txt to block personalized or private content.
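    For instance, a minimal robots.txt that keeps spiders out of a private directory could look like this (the /private/ path is just an illustration):

    Code:
    User-agent: *         # applies to every crawler
    Disallow: /private/   # don't crawl anything under /private/
    
    Keep in mind robots.txt only asks well-behaved crawlers to stay away; it doesn't actually protect the content.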
     
  3. echizen

    echizen Newbie

    Joined:
    May 26, 2012
    Messages:
    45
    Likes Received:
    13
    Below is an example of an optimized robots.txt for my WordPress site (just my own perspective :D)

    Any suggestions or better advice, let me know.
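    A sketch of what such a file commonly contains (these are the standard WordPress paths; example.com is a placeholder, and the exact rules are just one common setup):

    Code:
    User-agent: *
    Disallow: /wp-admin/              # keep spiders out of the admin area
    Disallow: /wp-includes/           # core files, no value in search results
    Disallow: /wp-content/plugins/    # plugin files
    Disallow: /?s=                    # internal search result pages
    
    Sitemap: http://example.com/sitemap.xml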
     
  4. innozemec

    innozemec Jr. VIP Jr. VIP

    Joined:
    Aug 19, 2011
    Messages:
    5,290
    Likes Received:
    1,799
    Location:
    www.Indexification.com
    I never use robots.txt. On new sites I create a blank one just to avoid all those 404 errors being logged in my Apache logs.
     
  5. YenSync

    YenSync Newbie

    Joined:
    Nov 23, 2011
    Messages:
    40
    Likes Received:
    116
    Occupation:
    too busy to be looking for it
    Location:
    Spain
    You can do some nifty things with robots.txt.
    One of my sites has a lot of dynamic pages that are updated frequently.
    This all goes automatically. It's a mash-up of different APIs.
    Thousands of URLs, fresh content, the spiders go crawling like crazy.
    Result? API rate limits exceeded.
    (Yes, even with cached results. These damn spiders sometimes!)
    So I added this to robots.txt:
    Code:
    User-agent: *
    Request-rate: 1/3 # Maximum request rate: 1 page every 3 seconds (20/minute)
    Crawl-delay: 3 # Used by several bots (20/minute)
    
    Problem solved.
    I also disallow a few bots (see the example below).
    But be careful with robots.txt.
    Especially if you don't know what you are doing.
    Before you know it, you disallow every spider.
    And you probably don't want that :)
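    For reference, disallowing individual bots looks like this (AhrefsBot and MJ12bot are just common examples, not necessarily the ones I block):

    Code:
    # Block two specific crawlers entirely (example bot names)
    User-agent: AhrefsBot
    Disallow: /
    
    User-agent: MJ12bot
    Disallow: /
    
    # Everyone else may crawl everything
    User-agent: *
    Disallow:
    
    Notice how close that is to blocking everyone: a single Disallow: / under User-agent: * would shut out every spider.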