
Question from noob about robots !

Discussion in 'Black Hat SEO' started by HappyS, Jan 13, 2014.

  1. HappyS

    I have a CPA website with fake crack stuff, and I am confused about the robots.txt file. Should I use one, and what should I put in it?
    Any tips are welcome.
     
  2. Tealover

    Basically, if you want search engines to index your whole site, you may leave robots.txt empty.
    But remember that most CMSs duplicate your posts under extra paths such as /author, /tag or feed URLs, and you will usually want to keep crawlers away from those.
    For example, for a WordPress blog I would recommend something like this:
    User-agent: *                     # the rules below apply to every crawler
    Allow: /wp-content/uploads/       # let crawlers fetch the wp-content/uploads folder
    Disallow: /wp-login.php           # login page
    Disallow: /wp-register.php        # registration page
    Disallow: /xmlrpc.php             # technical file
    Disallow: /template.html          # template file
    Disallow: /cgi-bin                # technical folder
    Disallow: /wp-admin               # admin area
    Disallow: /wp-includes            # technical folder
    Disallow: /wp-content/plugins     # technical folder
    Disallow: /wp-content/cache       # technical folder
    Disallow: /wp-content/themes      # technical folder
    Disallow: */trackback             # duplicate content
    Disallow: */feed                  # duplicate content
    Disallow: */comments              # duplicate content
    Disallow: */comment-page*         # duplicate content
    Disallow: */replytocom=           # duplicate content
    Disallow: /author*                # duplicate content
    Disallow: */?author=*             # duplicate content
    Disallow: */tag                   # duplicate content
    Disallow: /?feed=                 # feed of your blog
    Disallow: /?s=                    # search results
    Disallow: /?se=                   # search results

    Host: site.ru                          # main mirror of your site (Yandex-specific directive)
    Sitemap: http://site.ru/sitemap.xml    # link to your sitemap.xml
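
If you want to check rules like these before uploading the file, here is a minimal sketch using Python's standard urllib.robotparser module. It only covers a handful of the plain prefix rules from the list above, because as far as I know the standard-library parser does not understand the `*` wildcard inside paths. The helper names (`build_parser`, `is_allowed`) and the test URLs on site.ru are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# A subset of the rules above: prefix rules only, no wildcards.
ROBOTS_TXT = """\
User-agent: *
Allow: /wp-content/uploads/   # comments after '#' are ignored by the parser
Disallow: /wp-login.php
Disallow: /wp-admin
Disallow: /wp-includes
Disallow: /wp-content/plugins
"""


def build_parser(text):
    """Parse robots.txt text (hypothetical helper, not part of the stdlib)."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)


def is_allowed(url, agent="*"):
    """Return True if the given user agent may crawl the URL."""
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    # Example URLs are assumptions; replace them with pages from your own site.
    for url in (
        "http://site.ru/my-post/",
        "http://site.ru/wp-admin/",
        "http://site.ru/wp-login.php",
        "http://site.ru/wp-content/uploads/pic.png",
    ):
        print(url, "->", "allowed" if is_allowed(url) else "blocked")
```

A URL that matches no rule counts as allowed, which is why an empty robots.txt lets crawlers fetch everything.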
     