How do I stop bots going to my site?

Discussion in 'Black Hat SEO' started by Julien1, Apr 11, 2011.

  1. Julien1

    Julien1 Junior Member

    Joined:
    May 28, 2009
    Messages:
    158
    Likes Received:
    3
    Occupation:
    Marketing
    Location:
    USA
    I have a BH site and I don't want any crawlers coming to it. Can I just block the IPs of bots?
     
  2. Skygrinder

    Skygrinder Newbie

    Joined:
    Dec 10, 2009
    Messages:
    11
    Likes Received:
    1
    Occupation:
    Web Developer
    Location:
    Athens
    Google "robots.txt"
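
    For well-behaved crawlers, a robots.txt that disallows everything is the usual starting point. Here is a minimal sketch in Python, assuming a blanket "Disallow: /" rule. The rules text and the example.com URL are illustrative and not taken from this thread. It uses the standard library's urllib.robotparser to check what a compliant bot would be allowed to fetch:

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents: ask every crawler to stay out of the whole site.
BLOCK_ALL = """\
User-agent: *
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if a crawler that honours robots.txt may fetch the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # A compliant crawler is told to stay away from every path.
    print(is_allowed(BLOCK_ALL, "Googlebot", "http://example.com/page.html"))
```

    This only models bots that choose to obey the file; nothing in robots.txt is enforced by the server.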
     
  3. cashking009

    cashking009 Junior Member

    Joined:
    Jul 31, 2009
    Messages:
    188
    Likes Received:
    41
    Add this meta tag to your page:
    HTML:
    <html>
    <head>
    <title>...</title>
    <META NAME="ROBOTS" CONTENT="NOINDEX, NOFOLLOW">
    </head>
     
  4. sw1344

    sw1344 Newbie

    Joined:
    Nov 14, 2010
    Messages:
    22
    Likes Received:
    11
    Are you talking about the legitimate SEO bots like Googlebot, Yahoo Slurp, etc., or are you referring to scraper bots that are just scraping all your content?

    If the first, then the suggestions made above are great and will work.

    If the second, then robots.txt instructions are just going to get ignored.

    The best way would be to put some code on your site that checks the User-Agent header (yes, I know it's not perfect) and, if no user agent is specified, redirects the request somewhere else.

    I'm sure you can get rid of a lot of rubbish bots this way.
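
    A rough sketch of that check, written as a small Python WSGI wrapper. This assumes the site runs as a WSGI app; the name redirect_blank_user_agent and the example.com redirect target are made up for illustration. Requests with a missing or blank User-Agent get a 302, and everything else passes through untouched:

```python
from typing import Callable, Iterable

# Hypothetical destination for requests that send no User-Agent header.
REDIRECT_TARGET = "https://example.com/"


def redirect_blank_user_agent(app: Callable, target: str = REDIRECT_TARGET) -> Callable:
    """Wrap a WSGI app so requests with a missing or blank User-Agent get a 302."""

    def middleware(environ: dict, start_response: Callable) -> Iterable[bytes]:
        # WSGI exposes the User-Agent request header under this key.
        user_agent = environ.get("HTTP_USER_AGENT", "").strip()
        if not user_agent:
            start_response("302 Found", [("Location", target), ("Content-Length", "0")])
            return [b""]
        # Any non-empty User-Agent is passed through to the real application.
        return app(environ, start_response)

    return middleware
```

    As said, it's not perfect: any bot that sends a browser-like User-Agent string goes straight through.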

    Hope this helps
     
  5. MrBlue

    MrBlue Senior Member

    Joined:
    Dec 18, 2009
    Messages:
    950
    Likes Received:
    662
    Occupation:
    Web/Bot Developer
    robots.txt is easy to circumvent: a bot can simply ignore it.

    You would need to add something similar to your .htaccess file:
    Code:
    RewriteEngine on
    RewriteCond %{HTTP_USER_AGENT} ^.*(AdsBot-Google|ia_archiver|Scooter|Ask.Jeeves|Baiduspider|Exabot|FAST.Enterprise.Crawler|FAST-WebCrawler|www\.neomo\.de|Gigabot|Mediapartners-Google|Google.Desktop|Feedfetcher-Google|Googlebot|heise-IT-Markt-Crawler|heritrix|ibm.com/cs/crawler|ICCrawler|ichiro|MJ12bot|MetagerBot|msnbot-NewsBlogs|msnbot|msnbot-media|NG-Search|lucene.apache.org|NutchCVS|OmniExplorer_Bot|online.link.validator|psbot0|Seekbot|Sensis.Web.Crawler|SEO.search.Crawler|Seoma.\[SEO.Crawler\]|SEOsearch|Snappy|www.urltrends.com|www.tkl.iis.u-tokyo.ac.jp/~crawler|SynooBot|crawleradmin.t-info@telekom.de|TurnitinBot|voyager|W3.SiteSearch.Crawler|W3C-checklink|W3C_Validator|www.WISEnutbot.com|yacybot|Yahoo-MMCrawler|Yahoo\!.DE.Slurp|Yahoo\!.Slurp|YahooSeeker).* [NC]
    RewriteRule .* - [F]
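
    Before deploying a rule like this, it can help to check which User-Agent strings the pattern catches. Here is a quick sketch in Python using a shortened, illustrative subset of the names above. Python's re module is not the same engine Apache uses, so treat this as an approximation that should hold for plain alternations like these; is_blocked is a made-up helper name:

```python
import re

# Shortened, illustrative subset of the crawler names from the rule above.
BOT_PATTERN = re.compile(
    r"Googlebot|msnbot|Yahoo!.Slurp|Baiduspider|MJ12bot|ia_archiver",
    re.IGNORECASE,  # plays the role of the [NC] flag
)


def is_blocked(user_agent: str) -> bool:
    """Return True if the User-Agent contains one of the listed crawler names."""
    # search() matches anywhere in the string, like the ^.*( ... ).* wrapper.
    return BOT_PATTERN.search(user_agent) is not None


if __name__ == "__main__":
    for ua in (
        "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)",
        "Mozilla/5.0 (Windows NT 6.1; rv:2.0) Gecko/20100101 Firefox/4.0",
    ):
        print(is_blocked(ua), ua)
```

    Keep in mind that this only matches on what the client claims to be; a scraper that fakes a browser User-Agent will not be caught.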
     