quote I liked about unauthorized bots

Discussion in 'Black Hat SEO' started by mainceaft, May 25, 2014.

    Hi,
    This guy seems fed up with all the bots crawling his site, so he put this message in his robots.txt:

    #
    # Please, we do NOT allow non authorized robots any longer.
    #
    #Copyright owners have the legal right under the DMCA to reserve the right to view
    # content only to website visitors. Webmasters have the legal right under DMCA to
    # block access to anyone who wants to store or copy website content. It is also a
    # crime under US law to use any trick or false information to gain access to a
    # computer system. Running a robot that pretends to be a user by faking its
    # user agent is crime under US Law because it is using false information to gain
    # access to a computer system.
    followed by a long list of banned user agents:
    Code:
    User-agent: Nutch
    Disallow: /
    
    User-agent: NASA Search
    Disallow: /
    
    
    User-agent: Check&Get
    Disallow: /
    
    User-agent: AnsearchBot
    Disallow: /
    
    User-agent: edgeio-retriever
    Disallow: /
    
    User-agent: HTTP::Lite
    Disallow: /
    
    User-agent: yacybot
    Disallow: /
    
    User-agent: TrackBack/1.02
    Disallow: /
    
    User-agent: MJ12bot/v1.2.0 
    Disallow: /
    
    User-agent: MJ12bot
    Disallow: /
    
    User-agent: OutfoxMelonBot
    Disallow: /
    
    User-agent: wwwster/1.4
    Disallow: /
    
    User-agent: Girafabot
    Disallow: /
    
    User-agent: voyager/1.0
    Disallow: /
    
    User-agent: SumeetBot 
    Disallow: /
    
    User-agent: ISC Systems iRc Search 2.1
    Disallow: /
    
    User-agent: OpenDNS Domain Crawler
    Disallow: /
    
    User-agent: Entrieva/1.0
    Disallow: /
    
    User-agent: Entrieva
    Disallow: /
    
    User-agent: MVAClient 
    Disallow: /
    
    User-agent: DataCha0s
    Disallow: /
    
    User-agent: DataCha0s/2.0
    Disallow: /
    
    User-agent: larbin
    Disallow: /
    
    User-agent: larbin_2.6.3
    Disallow: /
    
    User-agent: Twiceler
    Disallow: /
    
    User-agent: e-SocietyRobot
    Disallow: /
    
    User-agent: NextGenSearchBot 1
    Disallow: /
    
    User-agent: NextGenSearchBot
    Disallow: /
    
    User-agent: findlinks
    Disallow: /
    
    User-agent: CazoodleBot
    Disallow: /
    
    User-agent: Furl Search 2.0
    Disallow: /
    
    User-agent: FurlBot
    Disallow: /
    
    User-agent: DepSpid
    Disallow: /
    
    User-agent: DepSpid/5.03
    Disallow: /
    
    User-agent: DataCha0s/2.0
    Disallow: /
    
    User-agent: OutfoxBot/0.5
    Disallow: /
    
    User-agent: IlseBot/1.0
    Disallow: /
    
    User-agent: ShopWiki/1.0
    Disallow: /
    
    User-agent: Filangy/1.01
    Disallow: /
    
    User-agent: Filangy
    Disallow: /
    
    User-agent: POE-Component-Client-HTTP/0.65
    Disallow: /
    
    User-agent: SurveyBot/2.3
    Disallow: /
    
    User-agent: ConveraCrawler/0.9d
    Disallow: /
    
    User-agent: nicebot
    Disallow: /
    
    User-agent: Yandex/1.01.001
    Disallow: /
    
    User-agent: msnbot-media/1.0
    Disallow: /
    
    User-agent: msnbot-Products/1.0
    Disallow: /
    
    User-agent: Factbot 1.09
    Disallow: /
    
    User-agent: AboutUsBot/0.9
    Disallow: /
    
    User-agent: AnsearchBot/1.0
    Disallow: /
     
    User-agent: EmailSiphon
    Disallow: /
    
    User-agent: noxtrumbot/1.0
    Disallow: /
    
    User-agent: ISC Systems iRc Search 2.1
    Disallow: /
    
    User-agent: ISC Systems
    Disallow: /
    
    User-agent: Mozilla/4.0
    Disallow: /
    
    User-agent: miniRank/2.0
    Disallow: /
    
    User-agent: Wells Search II
    Disallow: /
    
    User-agent: Exabot/3.0
    Disallow: /
    
    User-agent: asterias/2.0
    Disallow: /
    
    User-agent: asterias
    Disallow: /
    
    User-agent: EmailWolf
    Disallow: /
    
    User-agent: ExtractorPro
    Disallow: /
    
    User-agent: CherryPicker
    Disallow: /
    
    User-agent: NICErsPRO
    Disallow: /
    
    User-agent: Teleport
    Disallow: /
    
    User-agent: EmailCollector
    Disallow: /
    
    User-agent: LinkWalker
    Disallow: /
    
    User-agent: Zeus
    Disallow: /
    
    User-agent: URL_Spider_Pro
    Disallow: /
    
    User-agent: WebBandit
    Disallow: /
    
    User-agent: ExtractorPro
    Disallow: /
    
    User-agent: CopyRightCheck
    Disallow: /
    
    User-agent: Crescent
    Disallow: /
    
    User-agent: SiteSnagger
    Disallow: /
    
    User-agent: ProWebWalker
    Disallow: /
    
    User-agent: CheeseBot
    Disallow: /
    
    User-agent: LNSpiderguy
    Disallow: /
    
    User-agent: Black Hole
    Disallow: /
    
    User-agent: Titan
    Disallow: /
    
    User-agent: WebStripper
    Disallow: /
    
    User-agent: NetMechanic
    Disallow: /
    
    User-agent: TeleportPro
    Disallow: /
    
    User-agent: MIIxpc
    Disallow: /
    
    User-agent: Telesoft
    Disallow: /
    
    User-agent: Website Quester
    Disallow: /
    
    User-agent: WebZip
    Disallow: /
    
    User-agent: moget/2.1
    Disallow: /
    
    User-agent: WebZip/4.0
    Disallow: /
    
    User-agent: WebSauger
    Disallow: /
    
    User-agent: WebCopier
    Disallow: /
    
    User-agent: NetAnts
    Disallow: /
    
    User-agent: Mister PiX
    Disallow: /
    
    User-agent: WebAuto
    Disallow: /
    
    User-agent: TheNomad
    Disallow: /
    
    User-agent: WWW-Collector-E
    Disallow: /
    
    User-agent: libWeb/clsHTTP
    Disallow: /
    
    User-agent: httplib
    Disallow: /
    
    User-agent: turingos
    Disallow: /
    
    User-agent: spanner
    Disallow: /
    
    User-agent: InfoNaviRobot
    Disallow: /
    
    User-agent: Harvest/1.5
    Disallow: /
    
    User-agent: Bullseye/1.0
    Disallow: /
    
    User-agent: BullsEye
    Disallow: /
    
    User-agent: Crescent Internet ToolPak HTTP OLE Control v.1.0
    Disallow: /
    
    User-agent: CherryPickerSE/1.0
    Disallow: /
    
    User-agent: CherryPickerElite/1.0
    Disallow: /
    
    User-agent: WebBandit/3.50
    Disallow: /
    
    User-agent: NICErsPRO
    Disallow: /
    
    User-agent: Microsoft URL Control - 5.01.4511
    Disallow: /
    
    User-agent: DittoSpyder
    Disallow: /
    
    User-agent: Foobot
    Disallow: /
    
    User-agent: WebmasterWorldForumBot
    Disallow: /
    
    User-agent: SpankBot
    Disallow: /
    
    User-agent: BotALot
    Disallow: /
    
    User-agent: lwp-trivial/1.34
    Disallow: /
    
    User-agent: lwp-trivial
    Disallow: /
    
    User-agent: Wget/1.6
    Disallow: /
    
    User-agent: BunnySlippers
    Disallow: /
    
    User-agent: Microsoft URL Control - 6.00.8169
    Disallow: /
    
    User-agent: URLy Warning
    Disallow: /
    
    User-agent: Wget/1.5.3
    Disallow: /
    
    User-agent: cosmos
    Disallow: /
    
    User-agent: moget
    Disallow: /
    
    User-agent: hloader
    Disallow: /
    
    User-agent: humanlinks
    Disallow: /
    
    User-agent: LinkextractorPro
    Disallow: /
    
    User-agent: Offline Explorer
    Disallow: /
    
    User-agent: Mata Hari
    Disallow: /
    
    User-agent: LexiBot
    Disallow: /
    
    User-agent: Web Image Collector
    Disallow: /
    
    User-agent: The Intraformant
    Disallow: /
    
    User-agent: True_Robot/1.0
    Disallow: /
    
    User-agent: True_Robot
    Disallow: /
    
    User-agent: BlowFish/1.0
    Disallow: /
    
    User-agent: JennyBot
    Disallow: /
    
    User-agent: MIIxpc/4.2
    Disallow: /
    
    User-agent: BuiltBotTough
    Disallow: /
    
    User-agent: ProPowerBot/2.14
    Disallow: /
    
    User-agent: BackDoorBot/1.0
    Disallow: /
    
    User-agent: toCrawl/UrlDispatcher
    Disallow: /
    
    User-agent: WebEnhancer
    Disallow: /
    
    User-agent: TightTwatBot
    Disallow: /
    
    User-agent: suzuran
    Disallow: /
    
    User-agent: VCI WebViewer VCI WebViewer Win32
    Disallow: /
    
    User-agent: VCI
    Disallow: /
    
    User-agent: Szukacz/1.4
    Disallow: /
    
    User-agent: QueryN Metasearch
    Disallow: /
    
    User-agent: Openfind data gathere
    Disallow: /
    
    User-agent: Openfind
    Disallow: /
    
    User-agent: Xenu's Link Sleuth 1.1c
    Disallow: /
    
    User-agent: Xenu's
    Disallow: /
    
    User-agent: RepoMonkey Bait & Tackle/v1.01
    Disallow: /
    
    User-agent: RepoMonkey
    Disallow: /
    
    User-agent: Zeus 32297 Webster Pro V2.9 Win32
    Disallow: /
    
    User-agent: Webster Pro
    Disallow: /
    
    User-agent: EroCrawler
    Disallow: /
    
    User-agent: LinkScan/8.1a Unix
    Disallow: /
    
    User-agent: Keyword Density/0.9
    Disallow: /
    
    User-agent: Kenjin Spider
    Disallow: /
    
    User-agent: Cegbfeieh
    Disallow: /
    
    User-agent: WebReaper
    Disallow: /
    
    User-agent: BecomeBot/1.23
    Disallow: /
    
    User-agent: lwp-trivial/1.34
    Disallow: /
    
    User-agent: findlinks/0.89
    Disallow: /
    
    User-agent: psbot
    Disallow: /
    
    User-agent: psbot/0.1
    Disallow: /
    
    User-agent: wbdbot
    Disallow: /
    
    User-agent: WEP Search 00
    Disallow: /
    
    User-agent: Missigua Locator 1.9
    Disallow: /
    
    User-agent: River Valley Inc
    Disallow: /
    
    User-agent: f-bot test pilot
    Disallow: /
    
    User-agent: Program Shareware 1.0.0
    Disallow: /
    
    User-agent: Java/1.5.0
    Disallow: /
    
    User-agent: WebCopier v4.1
    Disallow: /
    
    User-agent: 8484 Boston Project v 1.0
    Disallow: /
    
    User-agent: Calif Univ Tools
    Disallow: /
    
    User-agent: Port Huron Labs
    Disallow: /
    
    User-agent: Green Research, Inc.
    Disallow: /
    
    User-agent: JoBo/1.3
    Disallow: /
    
    User-agent: Franklin Locator 1.8
    Disallow: /
    
    User-agent: Mozilla/3.0 (compatible; Indy Library)
    Disallow: /
    
    User-agent: Rvldsmcmuwduwxnltyrm x snl
    Disallow: /
    
    User-agent: gqbi hnxupsxgfgnX berXjteu
    Disallow: /
    
    User-agent: Scooter-3.0.FS
    Disallow: /
    
    User-agent: Java/1.4.2_04
    Disallow: /
    
    User-agent: Java1.3.1_08
    Disallow: /
    
    User-agent: Microsoft URL Control - 6.00.8862
    Disallow: /
    
    User-agent: christcrawler
    Disallow: /
    
    User-agent: BaySpider
    Disallow: /
    
    User-agent: NHSEWalker/3.0
    Disallow: /
    
    User-agent: ContextAd Bot 1.0
    Disallow: /
    
    User-agent: Openbot/3.0+
    Disallow: /
    
    User-agent: NPBot
    Disallow: /
    
    User-agent: Mediapartners-Google/2.1
    Disallow: /
    
    User-agent: NaverBot-1.0
    Disallow: /
    
    User-agent: WebCapture 3.0
    Disallow: /
    
    User-agent: MSProxy/2.0
    Disallow: /
    
    User-agent: Mozilla/PICgrabber
    Disallow: /
    
    User-agent: Zao-Crawler 0.1b
    Disallow: /
    
    User-agent: Wget
    Disallow: /
    
    User-agent: Wget/1.9+cvs-stable
    Disallow: /
    
    User-agent: ConveraCrawler/0.9d
    Disallow: /
    
    User-agent: nicebot
    Disallow: /
    
    User-agent: Yandex/1.01.001
    Disallow: /
    
    User-agent: Factbot 1.09
    Disallow: /
    
    User-agent: WISENutbot
    Disallow: /
    
    User-agent: MSNPTC/1.0
    Disallow: /
    
    User-agent: Nomad
    Disallow: /
    
    User-agent: EchO!
    Disallow: /
    
    User-agent: Walhello appie
    Disallow: /
    
    User-agent: Nomad
    Disallow: /
    
    User-agent: Scooter
    Disallow: /
    
    User-agent: NaverBot-1.0
    Disallow: /
    
    User-agent: NaverBot
    Disallow: /
    
    User-agent: Mozilla/4.0 (compatible; Cerberian Drtrs Version-3.1-Build-17)
    Disallow: /
    
    User-agent: Cerberian
    Disallow: /
    
    User-agent: ichiro/1.0 (ichiro@nttr.co.jp)
    Disallow: /
    
    User-agent: HTTrack
    Disallow: /
    
    User-agent: HTTrack 3.0x
    Disallow: /
    
    User-agent: Mozilla/4.0 (compatible; MSIE 5.0; Windows 95) VoilaBot BETA 1.2 (http://www.voila.com/)
    Disallow: /
    
    User-agent: CustomExchangeBrowser
    Disallow: /
    
    User-agent: VoilaBot
    Disallow: /
    
    User-agent: Voila
    Disallow: /
    
    User-agent: ichiro
    Disallow: /
    
    User-agent: Anonymization.Net Vivaldi
    Disallow: /
    
    User-agent: Vivaldi
    Disallow: /
    
    User-agent: Wysigot 5.3
    Disallow: /
    
    User-agent: Wysigot
    Disallow: /
    
    User-agent: Wysigot 5.5
    Disallow: /
    
    User-agent: Anonymization.Net
    Disallow: /
    
    User-agent: SiteSnagger
    Disallow: /
    
    User-agent: Wotbox
    Disallow: /
    
    User-agent: Wotbox/0.7
    Disallow: /
    
    User-agent: Wotbox/0.7-alpha
    Disallow: /
    
    User-agent: Googlebot-Image/1.0
    Disallow: /
    
    User-agent: Googlebot-Image
    Disallow: /
    
    User-agent: Microsoft Data Access Internet Publishing Provider Protocol Discovery
    Disallow: /
    
    User-agent: OmniExplorer_Bot/1.07
    Disallow: /
    
    User-agent: Mozilla/5.0 (compatible; +http://myhome.net)
    Disallow: /
    
    User-agent: Accoona-AI-Agent
    Disallow: /
    
    User-agent: Accoona-AI-Agent/1.1.2 (aicrawler at accoonabot dot com) 
    Disallow: /
    
    User-agent: mozilla/5.0 (compatible; heritrix/1.3.0 +http://crawler.archive.org)
    Disallow: /
    
    User-agent: sproose/1.0beta
    Disallow: /
    
    User-agent: sproose
    Disallow: /
    In robots.txt he could shorten all of this to
    User-agent: *
    Disallow: /
    and then explicitly allow the user agents he wants, such as Googlebot, Yahoo and Bing.
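    A rough sketch of that allow-list idea, assuming Googlebot, Bingbot and Slurp (Yahoo) are the crawlers to keep; those three tokens are my assumption, so swap in whichever ones you actually want:
    Code:
    # Crawlers that may fetch everything (an empty Disallow blocks nothing)
    User-agent: Googlebot
    User-agent: Bingbot
    User-agent: Slurp
    Disallow:
    
    # Every other bot that reads robots.txt is asked to stay out
    User-agent: *
    Disallow: /
    A crawler follows the group that names it most specifically, so the three listed bots ignore the * group. Like any robots.txt rule, this only works for bots that choose to obey the file.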

    I know the list is long, and most of those bots do not honor robots.txt anyway,
    so .htaccess is the better solution:

    Code:
    RewriteEngine On
    
    RewriteCond %{HTTP_USER_AGENT} "AhrefsBot" [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} "Ahrefs" [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} "rogerbot" [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} "MJ12bot" [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} "majestic12" [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} "MJ12" [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} "SiteBot" [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} "Semrush" [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} "SiteExplorer" [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} "SEOkicks-Robot" [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} "linkdexbot" [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} "BLEXbot" [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} "Blekkobot" [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} "exabot" [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} "dotbot" [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} "BCKLINKS" [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} "InternetSeer.com" [NC]
    # Return 403 Forbidden to any request whose User-Agent matched a condition above
    RewriteRule ^.* - [F,L]
    
    
    
    ^^^ The last block of code is mine. I don't think this will leave any footprint, since you have every right to ban unwanted bots from crawling your site.
     
    Last edited: May 25, 2014