Looking for some help: does anybody have up-to-date .htaccess code for blocking all the major site crawlers, like Ahrefs and Majestic? This would obviously be helpful for keeping competitors from digging into any pages you don't want to appear in your link profile.
User-agent: Rogerbot
User-agent: Exabot
User-agent: MJ12bot
User-agent: Dotbot
User-agent: Gigabot
User-agent: AhrefsBot
User-agent: BlackWidow
User-agent: Bot\ mailto:[email protected]
User-agent: ChinaClaw
User-agent: Custo
User-agent: DISCo
User-agent: Download\ Demon
User-agent: eCatch
User-agent: EirGrabber
User-agent: EmailSiphon
User-agent: EmailWolf
User-agent: Express\ WebPictures
User-agent: ExtractorPro
User-agent: EyeNetIE
User-agent: FlashGet
User-agent: GetRight
User-agent: GetWeb!
User-agent: Go!Zilla
User-agent: Go-Ahead-Got-It
User-agent: GrabNet
User-agent: Grafula
User-agent: HMView
User-agent: HTTrack
User-agent: Image\ Stripper
User-agent: Image\ Sucker
User-agent: Indy\ Library
User-agent: InterGET
User-agent: Internet\ Ninja
User-agent: JetCar
User-agent: JOC\ Web\ Spider
User-agent: larbin
User-agent: LeechFTP
User-agent: Mass\ Downloader
User-agent: MIDown\ tool
User-agent: Mister\ PiX
User-agent: Navroad
User-agent: NearSite
User-agent: NetAnts
User-agent: NetSpider
User-agent: Net\ Vampire
User-agent: NetZIP
User-agent: Octopus
User-agent: Offline\ Explorer
User-agent: Offline\ Navigator
User-agent: PageGrabber
User-agent: Papa\ Foto
User-agent: pavuk
User-agent: pcBrowser
User-agent: RealDownload
User-agent: ReGet
User-agent: SiteSnagger
User-agent: SmartDownload
User-agent: SuperBot
User-agent: SuperHTTP
User-agent: Surfbot
User-agent: tAkeOut
User-agent: Teleport\ Pro
User-agent: VoidEYE
User-agent: Web\ Image\ Collector
User-agent: Web\ Sucker
User-agent: WebAuto
User-agent: WebCopier
User-agent: WebFetch
User-agent: WebGo\ IS
User-agent: WebLeacher
User-agent: WebReaper
User-agent: WebSauger
User-agent: Website\ eXtractor
User-agent: Website\ Quester
User-agent: WebStripper
User-agent: WebWhacker
User-agent: WebZIP
User-agent: Wget
User-agent: Widow
User-agent: WWWOFFLE
User-agent: Xaldon\ WebSpider
User-agent: Zeus
Disallow: /
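The backslash-escaped spaces in names like Download\ Demon look like they were carried over from an .htaccess rewrite list, where a space inside an unquoted pattern has to be escaped; robots.txt has no such escape. In case it helps, here is a minimal sketch of how a few of the same names could be used with mod_rewrite instead. It assumes mod_rewrite is enabled on your host, and the four names are just a sample taken from the list above.

```apache
<IfModule mod_rewrite.c>
    RewriteEngine On

    # Match the User-Agent header case-insensitively ([NC]);
    # [OR] chains the conditions so any single match is enough.
    RewriteCond %{HTTP_USER_AGENT} AhrefsBot [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} MJ12bot [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} Download\ Demon [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} Teleport\ Pro [NC]

    # Leave the URL untouched (-) and answer with 403 Forbidden.
    RewriteRule ^ - [F]
</IfModule>
```

The last RewriteCond must not carry [OR], otherwise the chain never closes properly.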
SetEnvIfNoCase User-Agent .*rogerbot.* bad_bot
SetEnvIfNoCase User-Agent .*exabot.* bad_bot
SetEnvIfNoCase User-Agent .*mj12bot.* bad_bot
SetEnvIfNoCase User-Agent .*dotbot.* bad_bot
SetEnvIfNoCase User-Agent .*gigabot.* bad_bot
SetEnvIfNoCase User-Agent .*ahrefsbot.* bad_bot
SetEnvIfNoCase User-Agent .*sitebot.* bad_bot
<Limit GET POST HEAD>
Order Allow,Deny
Allow from all
Deny from env=bad_bot
</Limit>
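For anyone on Apache 2.4: the Order/Allow/Deny lines above are the older 2.2-style access directives, which 2.4 only honours when mod_access_compat is loaded. A rough equivalent using the 2.4 Require syntax might look like the sketch below. It reuses the bad_bot variable from the post, the shortened bot list is only there to keep the example small, and it assumes mod_setenvif and mod_authz_core are available and that your host allows these directives in .htaccess.

```apache
# Flag requests whose User-Agent contains one of these names (case-insensitive).
# The pattern is an unanchored regex, so no leading/trailing .* is needed.
SetEnvIfNoCase User-Agent "rogerbot"  bad_bot
SetEnvIfNoCase User-Agent "mj12bot"   bad_bot
SetEnvIfNoCase User-Agent "ahrefsbot" bad_bot

# Apache 2.4 access control: allow everyone except flagged requests.
<RequireAll>
    Require all granted
    Require not env bad_bot
</RequireAll>
```

Unlike the <Limit GET POST HEAD> wrapper, this version applies to every request method rather than just those three.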
No, the goal is to keep backlinks you don't want showing in your money site's link profile out of Ahrefs, Majestic and the other tools. That would keep competitors from noticing any bad links and reporting you to G.
But that would require you to have access to the sites linking to you.
Ahrefs/Majestic don't need to crawl your website in order to see the backlinks pointing to it as I understand it.
Ahrefs/Majestic/OSE are all backlink checkers. They crawl the site in question and map the inbound links pointing to that site. They don't (currently) add to that data by adding in additional links that they know are outbound links on other sites they have crawled. Would be an interesting feature but AFAIK they don't do it yet.
As a result, the crawlers' results are limited to the site crawl for that domain. If the bots are blocked from the domain from the outset, they can't report links.
majesticstaff said: You are correct, you are at liberty to block our crawlers from visiting your website; we do respect the robots.txt standards. But this will only mean that we cannot collect information from your website, such as links to external sites, page titles, etc. Any information published about your website on other websites which we are permitted to crawl will be read and indexed.
Kind Regards
Chris
gorang, are you going to spend your day explaining the obvious?
Not everyone is a fan of web 2.0 blogs and social bookmarking. This is for those who want to block Ahrefs/Majestic and other bots from their private network sites.
If you have a few web 2.0 blogs linking to your site, those will show up in Ahrefs. If you have 10-15 high PR backlinks from a private network, you'll have the competition wondering how you're ranking.
Everyone figured it out...
It's not too hard to find networks. Yes, blocking bots on your network sites will deter a lazy SEO, but it won't deter someone who knows what they are doing, and it isn't exactly hard work either.
It didn't seem like everyone had figured it out, but yes, I do already understand why people might want to block Majestic and other crawlers from their networks. I already do this myself.
User-agent: libwww-perl
User-agent: libwwwperl
User-agent: attach
User-agent: ASPSeek
User-agent: appie
User-agent: AbachoBOT
User-agent: autoemailspider
User-agent: anarchie
User-agent: antibot
User-agent: asterias
User-agent: B2w
User-agent: BackWeb
User-agent: BackDoorBot
User-agent: Bandit
User-agent: BatchFTP
User-agent: Black\ Hole
User-agent: Baidu
User-agent: BlowFish
User-agent: BuiltBotTough
User-agent: Bot\ mailto
User-agent: BotALot
User-agent: Buddy
User-agent: Bullseye
User-agent: bumblebee
User-agent: BunnySlippers
User-agent: ClariaBot
User-agent: curl
User-agent: clsHTTP
User-agent: ChinaClaw
User-agent: CheeseBot
User-agent: CherryPicker
User-agent: Crescent
User-agent: CherryPickerSE
User-agent: CherryPickerElite
User-agent: Collector
User-agent: COAST\ WebMaster
User-agent: cosmos
User-agent: CopyRightCheck
User-agent: ColdFusion
User-agent: Copier
User-agent: Crescent
User-agent: DA
User-agent: DTS\ Agent
User-agent: DISCo\ Pump
User-agent: DittoSpyder
User-agent: Diamond
User-agent: Download\ Demon
User-agent: Download\ Wonder
User-agent: Downloader
User-agent: dloader
User-agent: Drip
User-agent: eCatch
User-agent: EirGrabber
User-agent: Express\ WebPictures
User-agent: Extreme\ Picture\ Finder
User-agent: EmailCollector
User-agent: EmailSiphon
User-agent: EmailWolf
User-agent: EasyDL
User-agent: EirGrabber
User-agent: EroCrawler
User-agent: ExtractorPro
User-agent: EyeNetIE
User-agent: FAST\ WebCrawler
User-agent: FileHound
User-agent: Fetch\ API\ Request
User-agent: FlashGet
User-agent: FlickBot
User-agent: FrontPage
User-agent: FreeFind.com
User-agent: GetRight
User-agent: GetSmart
User-agent: Generic
User-agent: Go!Zilla
User-agent: Go-Ahead-Got-It
User-agent: gotit
User-agent: Grabber
User-agent: GrabNet
User-agent: Grafula
User-agent: Gulliver
User-agent: Harvest
User-agent: HMView
User-agent: Heretrix
User-agent: HitboxDoctor
User-agent: HTTPapp
User-agent: HTTrack
User-agent: HTTPTrack
User-agent: HTTPviewer
User-agent: httplib
User-agent: httpfetcher
User-agent: httpscraper
User-agent: hloader
User-agent: humanlinks
User-agent: ia_archiver
User-agent: InterGET
User-agent: Internet\ Ninja
User-agent: InfoNaviRobot
User-agent: InternetSeer.com
User-agent: Iria
User-agent: IRLbot
User-agent: JetCar
User-agent: JOC
User-agent: JOC\ Web\ Spider
User-agent: JoBo
User-agent: Java
User-agent: JustView
User-agent: Jonzilla
User-agent: JennyBot
User-agent: Kenjin\ Spider
User-agent: Keyword\ Density
User-agent: larbin
User-agent: LeechFTP
User-agent: Lachesis
User-agent: LexiBot
User-agent: libWeb
User-agent: Libby_
User-agent: LinkScan
User-agent: LinkWalker
User-agent: LinkextractorPro
User-agent: lftp
User-agent: likse
User-agent: Link
User-agent: lwp-trivial
User-agent: lwp\ request
User-agent: Magnet
User-agent: Mag-Net
User-agent: Mass\ Downloader
User-agent: MIIxpc
User-agent: Microsoft\ URL\ Control
User-agent: MSFrontPage
User-agent: MSIECrawler
User-agent: MicrosoftURL
User-agent: Missigua
User-agent: Mewsoft\ Search\ Engine
User-agent: moget
User-agent: Mata\ Hari
User-agent: Memo
User-agent: Metacarta
User-agent: Mercator
User-agent: MIDown\ tool
User-agent: MFC_Tear_Sample
User-agent: Mirror
User-agent: MIIxpc
User-agent: Mister\ PiX
User-agent: NationalDirectory\ WebSpider
User-agent: NICErsPRO
User-agent: Nikto
User-agent: Navroad
User-agent: NearSite
User-agent: NetAnts
User-agent: NetSpider
User-agent: NICErsPRO
User-agent: NetResearchServer
User-agent: NetMechanic
User-agent: Net\ Vampire
User-agent: Net\ Probe
User-agent: NetZip
User-agent: nexuscache
User-agent: Ninja
User-agent: NPBot
User-agent: our\ agent
User-agent: onestop
User-agent: oBot
User-agent: Octopus
User-agent: Offline\ Explorer
User-agent: Openfind
User-agent: Openfind\ data\ gatherer
User-agent: OrangeBot
User-agent: PageGrabber
User-agent: Papa\ Foto
User-agent: PHP\ version
User-agent: PHP
User-agent: PHPot
User-agent: Perl
User-agent: pcBrowser
User-agent: pavuk
User-agent: Pockey
User-agent: Ping
User-agent: PingALink\ Monitoring\ Services
User-agent: ProWebWalker
User-agent: ProPowerBot
User-agent: Pump
User-agent: Pompos
User-agent: psbot
User-agent: Python\ urllib
User-agent: Python-urllib
User-agent: QueryN
User-agent: RealDownload
User-agent: Reaper
User-agent: Recorder
User-agent: RepoMonkey
User-agent: psycheclone
User-agent: RMA
User-agent: Rico
User-agent: Robozilla
User-agent: ReGet
User-agent: Siphon
User-agent: SiteSnagger
User-agent: sitecheck.internetseer.com
User-agent: SmartDownload
User-agent: Snake
User-agent: spanner
User-agent: Stealer
User-agent: SpaceBison
User-agent: SpankBot
User-agent: Spinne
User-agent: Stripper
User-agent: slysearch
User-agent: Sucker
User-agent: Snoopy
User-agent: ScoutAbout
User-agent: Scooter
User-agent: SuperBot
User-agent: SuperHTTP
User-agent: Snapbot
User-agent: Surfbot
User-agent: suzuran
User-agent: Szukacz
User-agent: Sqworm
User-agent: tAkeOut
User-agent: Teleport\ Pro
User-agent: Telesoft
User-agent: TurnitinBot
User-agent: turingos
User-agent: toCrawl
User-agent: TightTwatBot
User-agent: True_Robot
User-agent: The\ Intraformant
User-agent: TheNomad
User-agent: Titan
User-agent: UrlDispatcher
User-agent: URLy\ Warning
User-agent: Vayala
User-agent: Vagabondo
User-agent: Vintage
User-agent: Vacuum
User-agent: VCI
User-agent: VoidEYE
User-agent: W3C_Validator
User-agent: Webdownloader
User-agent: Web\ Downloader
User-agent: Webhook
User-agent: Webmole
User-agent: Webminer
User-agent: Webmirror
User-agent: Websucker
User-agent: Websites
User-agent: Web\ Image\ Collector
User-agent: Web\ Sucker
User-agent: WebAuto
User-agent: WebCopier
User-agent: WebFetch
User-agent: WebReaper
User-agent: WebSauger
User-agent: Website
User-agent: Webster
User-agent: WebStripper
User-agent: WebCopier
User-agent: WebViewer
User-agent: WebWhacker
User-agent: WebEnhancer
User-agent: Wells
User-agent: WebZIP
User-agent: Wget
User-agent: Whacker
User-agent: Widow
User-agent: Xaldon
User-agent: Wildsoft\ Surfer
User-agent: WinHttpRequest
User-agent: WinHttp
User-agent: Webster\ Pro
User-agent: Web\ Image\ Collector
User-agent: WebZip
User-agent: WebAuto
User-agent: Website\ Quester
User-agent: WWWOFFLE
User-agent: WWW-Collector-E
User-agent: Xaldon\ WebSpider
User-agent: Xenu
User-agent: Xara
User-agent: Y!TunnelPro
User-agent: YahooYSMcm
User-agent: Zade
User-agent: ZBot
User-agent: Zeus
User-agent: Rogerbot
User-agent: Exabot
User-agent: MJ12bot
User-agent: Dotbot
User-agent: Gigabot
User-agent: AhrefsBot
User-agent: BlackWidow
User-agent: Bot\ mailto:[email protected]
User-agent: ChinaClaw
User-agent: Custo
User-agent: DISCo
User-agent: Download\ Demon
User-agent: eCatch
User-agent: EirGrabber
User-agent: EmailSiphon
User-agent: EmailWolf
User-agent: Express\ WebPictures
User-agent: ExtractorPro
User-agent: EyeNetIE
User-agent: FlashGet
User-agent: GetRight
User-agent: GetWeb!
User-agent: Go!Zilla
User-agent: Go-Ahead-Got-It
User-agent: GrabNet
User-agent: Grafula
User-agent: HMView
User-agent: HTTrack
User-agent: Image\ Stripper
User-agent: Image\ Sucker
User-agent: Indy\ Library
User-agent: InterGET
User-agent: Internet\ Ninja
User-agent: JetCar
User-agent: JOC\ Web\ Spider
User-agent: larbin
User-agent: LeechFTP
User-agent: Mass\ Downloader
User-agent: MIDown\ tool
User-agent: Mister\ PiX
User-agent: Navroad
User-agent: NearSite
User-agent: NetAnts
User-agent: NetSpider
User-agent: Net\ Vampire
User-agent: NetZIP
User-agent: Octopus
User-agent: Offline\ Explorer
User-agent: Offline\ Navigator
User-agent: PageGrabber
User-agent: Papa\ Foto
User-agent: pavuk
User-agent: pcBrowser
User-agent: RealDownload
User-agent: ReGet
User-agent: SiteSnagger
User-agent: SmartDownload
User-agent: SuperBot
User-agent: SuperHTTP
User-agent: Surfbot
User-agent: tAkeOut
User-agent: Teleport\ Pro
User-agent: VoidEYE
User-agent: Web\ Image\ Collector
User-agent: Web\ Sucker
User-agent: WebAuto
User-agent: WebCopier
User-agent: WebFetch
User-agent: WebGo\ IS
User-agent: WebLeacher
User-agent: WebReaper
User-agent: WebSauger
User-agent: Website\ eXtractor
User-agent: Website\ Quester
User-agent: WebStripper
User-agent: WebWhacker
User-agent: WebZIP
User-agent: Wget
User-agent: Widow
User-agent: WWWOFFLE
User-agent: Xaldon\ WebSpider