
Search Engine Cloaker IP List // Block bad bots.

Discussion in 'Cloaking and Content Generators' started by carlosn, Jan 26, 2016.

  1. carlosn

    carlosn Newbie

    Joined:
    Jan 28, 2011
    Messages:
    44
    Likes Received:
    7
    Add the code below to your .htaccess file to forbid access to unwanted spiders, which are matched by user agent (UA). This code will also block visitors with an empty UA and/or referrer. The first two UAs are from the Wayback Machine. If you want your site to show on the Wayback Machine, delete .*archive\.org_bot.*|.*Wayback\ Machine\ Live\ Record.*| (including the trailing pipe) so that the pattern starts with .*[Dd]isco.*. Make sure no pipe "|" is left before .*[Dd]isco.*.

    SEC IP list is below this code.

    If you want to help me catch spider IPs, please send me a PM and I will provide an image link that you can add to any site you own, and I will update the list here. Or I can send the updated list via email as a newsletter to those who want to subscribe.



    Code:
    ################################################BLOCK BAD BOTS BELOW
    RewriteCond %{HTTP_USER_AGENT} .*archive\.org_bot.*|.*Wayback\ Machine\ Live\ Record.*|.*[Dd]isco.*|.*[Jj]ava.*|.*[Nn]inja.*|.*[Nn]utch.*|.*[Ww]eb[Bb]andit.*|.*[Xx]enu.*|.*[Zz]eus.*|.*[Zz]yborg.*|.*360Spider.*|.*aboutthedomain.*|.*AhrefsBot.*|.*aiHitBot.*|.*almaden.*|.*Anarchie.*|.*ASPSeek.*|.*attach.*|.*autoemailspider.*|.*BackWeb.*|.*Bandit.*|.*BatchFTP.*|.*becomebot.*|.*BlackWidow.*|.*Blekkobot.*|.*Bot\ mailto\:craftbot\@yahoo\.com.*|.*BPImageWalker.*|.*Buddy.*|.*bumblebee.*|.*CCBot.*|.*CherryPicker.*|.*ChinaClaw.*|.*CICC.*|.*ColdFusion.*|.*Collector.*|.*Copier.*|.*CRAZYWEBCRAWLER.*|.*Crescent.*|.*curl.*|.*Custo.*|.*DA.*|.*dfbot.*|.*DigExt.*|.*DIIbot.*|.*DotBot.*|.*Download\ (Demon|Wonder).*|.*Downloader.*|.*Drip.*|.*DSurf15a.*|.*EasyDL\/2\.99.*|.*eCatch.*|.*EirGrabber.*|.*email.*|.*EmailCollector.*|.*EmailSiphon.*|.*EmailWolf.*|.*Exabot.*|.*Express\ WebPictures.*|.*ExtractorPro.*|.*EyeNetIE.*|.*facebookexternalhit.*|.*fastbot.*|.*FatBot.*|.*FileHound.*|.*FlashGet.*|.*FrontPage.*|.*fujilabolx1.*|.*GetRight.*|.*GetSmart.*|.*GetWeb\!.*|.*gigabaz.*|.*Go\!Zilla.*|.*Go\-Ahead\-Got\-It.*|.*gotit.*|.*GrabNet.*|.*Grafula.*|.*grub.*|.*HaosouSpider.*|.*HMView.*|.*HttpClient.*|.*httpdown.*|.*httrack.*|.*HTTrack.*|.*HubSpot.*|.*ia_archiver.*|.*ICC\-Crawler.*|.*Image\ Stripper.*|.*Image\ Sucker.*|.*Indy\ Library.*|.*InterGET.*|.*Internet\ Ninja.*|.*InternetLinkagent.*|.*InternetSeer.com.*|.*interseek.*|.*Iria.*|.*Jakarta.*|.*JBH*agent.*|.*JetCar.*|.*JOC\ Web\ Spider.*|.*JustView.*|.*kakaotalk\-scrap.*|.*Konqueror.*|.*Kumo.*|.*larbin.*|.*LeechFTP.*|.*LexiBot.*|.*lftp.*|.*libcurl.*|.*libwww\-perl.*|.*likse.*|.*Link*Sleuth.*|.*Link.*|.*LinkWalker.*|.*Lipperhey\-Kaus\-Australis\/5\.0.*|.*lwp.*|.*LWP\:\:Simple.*|.*lwp\-trivial.*|.*Mag\-Net.*|.*Magnet.*|.*Mail\.RU.*|.*Mass\ Downloader.*|.*MaxPointCrawler.*|.*Mechanize.*|.*MegaIndex.ru\/2\.0.*|.*Memo.*|.*Microsoft.URL.*|.*MIDown\ tool.*|.*Mirror.*|.*Mister\ PiX.*|.*MJ12bot.*|.*MSIECrawler.*|.*MozillaIndy.*|.*MozillaNEWT.*|.*MS\ FrontPage*.*|.*MSFrontPage.*|.*MSIECrawler.*|.*MSProxy.*|.*Navroad.*|.*NearSite.*|.*Net\ Vampire.*|.*NetAnts.*|.*NetcraftSurveyAgent.*|.*netEstate.*|.*NetMechanic.*|.*NetSpider.*|.*NetZIP.*|.*NICErsPRO.*|.*nutch.*|.*Octopus.*|.*Offline\ (Explorer|Navigator).*|.*OMozilla.*|.*Openfind.*|.*PageGrabber.*|.*Papa\ Foto.*|.*pavuk.*|.*pcBrowser.*|.*PHP.*|.*Ping.*|.*PingALink.*|.*Pockey.*|.*pogodak.*|.*Powermarks.*|.*psbot.*|.*Pump.*|.*Python.*|.*QRVA.*|.*RCrawler/2\.0.*|.*RealDownload.*|.*Reaper.*|.*Recorder.*|.*ReGet.*|.*Ruby.*|.*scooter.*|.*ScoutJet.*|.*Screaming\ Frog\ SEO\ Spider.*|.*Seeker.*|.*SemrushBot.*|.*SEOkicks\-Robot.*|.*sidewinder.*|.*Siphon.*|.*sitecheck.internetseer.com.*|.*SiteSnagger.*|.*SlySearch.*|.*SmartDownload.*|.*Snake.*|.*Sogou\ web\ spider.*|.*SpaceBison.*|.*spbot.*|.*spbot/4\.4\.2.*|.*spider.*|.*sproose.*|.*Stratagems.*|.*Stripper.*|.*Sucker.*|.*SuperBot.*|.*SuperHTTP.*|.*Surfbot.*|.*SurveyBot.*|.*Szukacz.*|.*taiil.*|.*tAkeOut.*|.*Teleport\ Pro.*|.*tridentspider.*|.*Ubuntu.*|.*URLSpiderPro.*|.*Vacuum.*|.*VoidEYE.*|.*W3C_Validator.*|.*WBSearchBot.*|.*WBSearchBot\/1\.1.*|.*Web\ Downloader.*|.*Web\ Image\ Collector.*|.*Web\ Sucker.*|.*WebAuto.*|.*WebCapture.*|.*webcollage.*|.*WebCopier.*|.*WebEMailExtrac.*|.*WebFetch.*|.*WebGo\ IS.*|.*WebHook.*|.*WebLeacher.*|.*WebMirror.*|.*WebReaper.*|.*WebSauger.*|.*Website.*|.*Website\ (eXtractor|Quester).*|.*Webster.*|.*WebStripper.*|.*WebWhacker.*|.*WebZIP.*|.*Wget.*|.*Whacker.*|.*Widow.*|.*WinHttp\.WinHttpRequest\.5.*|.*woobot.*|.*Wotbox.*|.*WWWOFFLE.*|.*x\-Tractor.*|.*Xaldon\ WebSpider.*|.*XoviBot\/2\.0.*|.*YodaoBot.*|.*ZeusWebster.*
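    RewriteRule ^.* - [F,L]
    # Consecutive RewriteCond lines are ANDed by default, so the empty referrer / empty UA test gets its own rule, joined with [OR].
    RewriteCond %{HTTP_REFERER} ^$ [OR]
    RewriteCond %{HTTP_USER_AGENT} ^$
    RewriteRule ^.* - [F,L]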
    ################################################BLOCK BAD BOTS ABOVE
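
    Before deploying a pattern list like this, it can help to dry-run it against user agents you do not want to lose. Below is a minimal Python sketch, assuming a hand-picked subset of the alternatives from the rule above (the full list behaves the same way). The helper name matches_blocklist and the sample user agents are illustrative only and are not part of the original rule.

```python
import re

# A few alternatives copied from the RewriteCond above. mod_rewrite patterns
# are unanchored and case-sensitive unless [NC] is set, so re.search() with
# default flags mirrors how they match.
SUBSET = r".*AhrefsBot.*|.*MJ12bot.*|.*Wget.*|.*curl.*|.*Ubuntu.*|.*Konqueror.*"
BLOCK_RE = re.compile(SUBSET)


def matches_blocklist(user_agent):
    """Return True if the user agent matches any alternative in SUBSET."""
    return BLOCK_RE.search(user_agent) is not None


# Illustrative user agents (assumptions, not taken from the thread).
SAMPLES = [
    "Mozilla/5.0 (compatible; AhrefsBot/5.0; +http://ahrefs.com/robot/)",
    "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:44.0) Gecko/20100101 Firefox/44.0",
    "Mozilla/5.0 (Windows NT 10.0; WOW64; rv:44.0) Gecko/20100101 Firefox/44.0",
]

for ua in SAMPLES:
    print(matches_blocklist(ua), ua)
```

    In this subset the Ubuntu alternative also matches an ordinary Firefox-on-Ubuntu visitor, which is the kind of side effect worth checking for before the rule goes live.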
    
    Below is a list of Search Engine Cloaker (SEC) IPs in regex format. I decided to use regex because it takes fewer lines. I had around 1,500 IPs listed before; now I only have around 250, and SEC runs faster.

    Code:
    209.73.(19[01]|1[6-8][0-9]).[0-9]{1,3} #Altavista
    216.39.(6[0-3]|5[0-9]|4[89]).[0-9]{1,3} #Altavista
    111.13.102.[0-9]{1,3} #Baidu
    119.63.19[2-9].[0-9]{1,3} #Baidu
    119.63.196.[0-9]{1,3} #Baidu
    119.63.199.[0-9]{1,3} #Baidu
    122.81.(21[01]|20[89]).[0-9]{1,3} #Baidu
    123.125.67.(15[01]|14[4-9]) #Baidu
    123.125.67.15[23] #Baidu
    123.125.68.[0-9]{1,3} #Baidu
    123.125.68.(7[01]|6[89]) #Baidu
    123.125.68.7[2-9] #Baidu
    123.125.68.(9[0-5]|8[0-9]) #Baidu
    123.125.68.8[0-3] #Baidu
    123.125.68.8[45] #Baidu
    123.125.68.9[6-9] #Baidu
    123.125.71.[0-9]{1,3} #Baidu
    125.39.7[89].[0-9]{1,3} #Baidu
    180.76.[0-9]{1,3}.[0-9]{1,3} #Baidu
    180.76.15.[0-9]{1,3} #Baidu
    180.76.4.[0-9]{1,3} #Baidu
    180.76.5.[0-9]{1,3} #Baidu
    180.76.6.[0-9]{1,3} #Baidu
    185.10.104.[0-9]{1,3} #Baidu
    202.46.(6[0-3]|[45][0-9]|3[2-9]).[0-9]{1,3} #Baidu
    202.46.(6[0-3]|5[0-9]|4[89]).[0-9]{1,3} #Baidu
    220.181.108.[0-9]{1,3} #Baidu
    220.181.38.[0-9]{1,3} #Baidu
    220.181.51.[0-9]{1,3} #Baidu
    61.135.169.[0-9]{1,3} #Baidu
    203.208.(6[0-3]|[45][0-9]|3[2-9]).[0-9]{1,3} #Google
    203.208.60.[0-9]{1,3} #Google
    209.85.(25[0-5]|2[0-4][0-9]|1[3-9][0-9]|12[89]).[0-9]{1,3} #Google
    209.85.238.[0-9]{1,3} #Google
    216.239.(6[0-3]|[45][0-9]|3[2-9]).[0-9]{1,3} #Google
    216.3[2-5].[0-9]{1,3}.[0-9]{1,3} #Google
    64.233.(19[01]|1[6-8][0-9]).[0-9]{1,3} #Google
    64.68.8[0-7].[0-9]{1,3} #Google
    66.249.(9[0-5]|[78][0-9]|6[4-9]).[0-9]{1,3} #Google
    66.249.(7[0-9]|6[4-9]).[0-9]{1,3} #Google
    66.249.90.[0-9]{1,3} #Google
    66.249.91.[0-9]{1,3} #Google
    66.249.92.[0-9]{1,3} #Google
    72.14.(25[0-5]|2[0-4][0-9]|19[2-9]).[0-9]{1,3} #Google
    72.14.199.[0-9]{1,3} #Google
    8.[0-9]{1,3}.[0-9]{1,3}.[0-9]{1,3} #Google
    8.6.(5[0-5]|4[89]).[0-9]{1,3} #Google
    198.5.(21[01]|20[89]).[0-9]{1,3} #Infoseek
    205.226.20[0-7].[0-9]{1,3} #Infoseek
    66.196.(12[0-7]|1[01][0-9]|[7-9][0-9]|6[4-9]).[0-9]{1,3} #Inktomi
    68.142.(25[0-5]|2[0-4][0-9]|19[2-9]).[0-9]{1,3} #Inktomi
    74.6.[0-9]{1,3}.[0-9]{1,3} #Inktomi
    209.202.(25[0-5]|2[0-4][0-9]|19[2-9]).[0-9]{1,3} #Lycos
    131.107.[0-9]{1,3}.[0-9]{1,3} #Microsoft
    131.253.(25[0-5]|2[0-4][0-9]|1[3-9][0-9]|12[89]).[0-9]{1,3} #Microsoft
    131.253.2[4-7].[0-9]{1,3} #Microsoft
    131.253.4[67].[0-9]{1,3} #Microsoft
    131.253.61.[0-9]{1,3} #Microsoft
    131.253.6[23].[0-9]{1,3} #Microsoft
    131.253.(12[0-7]|1[01][0-9]|[7-9][0-9]|6[4-9]).[0-9]{1,3} #Microsoft
    157.5[45].[0-9]{1,3}.[0-9]{1,3} #Microsoft
    157.55.109.[0-9]{1,3} #Microsoft
    157.55.110.4[0-7] #Microsoft
    157.55.110.(6[0-3]|5[0-9]|4[89]) #Microsoft
    157.55.1[67].[0-9]{1,3} #Microsoft
    157.55.18.[0-9]{1,3} #Microsoft
    157.55.3[2-5].[0-9]{1,3} #Microsoft
    157.55.36.[0-9]{1,3} #Microsoft
    157.55.39.[0-9]{1,3} #Microsoft
    157.55.48.[0-9]{1,3} #Microsoft
    157.5[6-9].[0-9]{1,3}.[0-9]{1,3} #Microsoft
    157.56.229.[0-9]{1,3} #Microsoft
    157.56.92.[0-9]{1,3} #Microsoft
    157.56.93.[0-9]{1,3} #Microsoft
    157.56.9[45].[0-9]{1,3} #Microsoft
    157.60.[0-9]{1,3}.[0-9]{1,3} #Microsoft
    199.30.(3[01]|2[0-9]|1[6-9]).[0-9]{1,3} #Microsoft
    199.30.16.[0-9]{1,3} #Microsoft
    199.30.27.[0-9]{1,3} #Microsoft
    202.96.51.(25[0-5]|2[0-4][0-9]|1[3-9][0-9]|12[89]) #Microsoft
    204.95.(11[01]|10[0-9]|9[6-9]).[0-9]{1,3} #Microsoft
    207.46.[0-9]{1,3}.[0-9]{1,3} #Microsoft
    207.46.0.[0-9]{1,3} #Microsoft
    207.46.1[23].[0-9]{1,3} #Microsoft
    207.46.192.[0-9]{1,3} #Microsoft
    207.46.195.[0-9]{1,3} #Microsoft
    207.46.199.[0-9]{1,3} #Microsoft
    207.46.204.[0-9]{1,3} #Microsoft
    207.68.(19[01]|1[3-8][0-9]|12[89]).[0-9]{1,3} #Microsoft
    208.68.(14[0-3]|13[6-9]).[0-9]{1,3} #Microsoft
    213.199.(14[0-3]|13[0-9]|12[89]).[0-9]{1,3} #Microsoft
    219.142.53.(12[0-7]|1[01][0-9]|[1-9]?[0-9]) #Microsoft
    40.11[2-9].[0-9]{1,3}.[0-9]{1,3} #Microsoft
    40.12[0-3].[0-9]{1,3}.[0-9]{1,3} #Microsoft
    40.124.[0-9]{1,3}.[0-9]{1,3} #Microsoft
    40.125.(12[0-7]|1[01][0-9]|[1-9]?[0-9]).[0-9]{1,3} #Microsoft
    40.7[45].[0-9]{1,3}.[0-9]{1,3} #Microsoft
    40.7[6-9].[0-9]{1,3}.[0-9]{1,3} #Microsoft
    40.77.167.[0-9]{1,3} #Microsoft
    40.(9[0-5]|8[0-9]).[0-9]{1,3}.[0-9]{1,3} #Microsoft
    40.(11[01]|10[0-9]|9[6-9]).[0-9]{1,3}.[0-9]{1,3} #Microsoft
    64.4.(6[0-3]|[1-5]?[0-9]).[0-9]{1,3} #Microsoft
    65.5[2-5].[0-9]{1,3}.[0-9]{1,3} #Microsoft
    65.52.104.[0-9]{1,3} #Microsoft
    65.52.(11[01]|10[89]).[0-9]{1,3} #Microsoft
    65.55.213.[0-9]{1,3} #Microsoft
    65.55.217.[0-9]{1,3} #Microsoft
    65.55.24.[0-9]{1,3} #Microsoft
    65.55.52.[0-9]{1,3} #Microsoft
    65.55.55.[0-9]{1,3} #Microsoft
    114.111.95.[0-9]{1,3} #Yahoo
    124.83.159.[0-9]{1,3} #Yahoo
    124.83.179.[0-9]{1,3} #Yahoo
    124.83.223.[0-9]{1,3} #Yahoo
    183.79.[0-9]{1,3}.[0-9]{1,3} #Yahoo
    183.79.63.[0-9]{1,3} #Yahoo
    183.79.92.[0-9]{1,3} #Yahoo
    202.160.(19[01]|18[0-9]|17[6-9]).[0-9]{1,3} #Yahoo
    202.165.(11[01]|10[0-9]|9[6-9]).[0-9]{1,3} #Yahoo
    202.46.19.[0-9]{1,3} #Yahoo
    203.141.(4[0-7]|3[2-9]).[0-9]{1,3} #Yahoo
    203.216.255.[0-9]{1,3} #Yahoo
    206.190.(6[0-3]|[45][0-9]|3[2-9]).[0-9]{1,3} #Yahoo
    206.3.(3[01]|[12]?[0-9]).[0-9]{1,3} #Yahoo
    207.126.(23[0-9]|22[4-9]).[0-9]{1,3} #Yahoo
    209.131.(6[0-3]|[45][0-9]|3[2-9]).[0-9]{1,3} #Yahoo
    209.191.(12[0-7]|1[01][0-9]|[7-9][0-9]|6[4-9]).[0-9]{1,3} #Yahoo
    210.236.233.[0-9]{1,3} #Yahoo
    211.13.230.[0-9]{1,3} #Yahoo
    211.14.11.[0-9]{1,3} #Yahoo
    211.14.8.[0-9]{1,3} #Yahoo
    216.109.(12[0-7]|11[2-9]).[0-9]{1,3} #Yahoo
    216.136.23[2-5].[0-9]{1,3} #Yahoo
    216.145.(6[0-3]|5[0-9]|4[89]).[0-9]{1,3} #Yahoo
    216.155.(20[0-7]|19[2-9]).[0-9]{1,3} #Yahoo
    63.163.102.[0-9]{1,3} #Yahoo
    64.157.13[6-9].[0-9]{1,3} #Yahoo
    66.163.(19[01]|1[6-8][0-9]).[0-9]{1,3} #Yahoo
    66.196.(12[0-7]|1[01][0-9]|[7-9][0-9]|6[4-9]).[0-9]{1,3} #Yahoo
    66.218.(9[0-5]|[78][0-9]|6[4-9]).[0-9]{1,3} #Yahoo
    66.228.(19[01]|1[6-8][0-9]).[0-9]{1,3} #Yahoo
    66.94.(25[0-5]|2[34][0-9]|22[4-9]).[0-9]{1,3} #Yahoo
    67.195.[0-9]{1,3}.[0-9]{1,3} #Yahoo
    67.195.110.[0-9]{1,3} #Yahoo
    67.195.111.[0-9]{1,3} #Yahoo
    67.195.11[23].[0-9]{1,3} #Yahoo
    67.195.114.[0-9]{1,3} #Yahoo
    67.195.115.[0-9]{1,3} #Yahoo
    67.195.37.[0-9]{1,3} #Yahoo
    67.195.50.[0-9]{1,3} #Yahoo
    68.142.(25[0-5]|2[0-4][0-9]|19[2-9]).[0-9]{1,3} #Yahoo
    68.180.(25[0-5]|2[0-4][0-9]|1[3-9][0-9]|12[89]).[0-9]{1,3} #Yahoo
    68.180.(23[01]|22[4-9]).[0-9]{1,3} #Yahoo
    69.147.(12[0-7]|1[01][0-9]|[7-9][0-9]|6[4-9]).[0-9]{1,3} #Yahoo
    72.30.[0-9]{1,3}.[0-9]{1,3} #Yahoo
    72.30.132.[0-9]{1,3} #Yahoo
    72.30.142.[0-9]{1,3} #Yahoo
    72.30.161.[0-9]{1,3} #Yahoo
    72.30.196.[0-9]{1,3} #Yahoo
    72.30.198.[0-9]{1,3} #Yahoo
    74.6.[0-9]{1,3}.[0-9]{1,3} #Yahoo
    74.6.13.[0-9]{1,3} #Yahoo
    74.6.17.[0-9]{1,3} #Yahoo
    74.6.18.[0-9]{1,3} #Yahoo
    74.6.22.[0-9]{1,3} #Yahoo
    74.6.254.[0-9]{1,3} #Yahoo
    74.6.27.[0-9]{1,3} #Yahoo
    74.6.8.[0-9]{1,3} #Yahoo
    98.13[6-9].[0-9]{1,3}.[0-9]{1,3} #Yahoo
    98.137.206.[0-9]{1,3} #Yahoo
    98.137.207.[0-9]{1,3} #Yahoo
    98.137.72.[0-9]{1,3} #Yahoo
    98.139.168.[0-9]{1,3} #Yahoo
    100.43.(9[0-5]|[78][0-9]|6[4-9]).[0-9]{1,3} #Yandex
    100.43.80.[0-9]{1,3} #Yandex
    100.43.81.[0-9]{1,3} #Yandex
    100.43.85.[0-9]{1,3} #Yandex
    100.43.90.[0-9]{1,3} #Yandex
    100.43.91.[0-9]{1,3} #Yandex
    130.193.62.[0-9]{1,3} #Yandex
    141.8.143.(25[0-5]|2[0-4][0-9]|1[3-9][0-9]|12[89]) #Yandex
    141.8.153.[0-9]{1,3} #Yandex
    141.8.153.(12[0-7]|11[2-9]) #Yandex
    178.154.(25[0-5]|2[0-4][0-9]|1[3-9][0-9]|12[89]).[0-9]{1,3} #Yandex
    178.154.165.[0-9]{1,3} #Yandex
    178.154.166.(25[0-5]|2[0-4][0-9]|1[3-9][0-9]|12[89]) #Yandex
    178.154.200.[0-9]{1,3} #Yandex
    178.154.202.[0-9]{1,3} #Yandex
    178.154.205.[0-9]{1,3} #Yandex
    178.154.21[01].[0-9]{1,3} #Yandex
    178.154.239.[0-9]{1,3} #Yandex
    178.154.243.[0-9]{1,3} #Yandex
    199.21.9[6-9].[0-9]{1,3} #Yandex
    199.21.99.[0-9]{1,3} #Yandex
    213.180.(22[0-3]|2[01][0-9]|19[2-9]).[0-9]{1,3} #Yandex
    213.180.(20[0-7]|19[2-9]).[0-9]{1,3} #Yandex
    213.180.(21[0-5]|20[89]).[0-9]{1,3} #Yandex
    213.180.21[6-9].[0-9]{1,3} #Yandex
    37.140.141.[0-9]{1,3} #Yandex
    37.140.165.[0-9]{1,3} #Yandex
    37.140.188.[0-9]{1,3} #Yandex
    37.9.115.[0-9]{1,3} #Yandex
    37.9.8[4-7].[0-9]{1,3} #Yandex
    5.255.253.[0-9]{1,3} #Yandex
    5.45.254.[0-9]{1,3} #Yandex
    77.88.(6[0-3]|[1-5]?[0-9]).[0-9]{1,3} #Yandex
    77.88.22.(12[0-7]|1[01][0-9]|[1-9]?[0-9]) #Yandex
    77.88.29.[0-9]{1,3} #Yandex
    77.88.31.[0-9]{1,3} #Yandex
    77.88.59.[0-9]{1,3} #Yandex
    84.201.146.[0-9]{1,3} #Yandex
    84.201.148.[0-9]{1,3} #Yandex
    84.201.149.[0-9]{1,3} #Yandex
    87.250.(25[0-5]|2[34][0-9]|22[4-9]).[0-9]{1,3} #Yandex
    87.250.243.[0-9]{1,3} #Yandex
    87.250.253.[0-9]{1,3} #Yandex
    93.158.147.[0-9]{1,3} #Yandex
    93.158.148.[0-9]{1,3} #Yandex
    93.158.151.[0-9]{1,3} #Yandex
    93.158.153.[0-9]{1,3} #Yandex
    95.108.(25[0-5]|2[0-4][0-9]|1[3-9][0-9]|12[89]).[0-9]{1,3} #Yandex
    95.108.128.[0-9]{1,3} #Yandex
    95.108.138.[0-9]{1,3} #Yandex
    95.108.15[01].[0-9]{1,3} #Yandex
    95.108.156.[0-9]{1,3} #Yandex
    95.108.158.[0-9]{1,3} #Yandex
    95.108.188.(25[0-5]|2[0-4][0-9]|1[3-9][0-9]|12[89]) #Yandex
    95.108.234.[0-9]{1,3} #Yandex
    95.108.24[67].[0-9]{1,3} #Yandex
    95.108.248.[0-9]{1,3} #Yandex
    95.158.(19[01]|1[3-8][0-9]|12[89]).[0-9]{1,3} #Yandex
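
    Each line above is a regular expression followed by a #label. How SEC itself evaluates these lines is not stated in this thread, so the Python sketch below is only an illustration that assumes a full match against the dotted IP. The dots in the list are unescaped and would match any character, so the sketch escapes them first. The names load_patterns and classify are hypothetical helpers.

```python
import re

# Three lines copied from the list above, in "pattern #label" form.
SAMPLE = """\
66.249.(9[0-5]|[78][0-9]|6[4-9]).[0-9]{1,3} #Google
157.55.39.[0-9]{1,3} #Microsoft
123.125.67.(15[01]|14[4-9]) #Baidu
"""


def load_patterns(text):
    """Parse 'pattern #label' lines into (compiled_regex, label) pairs."""
    rules = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        pattern, _, label = line.partition("#")
        # Escape the dots so "." only matches a literal dot between octets.
        pattern = pattern.strip().replace(".", r"\.")
        rules.append((re.compile(pattern), label.strip()))
    return rules


def classify(ip, rules):
    """Return the label of the first rule matching the whole IP, else None."""
    for regex, label in rules:
        if regex.fullmatch(ip):
            return label
    return None


RULES = load_patterns(SAMPLE)
print(classify("66.249.66.1", RULES))     # Google
print(classify("123.125.67.150", RULES))  # Baidu
print(classify("192.0.2.10", RULES))      # None
```

    Note that [0-9]{1,3} also accepts values above 255, so the patterns are looser than real address ranges.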
    
     
    • Thanks x 1
  2. cherub

    cherub Regular Member

    Joined:
    Dec 18, 2006
    Messages:
    284
    Likes Received:
    123
    Gender:
    Male
    Occupation:
    Boss
    Location:
    UK
    Nice list, though is Altavista still a thing? Also, not sure about blocking users not giving a referer, as many antivirus/firewall applications strip referrers, and the people using such software tend to be the ones who will buy crap on the internet ;)
     
  3. carlosn

    carlosn Newbie

    Joined:
    Jan 28, 2011
    Messages:
    44
    Likes Received:
    7
    No, Altavista is not a big thing anymore. However, the IP lookup shows that the organization that owns the range is Altavista, and as you may know, Altavista has been owned by Yahoo for a long time.

    I use Avast antivirus, Zone Alarm firewall, and I have no problems blocking users with no referrer. Usually scrapers don't have a referrer, because most of the time scrapers have a list of sites, and visit the sites directly without going thru a SE, and even though I scrape :) I don't want my sites scraped.

    Cheers
     
  4. Cedric Heyward

    Cedric Heyward BANNED BANNED

    Joined:
    Jan 21, 2016
    Messages:
    125
    Likes Received:
    31
    But this is only good for SEO/Organic traffic.

    You can use Simple Search Engine Suite for this, it is recommended for SEO traffic.
     
  5. carlosn

    carlosn Newbie

    Joined:
    Jan 28, 2011
    Messages:
    44
    Likes Received:
    7
    You are right, Cedric, it is only good for organic traffic. For me that is the most important traffic, because people use search engines more than social media for purchases. Social media is good for liking product pages, but in the end consumers go to search engines to find the best deal on the product that was recommended on Facebook. This article clearly states that search engines drive more traffic than social media, even though Facebook has over one billion users who browse friends' pictures and laugh at memes and videos:

    http://searchengineland.com/study-organic-search-drives-51-traffic-social-5-202063

    The above is why I prefer to use SEC and Scrapebox to get incoming links to my cloaked pages, instead of paying $40 and up for a JavaScript redirect with Simplified Search Engine Suite.
     
  6. carlhardy

    carlhardy Junior Member

    Joined:
    Mar 18, 2015
    Messages:
    132
    Likes Received:
    9
    nice concept really good thanks for sharing this...
     
  7. Crazy Monkey

    Crazy Monkey Jr. VIP Jr. VIP

    Joined:
    Aug 4, 2015
    Messages:
    1,986
    Likes Received:
    246
    Gender:
    Male
    Location:
    In Jungle
    Nice buddy, Good work Thanks for sharing your information
     
  8. deuteros

    deuteros Newbie

    Joined:
    Jan 27, 2016
    Messages:
    1
    Likes Received:
    0
    Sorry for the stupid question, but what's the advantage of using this list?

    Why would I want to block all these spiders?
     
  9. carlosn

    carlosn Newbie

    Joined:
    Jan 28, 2011
    Messages:
    44
    Likes Received:
    7
    Two reasons:

    First: Search engines like Google might camouflage themselves under another spider name to crawl your site and check whether you are delivering different content to the bot and to users. So by blocking these known bad crawlers, and by blocking empty referrers and/or empty user agents, you block camouflaged search engines. Of course, search engines could always run a visual audit with a real human, but that seldom happens, or happens only when someone tells on you.

    Second: It keeps your good, hard-to-create content (the real content you show to visitors, not cloaked pages) from being scraped and used on another site.

    Here is code that actually blocks scrapers that don't obey robots.txt. It adds bad bots' IPs to your .htaccess file automatically - MORE INFO BELOW THIS CODE

    Code:
    <?php
    // * Works with PHP 4.* + & 5.*
    
    // * DISCLAIMER: Author of this script is not and will not be held liable and or responsible for any configuration errors, loss of data, interruption of *
    // * service, and or any other means of misuse by neglect pertaining to this script that affects any internet account. User of this script takes sole *
    // * responsibility for any resulting problems that arise due to usage of this script in part or in whole and shall hold indemnity against the author. *
    
    $ipad = $_SERVER['REMOTE_ADDR'];    // Collects the user/visitor IP address.
    $ua = isset($_SERVER['HTTP_USER_AGENT']) ? $_SERVER['HTTP_USER_AGENT'] : '(none)';    // Scrapers often send no user agent at all.
    $ban = "deny from $ipad\n";         // What will be written to the .htaccess file if the IP needs to be banned.
    $file = ".htaccess";                // Point this at a test file first; use the .htaccess in the root directory once thoroughly tested.
    $search = (string) @file_get_contents($file);    // Gather the entire contents of the file (empty string if it cannot be read).
    
    // Check whether a "deny from <IP>" line for this exact IP already exists.
    // Matching the whole line avoids treating 1.2.3.4 as banned just because 1.2.3.45 is listed.
    $check = preg_match('/^\s*deny\s+from\s+' . preg_quote($ipad, '/') . '\s*$/mi', $search);
    
    // If the IP is not listed yet: write the ban line to the .htaccess file and email you a copy.
    // If the IP is already listed: nothing is written. Either way the message below is displayed.
    if (!$check) {
    
        $open = @fopen($file, "a");     // Open the .htaccess file for appending.
        if ($open) {
            flock($open, LOCK_EX);      // Keep simultaneous requests from interleaving their writes.
            fputs($open, $ban);         // Write the banned IP line to the .htaccess file. (Example: deny from 12.34.56.78)
            fclose($open);              // Closing the file also releases the lock.
        }
    
        // Email a copy of ban and info to your admin account (or other email address).
        // Make sure you change the email address.
        @mail('[email protected]','Banned IP at My Site Name '.$ipad,'
    Banned IP: '.$ipad.'
    Request URI: '.$_SERVER['REQUEST_URI'].'
    User Agent: '.$ua);
    }
    
    // Display the error message to the user. (You may change to read what you want).
    echo '<html><head><title>IP Address '.$ipad.' - Blocked or Banned!</title></head><body bgcolor="#FF0000" text="#FFFFFF" oncontextmenu="return false;"><center><font face="Verdana, Arial"><h1>THANK YOU - DON\'T COME AGAIN!</h1><b>IP Address '.$ipad.' Has Been Blocked or Banned!<br />Contact the web admin if this ban is by mistake.<p />Have a nice day!</b></font></center></body></html>';
    
    // End of File/Script;
    exit;
    ?>
    
    -Create a file, let's call it comehere.php, and add the code above.
    -Add a link to comehere.php on your homepage as a hidden link/image that only spiders will follow, not visitors.
    -Add to robots.txt a disallow to that specific page:

    Code:
    User-agent: *
    Disallow: /comehere.php
    
    Known search engines (Google, Yahoo, Bing) won't visit/crawl that page, because they follow rules in robots.txt.
    Scrapers will crawl comehere.php, and be blocked automatically by adding their IP to your main htaccess file.

    Test it yourself by clicking on your comehere.php link that you added to your homepage, and access will be forbidden to your own site. Go to your htaccess file, and delete your IP after testing.
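
    The robots.txt rule can also be checked offline before relying on it. Below is a minimal Python sketch using the standard library's urllib.robotparser, assuming the two-line robots.txt from this post. The helper name is_allowed is hypothetical. It only shows how a crawler that honours robots.txt would read the file and says nothing about crawlers that ignore it.

```python
from urllib.robotparser import RobotFileParser

# The robots.txt content from the post above.
ROBOTS_TXT = """\
User-agent: *
Disallow: /comehere.php
"""


def is_allowed(robots_txt, user_agent, path):
    """Return True if a compliant crawler with this user agent may fetch path."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, path)


print(is_allowed(ROBOTS_TXT, "Googlebot", "/comehere.php"))  # False
print(is_allowed(ROBOTS_TXT, "Googlebot", "/index.html"))    # True
```

    If the first call returned True, compliant search engine crawlers could follow the hidden link and end up banned along with the scrapers.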

    Cheers!
     
  10. Furious Man

    Furious Man Jr. VIP Jr. VIP

    Joined:
    Aug 4, 2015
    Messages:
    1,577
    Likes Received:
    242
    really good one buddy nice share mate keep it up