
Complete Spiders, Crawlers, And Bots Control Script - Redirect Where You Want

Discussion in 'Black Hat SEO' started by hellohellosharp, Jul 28, 2011.

  1. hellohellosharp

    hellohellosharp Power Member

    Joined:
    Dec 8, 2010
    Messages:
    625
    Likes Received:
    552
    Occupation:
    CEO @ CLEANFILES LLC
    Location:
    USA
    Home Page:
    The purpose of this script is to redirect all bots, crawlers, and spiders to one location while redirecting all users to a different location. The bots were pulled from a user-agent list, and I also added some of my own.

    How Can You Use This?

    This can be used several different ways. Check out showboytridin's Method To Get One-Way Backlinks. This script works basically the same way except it includes almost every bot out there.

    Why Did I Make It

    The idea actually came to me when using AMR. I like to point my links to a PHP file so I can control the link juice however and whenever I want (just edit the PHP file). The thought also came to me that people use AMR for link juice, but what about the actual views? What if I could redirect users to an affiliate link while keeping my link juice?

    So I made an AMR submission that pointed to a PHP file and logged all the visitors (the bots that crawl the link). I put them in a list and then added more bots from a database.

    Then I implemented this into a link file and there we have it, a complete redirect script.


    How to use

    Make a new directory on your site. Let's say it's called "recommend" (so http://yourdomain.com/recommend)

    Make three files in the directory.
    • link.php <---this can really be called anything you want
    • bot_detection.php <---must be named that exactly
    • log.txt <---also must be named that exactly

    Here is the content of the files:

    bot_detection.php
    PHP:
    <?php

    $bot_list = array("Teoma", "alexa", "froogle", "Gigabot", "inktomi",
    "looksmart", "URL_Spider_SQL", "Firefly", "NationalDirectory",
    "Ask Jeeves", "TECNOSEEK", "InfoSeek", "WebFindBot", "girafabot",
    "crawler", "www.galaxy.com", "Googlebot", "Scooter", "Slurp",
    "msnbot", "appie", "FAST", "WebBug", "Spade", "ZyBorg", "rabaz",
    "Baiduspider", "Feedfetcher-Google", "TechnoratiSnoop", "Rankivabot",
    "Mediapartners-Google", "Sogou web spider", "WordPress", "WordPress/3.1;", "WordPress/3.1", "WebAlta Crawler",
    "ABCdatos BotLink", "Acme.Spider", "Ahoy! The Homepage Finder", "Alkaline", "Anthill", "Walhello appie", "Arachnophilia", "Arale", "Araneo", "AraybOt", "ArchitextSpider", "Aretha", "ARIADNE", "arks", "AskJeeves",
    "ATN Worldwide", "Atomz.com Search Robot", "AURESYS", "BackRub", "Bay Spider", "Big Brother", "Bjaaland", "BlackWidow", "Die Blinde Kuh", "Bloodhound", "Borg-Bot", "BoxSeaBot", "bright.net caching robot", "BSpider",
    "CACTVS Chemistry Spider", "Calif", "Cassandra", "Digimarc Marcspider/CGI", "Checkbot", "ChristCrawler.com", "churl", "cIeNcIaFiCcIoN.nEt", "CMC/0.01", "Collective", "Combine System", "Conceptbot", "ConfuzzledBot", "CoolBot", "Web Core / Roots", "XYLEME Robot", "Internet Cruiser Robot", "Cusco", "CyberSpyder Link Test", "CydralSpider", "Desert Realm Spider", "DeWeb(c) Katalog/Index", "DienstSpider", "Digger", "Digital Integrity Robot", "Direct Hit Grabber", "DNAbot", "DownLoad Express", "DragonBot", "DWCP (Dridus' Web Cataloging Project)", "e-collector", "EbiNess", "EIT Link Verifier Robot", "ELFINBOT", "Emacs-w3 Search Engine", "ananzi", "esculapio", "Esther", "Evliya Celebi", "FastCrawler", "Fluid Dynamics Search Engine robot", "Felix IDE", "Wild Ferret Web Hopper #1, #2, #3", "FetchRover", "fido", "Hämähäkki", "KIT-Fireball", "Fish search", "Fouineur", "Robot Francoroute", "Freecrawl", "FunnelWeb", "gammaSpider, FocusedCrawler", "gazz", "GCreep", "GetBot", "GetURL", "Golem", "Googlebot", "Grapnel/0.01 Experiment", "Griffon", "Gromit", "Northern Light Gulliver", "Gulper Bot", "HamBot", "Harvest", "havIndex", "HI (HTML Index) Search", "Hometown Spider Pro", "ht://Dig", "HTMLgobble", "Hyper-Decontextualizer", "iajaBot", "IBM_Planetwide", "Popular Iconoclast", "Ingrid", "Imagelock", "IncyWincy", "Informant", "InfoSeek Robot 1.0", "Infoseek Sidewinder", "InfoSpiders", "Inspector Web", "IntelliAgent", "I, Robot", "Iron33", "Israeli-search", "JavaBee", "JBot Java Web Robot", "JCrawler", "Jeeves", "JoBo Java Web Robot", "Jobot", "JoeBot", "The Jubii Indexing Robot", "JumpStation", "image.kapsi.net", "Katipo", "KDD-Explorer", "Kilroy", "KO_Yappo_Robot", "LabelGrabber", "larbin", "legs", "Link Validator", "LinkScan", "LinkWalker", "Lockon", "logo.gif Crawler", "Lycos", "Mac WWWWorm", "Magpie", "marvin/infoseek", "Mattie", "MediaFox", "MerzScope", "NEC-MeshExplorer", "MindCrawler", "mnoGoSearch search engine software", "moget", "MOMspider", "Monster", "Motor", "MSNBot", "Muncher", "Muninn", "Muscat Ferret", "Mwd.Search", "Internet Shinchakubin", "NDSpider", "Nederland.zoek", "NetCarta WebMap Engine", "NetMechanic", "NetScoop", "newscan-online", "NHSE Web Forager", "Nomad", "The NorthStar Robot", "nzexplorer", "ObjectsSearch", "Occam", "HKU WWW Octopus", "OntoSpider", "Openfind data gatherer", "Orb Search", "Pack Rat", "PageBoy", "ParaSite", "Patric", "pegasus", "The Peregrinator", "PerlCrawler 1.0", "Phantom", "PhpDig", "PiltdownMan", "Pimptrain.com's robot", "Pioneer", "html_analyzer", "Portal Juice Spider", "PGP Key Agent", "PlumtreeWebAccessor", "Poppi", "PortalB Spider", "psbot", "GetterroboPlus Puu", "The Python Robot", "Raven Search", "RBSE Spider", "Resume Robot", "RoadHouse Crawling System", "RixBot", "Road Runner: The ImageScape Robot", "Robbie the Robot", "ComputingSite Robi/1.0", "RoboCrawl Spider", "RoboFox", "Robozilla", "Roverbot", "RuLeS", "SafetyNet Robot", "Scooter", "Sleek", "Search.Aus-AU.COM", "SearchProcess", "Senrigan", "SG-Scout", "ShagSeeker", "Shai'Hulud", "Sift", "Simmany Robot Ver1.0", "Site Valet", "Open Text Index Robot", "SiteTech-Rover", "Skymob.com", "SLCrawler", "Inktomi Slurp", "Smart Spider", "Snooper", "Solbot", "Spanner", "Speedy Spider", "spider_monkey", "SpiderBot", "Spiderline Crawler", "SpiderMan", "SpiderView(tm)", "Spry Wizard Robot", "Site Searcher", "Suke", "suntek search engine", "Sven", "Sygol", "TACH Black Widow", "Tarantula", "tarspider", "Tcl W3 Robot", "TechBOT", "Templeton", "TeomaTechnologies", "TITAN", "TitIn", "The TkWWW Robot", "TLSpider", "UCSD Crawl", "UdmSearch", "UptimeBot", "URL Check", "URL Spider Pro", "Valkyrie", "Verticrawl", "Victoria", "vision-search", "void-bot", "Voyager", "VWbot", "The NWI Robot", "W3M2", "WallPaper (alias crawlpaper)", "the World Wide Web Wanderer", "[email protected] by wap4.com", "WebBandit Web Spider", "WebCatcher", "WebCopy", "webfetcher", "The Webfoot Robot", "Webinator", "weblayers", "WebLinker", "WebMirror", "The Web Moose", "WebQuest", "Digimarc MarcSpider", "WebReaper", "webs", "Websnarf", "WebSpider", "WebVac", "webwalk", "WebWalker", "WebWatch", "Wget", "whatUseek Winona", "WhoWhere Robot", "Wired Digital", "Weblog Monitor", "w3mir", "WebStolperer", "The Web Wombat", "The World Wide Web Worm", "WWWC Ver 0.2.5", "WebZinger", "XGET", "Jakarta Commons-HttpClient/3.0", "Moreoverbot/5.1");

    // Note there are a few repeats in the list above. You can find and delete them if you deem it necessary.

    function detect_bot() {
        global $bot_list;
        // ereg() is deprecated; a plain substring check does the same job here
        // without treating the bot names as regular expressions.
        foreach ($bot_list as $bot) {
            if (strpos($_SERVER['HTTP_USER_AGENT'], $bot) !== false) {
                return $bot; // return the first matching bot name
            }
        }
        return false; // no bot matched
    }
    ?>
    link.php

    PHP:
    <?php
    $my_link  = "http://affiliate_link.com"; // Edit this link to where the users will be redirected.
    $bot_link = "http://my_site.com";        // Edit this link to where the bots will be redirected.

    include("bot_detection.php");

    // Start log session. If you don't want to log, remove this section.
    $myFile  = "log.txt";
    $theDate = date('l jS \of F Y h:i:s A');
    $fh = fopen($myFile, 'a') or die("can't open file");
    if (detect_bot())
        $stringData = $theDate . " " . $_SERVER['HTTP_USER_AGENT'] . " <<<This BOT was redirected to " . $bot_link . "\n\n";
    else
        $stringData = $theDate . " " . $_SERVER['HTTP_USER_AGENT'] . " <<<This USER was redirected to " . $my_link . "\n\n";
    fwrite($fh, $stringData);
    fclose($fh);
    // End log session. If you don't want to log, remove this section.

    if (detect_bot()) {
        header("HTTP/1.1 301 Moved Permanently");
        header("Location: $bot_link");
        exit;
    } else {
        header("Location: $my_link");
        exit;
    }
    ?>
    And the log.txt you just leave blank. Then just edit the link.php for your links. The log.txt simply keeps track of your visitors (with date and time).

    Then just link to the link.php file in your articles and such. (http://yourdomain.com/recommend/link.php)

    In theory, 301 redirects pass the same link juice as just direct linking to the website. But there has always been an argument over whether that is true or not.

    The only other con is that obviously you can't spin where the links point to. I guess you could make several link.php files (link1, link2, etc.) and then spin which one is pointed to. The choice is yours!
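    If you'd rather keep a single link.php, one alternative (a sketch, not from the original post; the URLs are placeholders) is to rotate the destination inside the file itself:

    ```php
    <?php
    // Hypothetical rotator: keep one link.php and pick a destination at random
    // on each visit instead of maintaining link1.php, link2.php, etc.
    $my_links = array(
        "http://affiliate_link1.com", // placeholder URLs -- swap in your own
        "http://affiliate_link2.com",
        "http://affiliate_link3.com",
    );

    // array_rand() returns a random key; use it to pick this visit's destination,
    // then redirect with header("Location: $my_link") as in link.php.
    $my_link = $my_links[array_rand($my_links)];
    ```
    
    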

    Have fun with this; it will be interesting to see how people use it. I am open to any questions.​
     
    • Thanks Thanks x 8
    Last edited: Jul 29, 2011
  2. fire_man

    fire_man Newbie

    Joined:
    Jul 28, 2011
    Messages:
    6
    Likes Received:
    0
    I don't understand what this does at all or how it can make me money...

    can anyone explain please?
     
  3. hellohellosharp

    hellohellosharp Power Member

    Joined:
    Dec 8, 2010
    Messages:
    625
    Likes Received:
    552
    Occupation:
    CEO @ CLEANFILES LLC
    Location:
    USA
    Home Page:
    The script is all about control over your links. There are several applications for it.

    1. First, like I said, it can be used to direct users to affiliate links while still keeping your link juice.
    2. Second, let's say you have all your articles pointed to an affiliate link through the PHP script, and then the vendor for the product shuts down. Instead of changing the link in each article one by one, you can just change the link in the script.
    3. Let's say you were using showboytridin's method, but the "reciprocal link" wasn't actually on that page. You could redirect the users to some page with a link to them floating out in cyberspace; it doesn't even have to be yours.

    These are just a few ideas; it can be used several other ways.
     
    • Thanks Thanks x 1
  4. RightInTwo

    RightInTwo Power Member

    Joined:
    Feb 23, 2010
    Messages:
    744
    Likes Received:
    381
    Home Page:
    Wow, I have a few ideas for this script. Thanks for the great share!
     
  5. hellohellosharp

    hellohellosharp Power Member

    Joined:
    Dec 8, 2010
    Messages:
    625
    Likes Received:
    552
    Occupation:
    CEO @ CLEANFILES LLC
    Location:
    USA
    Home Page:
    • Thanks Thanks x 1
  6. neu009

    neu009 Jr. VIP Jr. VIP Premium Member

    Joined:
    Jul 29, 2009
    Messages:
    1,052
    Likes Received:
    282
    Well, it's kind of like cloaking your site. But Google often doesn't use a bot agent, or fakes one, so how would you keep that up...
    For affiliate links I already track the links and conversions. You could just change the link there and rotate offers.
    So basically it's about the 301 for the bots; how do you think that will benefit? You try to show the bots a backlink instead of the aff link... but again there's the problem mentioned above with unknown or fake agents.
     
  7. hellohellosharp

    hellohellosharp Power Member

    Joined:
    Dec 8, 2010
    Messages:
    625
    Likes Received:
    552
    Occupation:
    CEO @ CLEANFILES LLC
    Location:
    USA
    Home Page:
    The log.txt isn't really for tracking conversions, it's for finding new bots. It isn't very hard to take a real quick look at the log and find if a bot is being treated as a user, and then add that bot to bot_detection.php
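    Skimming the log by eye gets tedious as it grows. A small helper (a sketch; it assumes the exact log line format that link.php above writes) can pull out just the hits that were treated as users, so anything bot-looking stands out:

    ```php
    <?php
    // Sketch: collect the distinct "date + user-agent" prefixes of log lines
    // that were treated as USERs, so new/unlisted bots are easy to spot.
    function user_agents_from_log($lines) {
        $agents = array();
        foreach ($lines as $line) {
            if (strpos($line, "<<<This USER was redirected") !== false) {
                // Everything before the "<<<" marker is "date user-agent".
                $agents[] = trim(substr($line, 0, strpos($line, "<<<")));
            }
        }
        return array_unique($agents);
    }

    // Usage: print each distinct non-bot visitor once.
    // foreach (user_agents_from_log(file("log.txt")) as $a) { echo $a . "\n"; }
    ```
    
    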

    And I am pretty sure Google has always used a user-agent? If not, they would have to manually check each site with a normal browser, which is out of the question. If they used a fake or different one, as I said before, just add the fake to bot_detection.php.
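    For what it's worth, user-agent matching alone can't catch someone faking the Googlebot string. A known countermeasure (not part of the script in this thread) is a reverse-DNS check: genuine Googlebot IPs reverse-resolve to a googlebot.com or google.com hostname, and that hostname forward-resolves back to the same IP. A sketch:

    ```php
    <?php
    // Pure string check, split out so it can be tested without DNS:
    // does the resolved hostname end in .googlebot.com or .google.com?
    function host_looks_like_googlebot($hostname) {
        return (bool) preg_match('/\.(googlebot|google)\.com$/i', $hostname);
    }

    // Full verification (needs network access): reverse lookup, check the
    // domain, then forward-confirm the hostname maps back to the same IP.
    function verify_googlebot($ip) {
        $host = gethostbyaddr($ip); // reverse DNS lookup
        if (!$host || !host_looks_like_googlebot($host)) {
            return false;
        }
        return gethostbyname($host) === $ip; // forward-confirm
    }
    ```
    
    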
     
    • Thanks Thanks x 1
  8. muxmkt

    muxmkt Power Member

    Joined:
    May 24, 2011
    Messages:
    543
    Likes Received:
    132
    How do I add Twitter bots?
    I'm a bit confused about the usage of this, but I need to add those Twitter bots :d
     
  9. neu009

    neu009 Jr. VIP Jr. VIP Premium Member

    Joined:
    Jul 29, 2009
    Messages:
    1,052
    Likes Received:
    282
    Well, I'm not sure how they do it, but considering you can put in any agent you want, you can cycle through all of them. And I should have been more precise: I meant that they do not always use Googlebot as the agent but disguise it.

    Good idea though.
     
  10. hellohellosharp

    hellohellosharp Power Member

    Joined:
    Dec 8, 2010
    Messages:
    625
    Likes Received:
    552
    Occupation:
    CEO @ CLEANFILES LLC
    Location:
    USA
    Home Page:
    Hmmm I have never worked with Twitter before so I am unsure what exactly you mean....

    But if you mean Twitter crawl bots then just add a link to the PHP file and see what the log.txt outputs.

    You can use my live example. Do a Twitter post to http://discountcoup.net/recommend/link.php/ then wait for the Twitter bot to crawl it, then go to http://discountcoup.net/recommend/log.txt and see what its output was.

    I will then add the twitter bot to the bot_detection.php
     
  11. muxmkt

    muxmkt Power Member

    Joined:
    May 24, 2011
    Messages:
    543
    Likes Received:
    132
    I'll try this, thanks!

    edit: worked

    The amount of bots that crawl Twitter links is ridiculous, hah. BTW, the clbbz one is my YOURLS shortener so it isn't a bot :p

    I had 12 bots hitting it in like 20 seconds.
     
    Last edited: Jul 30, 2011
  12. Tertsim

    Tertsim Newbie

    Joined:
    Dec 11, 2009
    Messages:
    35
    Likes Received:
    2
    Cloaking gets you penalized... even more easily these days.
     
  13. hellohellosharp

    hellohellosharp Power Member

    Joined:
    Dec 8, 2010
    Messages:
    625
    Likes Received:
    552
    Occupation:
    CEO @ CLEANFILES LLC
    Location:
    USA
    Home Page:
    Alright, I won't add clbbz to the list but the only other one I saw was "TweetmemeBot" which I am adding to the list. Any other ones look like bots to you guys?
     
  14. hellohellosharp

    hellohellosharp Power Member

    Joined:
    Dec 8, 2010
    Messages:
    625
    Likes Received:
    552
    Occupation:
    CEO @ CLEANFILES LLC
    Location:
    USA
    Home Page:
    EDIT: New Bot Detection script with TweetmemeBot. Thanks to muxmkt for finding.
    2nd EDIT: Just saw other bots muxmkt was talking about. Added all those as well. Added: "TweetmemeBot", "PostRank/", "PostRank/2.0", "PostRank/2", "Twitterbot/1.0", "Twitterbot/", "Twitterbot", "JS-Kit URL Resolver", "JS-Kit"
    PHP:
    <?php

    $bot_list = array("Teoma", "alexa", "froogle", "Gigabot", "inktomi",
    "looksmart", "URL_Spider_SQL", "Firefly", "NationalDirectory",
    "Ask Jeeves", "TECNOSEEK", "InfoSeek", "WebFindBot", "girafabot",
    "crawler", "www.galaxy.com", "Googlebot", "Scooter", "Slurp",
    "msnbot", "appie", "FAST", "WebBug", "Spade", "ZyBorg", "rabaz",
    "Baiduspider", "Feedfetcher-Google", "TechnoratiSnoop", "Rankivabot",
    "Mediapartners-Google", "Sogou web spider", "WordPress", "WordPress/3.1;", "WordPress/3.1", "WebAlta Crawler",
    "ABCdatos BotLink", "Acme.Spider", "Ahoy! The Homepage Finder", "Alkaline", "Anthill", "Walhello appie", "Arachnophilia", "Arale", "Araneo", "AraybOt", "ArchitextSpider", "Aretha", "ARIADNE", "arks", "AskJeeves",
    "ATN Worldwide", "Atomz.com Search Robot", "AURESYS", "BackRub", "Bay Spider", "Big Brother", "Bjaaland", "BlackWidow", "Die Blinde Kuh", "Bloodhound", "Borg-Bot", "BoxSeaBot", "bright.net caching robot", "BSpider",
    "CACTVS Chemistry Spider", "Calif", "Cassandra", "Digimarc Marcspider/CGI", "Checkbot", "ChristCrawler.com", "churl", "cIeNcIaFiCcIoN.nEt", "CMC/0.01", "Collective", "Combine System", "Conceptbot", "ConfuzzledBot", "CoolBot", "Web Core / Roots", "XYLEME Robot", "Internet Cruiser Robot", "Cusco", "CyberSpyder Link Test", "CydralSpider", "Desert Realm Spider", "DeWeb(c) Katalog/Index", "DienstSpider", "Digger", "Digital Integrity Robot", "Direct Hit Grabber", "DNAbot", "DownLoad Express", "DragonBot", "DWCP (Dridus' Web Cataloging Project)", "e-collector", "EbiNess", "EIT Link Verifier Robot", "ELFINBOT", "Emacs-w3 Search Engine", "ananzi", "esculapio", "Esther", "Evliya Celebi", "FastCrawler", "Fluid Dynamics Search Engine robot", "Felix IDE", "Wild Ferret Web Hopper #1, #2, #3", "FetchRover", "fido", "Hämähäkki", "KIT-Fireball", "Fish search", "Fouineur", "Robot Francoroute", "Freecrawl", "FunnelWeb", "gammaSpider, FocusedCrawler", "gazz", "GCreep", "GetBot", "GetURL", "Golem", "Googlebot", "Grapnel/0.01 Experiment", "Griffon", "Gromit", "Northern Light Gulliver", "Gulper Bot", "HamBot", "Harvest", "havIndex", "HI (HTML Index) Search", "Hometown Spider Pro", "ht://Dig", "HTMLgobble", "Hyper-Decontextualizer", "iajaBot", "IBM_Planetwide", "Popular Iconoclast", "Ingrid", "Imagelock", "IncyWincy", "Informant", "InfoSeek Robot 1.0", "Infoseek Sidewinder", "InfoSpiders", "Inspector Web", "IntelliAgent", "I, Robot", "Iron33", "Israeli-search", "JavaBee", "JBot Java Web Robot", "JCrawler", "Jeeves", "JoBo Java Web Robot", "Jobot", "JoeBot", "The Jubii Indexing Robot", "JumpStation", "image.kapsi.net", "Katipo",
    "KDD-Explorer", "Kilroy", "KO_Yappo_Robot", "LabelGrabber", "larbin", "legs", "Link Validator", "LinkScan", "LinkWalker", "Lockon", "logo.gif Crawler", "Lycos", "Mac WWWWorm", "Magpie", "marvin/infoseek", "Mattie", "MediaFox", "MerzScope", "NEC-MeshExplorer", "MindCrawler", "mnoGoSearch search engine software", "moget", "MOMspider", "Monster", "Motor", "MSNBot", "Muncher", "Muninn", "Muscat Ferret", "Mwd.Search", "Internet Shinchakubin", "NDSpider", "Nederland.zoek", "NetCarta WebMap Engine", "NetMechanic", "NetScoop", "newscan-online", "NHSE Web Forager", "Nomad", "The NorthStar Robot", "nzexplorer", "ObjectsSearch", "Occam", "HKU WWW Octopus", "OntoSpider", "Openfind data gatherer", "Orb Search", "Pack Rat", "PageBoy", "ParaSite", "Patric", "pegasus", "The Peregrinator", "PerlCrawler 1.0", "Phantom", "PhpDig", "PiltdownMan", "Pimptrain.com's robot", "Pioneer", "html_analyzer", "Portal Juice Spider", "PGP Key Agent", "PlumtreeWebAccessor", "Poppi", "PortalB Spider", "psbot", "GetterroboPlus Puu", "The Python Robot", "Raven Search", "RBSE Spider", "Resume Robot", "RoadHouse Crawling System", "RixBot", "Road Runner: The ImageScape Robot", "Robbie the Robot", "ComputingSite Robi/1.0", "RoboCrawl Spider", "RoboFox", "Robozilla", "Roverbot", "RuLeS", "SafetyNet Robot", "Scooter", "Sleek", "Search.Aus-AU.COM", "SearchProcess", "Senrigan", "SG-Scout", "ShagSeeker", "Shai'Hulud", "Sift", "Simmany Robot Ver1.0", "Site Valet", "Open Text Index Robot", "SiteTech-Rover", "Skymob.com", "SLCrawler", "Inktomi Slurp", "Smart Spider", "Snooper", "Solbot", "Spanner", "Speedy Spider", "spider_monkey", "SpiderBot", "Spiderline Crawler", "SpiderMan", "SpiderView(tm)", "Spry Wizard Robot", "Site Searcher", "Suke", "suntek search engine", "Sven", "Sygol", "TACH Black Widow", "Tarantula", "tarspider", "Tcl W3 Robot", "TechBOT", "Templeton", "TeomaTechnologies", "TITAN", "TitIn", "The TkWWW Robot", "TLSpider", "UCSD Crawl", "UdmSearch", "UptimeBot", "URL Check", "URL Spider Pro", "Valkyrie", "Verticrawl", "Victoria", "vision-search", "void-bot", "Voyager", "VWbot", "The NWI Robot", "W3M2", "WallPaper (alias crawlpaper)", "the World Wide Web Wanderer", "[email protected] by wap4.com", "WebBandit Web Spider", "WebCatcher", "WebCopy", "webfetcher", "The Webfoot Robot", "Webinator", "weblayers", "WebLinker", "WebMirror", "The Web Moose", "WebQuest", "Digimarc MarcSpider", "WebReaper", "webs", "Websnarf", "WebSpider", "WebVac", "webwalk", "WebWalker", "WebWatch", "Wget", "whatUseek Winona", "WhoWhere Robot", "Wired Digital", "Weblog Monitor", "w3mir", "WebStolperer", "The Web Wombat", "The World Wide Web Worm", "WWWC Ver 0.2.5", "WebZinger", "XGET", "TweetmemeBot", "PostRank/", "PostRank/2.0", "PostRank/2", "Twitterbot/1.0", "Twitterbot/", "Twitterbot", "JS-Kit URL Resolver", "JS-Kit");

    // Note there are a few repeats in the list above. You can find and delete them if you deem it necessary.

    function detect_bot() {
        global $bot_list;
        // ereg() is deprecated; a plain substring check does the same job here
        // without treating the bot names as regular expressions.
        foreach ($bot_list as $bot) {
            if (strpos($_SERVER['HTTP_USER_AGENT'], $bot) !== false) {
                return $bot; // return the first matching bot name
            }
        }
        return false; // no bot matched
    }
    ?>
    Keep the list growing!
     
    Last edited: Jul 30, 2011
  15. hellohellosharp

    hellohellosharp Power Member

    Joined:
    Dec 8, 2010
    Messages:
    625
    Likes Received:
    552
    Occupation:
    CEO @ CLEANFILES LLC
    Location:
    USA
    Home Page:
    Added Birubot/1.0 And Birubot. Also uploaded php files so you can download if that's easier.

    PHP:
    <?php

    $bot_list = array("Teoma", "alexa", "froogle", "Gigabot", "inktomi",
    "looksmart", "URL_Spider_SQL", "Firefly", "NationalDirectory",
    "Ask Jeeves", "TECNOSEEK", "InfoSeek", "WebFindBot", "girafabot",
    "crawler", "www.galaxy.com", "Googlebot", "Scooter", "Slurp",
    "msnbot", "appie", "FAST", "WebBug", "Spade", "ZyBorg", "rabaz",
    "Baiduspider", "Feedfetcher-Google", "TechnoratiSnoop", "Rankivabot",
    "Mediapartners-Google", "Sogou web spider", "WordPress", "WordPress/3.1;", "WordPress/3.1", "WebAlta Crawler",
    "ABCdatos BotLink", "Acme.Spider", "Ahoy! The Homepage Finder", "Alkaline", "Anthill", "Walhello appie", "Arachnophilia", "Arale", "Araneo", "AraybOt", "ArchitextSpider", "Aretha", "ARIADNE", "arks", "AskJeeves",
    "ATN Worldwide", "Atomz.com Search Robot", "AURESYS", "BackRub", "Bay Spider", "Big Brother", "Bjaaland", "BlackWidow", "Die Blinde Kuh", "Bloodhound", "Borg-Bot", "BoxSeaBot", "bright.net caching robot", "BSpider",
    "CACTVS Chemistry Spider", "Calif", "Cassandra", "Digimarc Marcspider/CGI", "Checkbot", "ChristCrawler.com", "churl", "cIeNcIaFiCcIoN.nEt", "CMC/0.01", "Collective", "Combine System", "Conceptbot", "ConfuzzledBot", "CoolBot", "Web Core / Roots", "XYLEME Robot", "Internet Cruiser Robot", "Cusco", "CyberSpyder Link Test", "CydralSpider", "Desert Realm Spider", "DeWeb(c) Katalog/Index", "DienstSpider", "Digger", "Digital Integrity Robot", "Direct Hit Grabber", "DNAbot", "DownLoad Express", "DragonBot", "DWCP (Dridus' Web Cataloging Project)", "e-collector", "EbiNess", "EIT Link Verifier Robot", "ELFINBOT", "Emacs-w3 Search Engine", "ananzi", "esculapio", "Esther", "Evliya Celebi", "FastCrawler", "Fluid Dynamics Search Engine robot", "Felix IDE", "Wild Ferret Web Hopper #1, #2, #3", "FetchRover", "fido", "Hämähäkki", "KIT-Fireball", "Fish search", "Fouineur", "Robot Francoroute", "Freecrawl", "FunnelWeb", "gammaSpider, FocusedCrawler", "gazz", "GCreep", "GetBot", "GetURL", "Golem", "Googlebot", "Grapnel/0.01 Experiment", "Griffon", "Gromit", "Northern Light Gulliver", "Gulper Bot", "HamBot", "Harvest", "havIndex", "HI (HTML Index) Search", "Hometown Spider Pro", "ht://Dig", "HTMLgobble", "Hyper-Decontextualizer", "iajaBot", "IBM_Planetwide", "Popular Iconoclast", "Ingrid", "Imagelock", "IncyWincy", "Informant", "InfoSeek Robot 1.0", "Infoseek Sidewinder", "InfoSpiders", "Inspector Web", "IntelliAgent", "I, Robot", "Iron33", "Israeli-search", "JavaBee", "JBot Java Web Robot", "JCrawler", "Jeeves", "JoBo Java Web Robot", "Jobot", "JoeBot", "The Jubii Indexing Robot", "JumpStation", "image.kapsi.net", "Katipo",
    "KDD-Explorer", "Kilroy", "KO_Yappo_Robot", "LabelGrabber", "larbin", "legs", "Link Validator", "LinkScan", "LinkWalker", "Lockon", "logo.gif Crawler", "Lycos", "Mac WWWWorm", "Magpie", "marvin/infoseek", "Mattie", "MediaFox", "MerzScope", "NEC-MeshExplorer", "MindCrawler", "mnoGoSearch search engine software", "moget", "MOMspider", "Monster", "Motor", "MSNBot", "Muncher", "Muninn", "Muscat Ferret", "Mwd.Search", "Internet Shinchakubin", "NDSpider", "Nederland.zoek", "NetCarta WebMap Engine", "NetMechanic", "NetScoop", "newscan-online", "NHSE Web Forager", "Nomad", "The NorthStar Robot", "nzexplorer", "ObjectsSearch", "Occam", "HKU WWW Octopus", "OntoSpider", "Openfind data gatherer", "Orb Search", "Pack Rat", "PageBoy", "ParaSite", "Patric", "pegasus", "The Peregrinator", "PerlCrawler 1.0", "Phantom", "PhpDig", "PiltdownMan", "Pimptrain.com's robot", "Pioneer", "html_analyzer", "Portal Juice Spider", "PGP Key Agent", "PlumtreeWebAccessor", "Poppi", "PortalB Spider", "psbot", "GetterroboPlus Puu", "The Python Robot", "Raven Search", "RBSE Spider", "Resume Robot", "RoadHouse Crawling System", "RixBot", "Road Runner: The ImageScape Robot", "Robbie the Robot", "ComputingSite Robi/1.0", "RoboCrawl Spider", "RoboFox", "Robozilla", "Roverbot", "RuLeS", "SafetyNet Robot", "Scooter", "Sleek", "Search.Aus-AU.COM", "SearchProcess", "Senrigan", "SG-Scout", "ShagSeeker", "Shai'Hulud", "Sift", "Simmany Robot Ver1.0", "Site Valet", "Open Text Index Robot", "SiteTech-Rover", "Skymob.com", "SLCrawler", "Inktomi Slurp", "Smart Spider", "Snooper", "Solbot", "Spanner", "Speedy Spider", "spider_monkey", "SpiderBot", "Spiderline Crawler", "SpiderMan", "SpiderView(tm)", "Spry Wizard Robot", "Site Searcher", "Suke", "suntek search engine", "Sven", "Sygol", "TACH Black Widow", "Tarantula", "tarspider", "Tcl W3 Robot", "TechBOT", "Templeton", "TeomaTechnologies", "TITAN", "TitIn", "The TkWWW Robot", "TLSpider", "UCSD Crawl", "UdmSearch", "UptimeBot", "URL Check", "URL Spider Pro", "Valkyrie", "Verticrawl", "Victoria", "vision-search", "void-bot", "Voyager", "VWbot", "The NWI Robot", "W3M2", "WallPaper (alias crawlpaper)", "the World Wide Web Wanderer", "[email protected] by wap4.com", "WebBandit Web Spider", "WebCatcher", "WebCopy", "webfetcher", "The Webfoot Robot", "Webinator", "weblayers", "WebLinker", "WebMirror", "The Web Moose", "WebQuest", "Digimarc MarcSpider", "WebReaper", "webs", "Websnarf", "WebSpider", "WebVac", "webwalk", "WebWalker", "WebWatch", "Wget", "whatUseek Winona", "WhoWhere Robot", "Wired Digital", "Weblog Monitor", "w3mir", "WebStolperer", "The Web Wombat", "The World Wide Web Worm", "WWWC Ver 0.2.5", "WebZinger", "XGET", "TweetmemeBot", "PostRank/", "PostRank/2.0", "PostRank/2", "Twitterbot/1.0", "Twitterbot/", "Twitterbot", "JS-Kit URL Resolver", "JS-Kit", "Birubot/1.0", "Birubot/");

    // Note there are a few repeats in the list above. You can find and delete them if you deem it necessary.

    function detect_bot() {
        global $bot_list;
        // ereg() is deprecated; a plain substring check does the same job here
        // without treating the bot names as regular expressions.
        foreach ($bot_list as $bot) {
            if (strpos($_SERVER['HTTP_USER_AGENT'], $bot) !== false) {
                return $bot; // return the first matching bot name
            }
        }
        return false; // no bot matched
    }
    ?>
     

    Attached Files:

    • Thanks Thanks x 1
  16. richardblock

    richardblock BANNED BANNED

    Joined:
    Dec 29, 2010
    Messages:
    113
    Likes Received:
    255
    These are some really nice scripts! I'm surprised you are being so generous, releasing a lot of them to the BHW community.

    I'm gonna give them a test run and see if they work well for one of my projects. Thanks + rep given
     
    • Thanks Thanks x 1
  17. hellohellosharp

    hellohellosharp Power Member

    Joined:
    Dec 8, 2010
    Messages:
    625
    Likes Received:
    552
    Occupation:
    CEO @ CLEANFILES LLC
    Location:
    USA
    Home Page:
    Thanks man:) Always appreciated
     
  18. fire_man

    fire_man Newbie

    Joined:
    Jul 28, 2011
    Messages:
    6
    Likes Received:
    0
    What does cloaking mean? Is this script cloaking?
     
  19. Dr_Scythe

    Dr_Scythe Regular Member

    Joined:
    Jul 4, 2011
    Messages:
    281
    Likes Received:
    210
    Location:
    Australia
    Awesome share :)

    Gonna test this out on my next project
     
  20. hellohellosharp

    hellohellosharp Power Member

    Joined:
    Dec 8, 2010
    Messages:
    625
    Likes Received:
    552
    Occupation:
    CEO @ CLEANFILES LLC
    Location:
    USA
    Home Page:
    This script can be used for cloaking, although it isn't limited to it.

    There are two different types of cloaking: link cloaking and page cloaking.

    Link Cloaking: Some people think that affiliate and referral links hurt their rankings. Instead of linking directly to the affiliate, they link to a redirect page.

    Page Cloaking: This is where an entire blog looks like a normal website to Google and the other search engines, but in reality redirects users to an affiliate page as soon as they click through.

    To use my script as a link cloaker, just use it how I outlined above. Link the search engines to a normal website (just do your homepage) and the users to the affiliate link.

    To use my script as a page cloaker, just add the following to the top of your PHP files:

    PHP:
    <?php
    $bot_link = "http://my_site.com"; // This time it will be where USERS are redirected.

    include("bot_detection.php");

    // This now checks if the user agent is NOT a bot, so all users are redirected.
    // Bots are ignored and just crawl your site like normal!
    if (!detect_bot()) {
        header("Location: $bot_link");
        exit;
    }

    And then just remember to upload bot_detection.php to all the directories that have PHP files you edited.
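    Alternatively (a sketch, not from the original post; it assumes bot_detection.php sits in the web root), you can keep one shared copy and include it by absolute path from any directory:

    ```php
    <?php
    // Assumption: bot_detection.php lives in the web root. Building the include
    // path from DOCUMENT_ROOT lets every cloaked page share one copy instead of
    // uploading the file to each directory.
    $shared = $_SERVER['DOCUMENT_ROOT'] . "/bot_detection.php";
    if (file_exists($shared)) {
        include $shared;
    }
    ```
    
    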

    Thanks man!
     
    • Thanks Thanks x 1