How to redirect humans and let bots crawl with .htaccess?

Discussion in 'Cloaking and Content Generators' started by cheesecake, Feb 2, 2009.

  1. cheesecake

    cheesecake Regular Member

    Joined:
    Jan 12, 2009
    Messages:
    270
    Likes Received:
    229
    Does anyone know how to let only spiders crawl my site and send every "non-spider" visitor somewhere else?

    I think the best way to do it is with an IP list: everybody not on that list gets redirected.

    I think the best place to do it is .htaccess; I just don't know the correct syntax.

    Any pointers?
     
  2. cooooookies

    cooooookies Senior Member

    Joined:
    Oct 6, 2008
    Messages:
    1,008
    Likes Received:
    216
    Yep, all right so far.

    Get an IP list (free: search the forum; commercial: I use the fantomaster bot list).

    I have a small PHP script which decides whether to redirect or not. My bot IPs are in a database and updated daily.

    <?php if (!$isBot) { header('Location: ...'); exit; } // $isBot comes from the bot IP lookup
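
    A minimal sketch of the "is this IP in my bot list" check described above, assuming the list is a plain array of IPv4 addresses and CIDR ranges rather than a database table. The names ip_in_list and redirect_target, the example.com URL, and the 192.0.2.x / 198.51.100.x ranges are placeholders (documentation addresses, not real crawler IPs), and the bit arithmetic assumes 64-bit PHP 7.1 or later.

```php
<?php
// Hypothetical helper: true if $ip matches any entry in $cidrs.
// Entries are IPv4 addresses or CIDR ranges; a bare address counts as a /32.
function ip_in_list(string $ip, array $cidrs): bool
{
    $ipLong = ip2long($ip);
    if ($ipLong === false) {
        return false; // not a valid IPv4 address
    }
    foreach ($cidrs as $cidr) {
        [$net, $bits] = array_pad(explode('/', $cidr, 2), 2, '32');
        $netLong = ip2long($net);
        if ($netLong === false || !ctype_digit($bits) || (int) $bits > 32) {
            continue; // skip malformed entries instead of matching them
        }
        $bits = (int) $bits;
        // Network mask with the top $bits bits set (assumes 64-bit integers).
        $mask = $bits === 0 ? 0 : (~0 << (32 - $bits)) & 0xFFFFFFFF;
        if (($ipLong & $mask) === ($netLong & $mask)) {
            return true;
        }
    }
    return false;
}

// Hypothetical helper: null means "serve the page", a string means "redirect here".
function redirect_target(string $ip, array $botRanges, string $humanUrl): ?string
{
    return ip_in_list($ip, $botRanges) ? null : $humanUrl;
}

// Usage at the very top of a page, before any output:
// $botRanges = ['192.0.2.0/24', '198.51.100.7']; // placeholder list
// $target = redirect_target($_SERVER['REMOTE_ADDR'], $botRanges, 'https://example.com/');
// if ($target !== null) { header('Location: ' . $target, true, 302); exit; }
```

    Keeping the decision in a function that only returns a URL or null makes it easy to check the list logic without sending real headers.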
     
  3. cheesecake

    cheesecake Regular Member

    Thanks, cooooookies.

    The only problem I see is that people can search for your website on Google and check the cached copy, which is obviously what you let Google see, so humans can see it as well.

    Unless there is some fancy way I don't know about to cloak the cache as well :)
     
  4. richman

    richman BANNED BANNED

    Joined:
    Jan 8, 2009
    Messages:
    321
    Likes Received:
    308
    How many Googlebot IPs are there in total?
     
  5. cheesecake

    cheesecake Regular Member

    How many Google IPs do I have?
     
  6. Entrepreneur

    Entrepreneur Regular Member

    Joined:
    Oct 12, 2007
    Messages:
    438
    Likes Received:
    379
    You need to do two things: be prepared to be booted out of Google for cloaking, and use the noarchive meta tag to ensure the page isn't cached but is still spidered.

    You can get the IP lists from iplists.com, and I'm sure I have this script in some form in one of my PHP books. I'll have a look.
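
    For the noarchive part, a small sketch of the two usual ways to send the directive from PHP: as a robots meta tag in the page head, or as an X-Robots-Tag response header. The function names noarchive_meta and noarchive_header are made up for illustration; whether a given search engine honours the directive is up to that engine.

```php
<?php
// Hypothetical helper: the robots meta tag asking search engines not to keep a cached copy.
function noarchive_meta(): string
{
    return '<meta name="robots" content="noarchive">';
}

// Hypothetical helper: the same directive expressed as an HTTP response header line.
function noarchive_header(): string
{
    return 'X-Robots-Tag: noarchive';
}

// Usage inside the page's <head>:
//   echo noarchive_meta();
// Or, before any output is sent:
//   header(noarchive_header());
```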
     
  7. Entrepreneur

    Entrepreneur Regular Member

    Although I say that, I actually commented without your exact question in mind! :)

    The script I was thinking of shows all pages to spiders, but if a visitor who clicks through from Google is not a member, they get redirected to a login page.
     
  8. cooooookies

    cooooookies Senior Member

    Exactly. Use the noarchive tag. And think twice about which list you want to use.

    If you are in the testing phase, try whateveriplist.com; later I would really recommend paying for the fantomaster list, which I strongly assume is the best one. I recently bought and filled 100 domains and would not like to lose them in a matter of minutes.