Not good!

Discussion in 'Web Design' started by Alexiiz, May 21, 2010.

  1. Alexiiz

    Alexiiz Newbie

    Joined:
    Feb 10, 2010
    Messages:
    24
    Likes Received:
    1
    How do I make robots.txt reflect data stored in an SQL database? On my site, each person has an "inbox" identified by a randomly generated 6-character combination of letters and digits, so the URL looks something like domain.com/lol234. Recently Google crawled some of these links, and they show up when I list all crawled pages. THIS IS BAD!! All the info is stored in a MySQL database, so how do I stop Google from crawling these pages? Please help, thanks.
     
  2. voyevoda

    voyevoda Regular Member Premium Member

    Joined:
    Mar 21, 2010
    Messages:
    217
    Likes Received:
    97
    Location:
    Eastern Front
    Two options off the top of my head:

    1) Set up a rewrite rule that maps requests for /robots.txt to a PHP script (e.g., robots.php).

    Have that PHP script connect to your database and run a query to get all the unique inbox names (e.g., "SELECT DISTINCT mailbox_name FROM mailboxes;").

    Iterate over the result set and output a Disallow line for each record (e.g., "Disallow: /lol234").

    2) Add this code

    Code:
    <meta name="robots" content="noindex, nofollow">
    to your mailbox page template, inside the <head> element.

    I prefer the second option because you don't have to publish every inbox name in a well-known, publicly readable file. :)
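For anyone who still wants to try option 1, here is a rough sketch of the idea. It is written in Python rather than PHP, and it uses an in-memory SQLite database as a stand-in for the real MySQL one, so treat it as an illustration only. The `mailboxes` table and `mailbox_name` column come from the example query above; the function names `render_robots_txt` and `demo_connection` are made up for this sketch.

```python
import sqlite3


def render_robots_txt(conn):
    """Build a robots.txt body with one Disallow line per unique inbox name."""
    rows = conn.execute(
        "SELECT DISTINCT mailbox_name FROM mailboxes ORDER BY mailbox_name"
    ).fetchall()
    lines = ["User-agent: *"]
    lines.extend(f"Disallow: /{name}" for (name,) in rows)
    return "\n".join(lines) + "\n"


def demo_connection():
    """Hypothetical in-memory SQLite database standing in for the MySQL one."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE mailboxes (mailbox_name TEXT)")
    conn.executemany(
        "INSERT INTO mailboxes VALUES (?)",
        [("lol234",), ("abc123",), ("lol234",)],  # duplicate on purpose
    )
    return conn


if __name__ == "__main__":
    print(render_robots_txt(demo_connection()), end="")
```

The generated file is readable by anyone who requests it, which is exactly the drawback mentioned above. Also keep in mind that a crawler blocked by robots.txt does not fetch the page, so it would never see the meta tag from option 2 on those URLs.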
     
  3. macpaulos

    macpaulos Regular Member

    Joined:
    Oct 14, 2009
    Messages:
    295
    Likes Received:
    53
    Do all search engines honor the meta tag you suggested (the second option)?
     
  4. voyevoda

    voyevoda Regular Member Premium Member

    The major ones do. I'm sure some crawlers don't, but those don't honour robots.txt either. :)
     
  5. Alexiiz

    Alexiiz Newbie

    Thanks, I've added it now. I'll see if it changes anything. :)
     
  6. ShiftySituation

    ShiftySituation Power Member

    Joined:
    Apr 15, 2010
    Messages:
    621
    Likes Received:
    315
    Occupation:
    Having fun
    Location:
    Jacksonville, FL
    Just fix your web site, imo. Don't you require a credentials or session check when accessing an inbox? If you did, the spiders wouldn't be able to access it at all.
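To make the session check idea concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the session is modeled as a plain dict with hypothetical `authenticated` and `inbox_id` keys, and `can_view_inbox` and `inbox_response` are made-up names, not part of any framework.

```python
def can_view_inbox(session, inbox_id):
    """Return True only if the session is logged in and owns inbox_id."""
    # A missing or empty session means an anonymous visitor (or a crawler).
    if not session or not session.get("authenticated"):
        return False
    return session.get("inbox_id") == inbox_id


def inbox_response(session, inbox_id):
    """Return an (HTTP status, body) pair for a request to /<inbox_id>."""
    if not can_view_inbox(session, inbox_id):
        return 403, "Forbidden"
    return 200, f"Inbox {inbox_id}"


if __name__ == "__main__":
    # A crawler arrives with no session at all.
    print(inbox_response(None, "lol234"))
    # The inbox owner arrives with a valid session.
    print(inbox_response({"authenticated": True, "inbox_id": "lol234"}, "lol234"))
```

With a check like this in front of every inbox page, a spider only ever gets the 403 response, so there is nothing private for it to index even if it discovers the URL.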