Get Robots.txt - STOP BAD BOTS FROM CRAWLING YOUR WEB PAGES

Discussion in 'Web Design' started by ddederick, Aug 14, 2013.

  1. ddederick

    ddederick Jr. VIP Jr. VIP Premium Member

    Joined:
    Nov 22, 2012
    Messages:
    584
    Likes Received:
    432
    Location:
    The Source Code
    Hi, just found this. It can be useful so I'm sharing.

    Upload this as robots.txt to your root folder (or add its contents to your existing robots.txt). It will stop bad bots from crawling your web pages.

    Cheers

    Download http://www.sendspace.com/file/9gi8wo

    P.S. It's a .txt file, so I think a virus scan is not necessary :D
     
  2. moshapollo

    moshapollo Newbie

    Joined:
    Jul 18, 2013
    Messages:
    4
    Likes Received:
    0
    Thanks! robots.txt is an important file for Google. I use it for every blog I have.
     
  3. neutralhatter

    neutralhatter Jr. VIP Jr. VIP Premium Member

    Joined:
    Jun 23, 2010
    Messages:
    430
    Likes Received:
    330
    Dude, robots.txt won't stop spiders or bots unless they respect it.
     
    • Thanks x 2
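    The point above, that robots.txt only works for bots that choose to respect it, can be sketched with Python's standard `urllib.robotparser`. The rules and the bot names ("BadBot", "GoodBot") are made up for illustration and are not taken from the file shared in this thread. The check runs inside the crawler itself, so a bot that never calls it is not stopped by anything.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt rules; "BadBot" is a made-up user-agent name.
ROBOTS_TXT = """\
User-agent: BadBot
Disallow: /

User-agent: *
Disallow: /admin/
"""


def polite_can_fetch(user_agent: str, url: str) -> bool:
    """Return what a well-behaved crawler would conclude from ROBOTS_TXT.

    This is a voluntary check made by the crawler. The web server does not
    enforce it, so a bot that skips this step can still request any URL.
    """
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent in ("BadBot", "GoodBot"):
        for url in ("http://example.com/page.html", "http://example.com/admin/"):
            print(agent, url, polite_can_fetch(agent, url))
```

    A compliant "BadBot" would skip the whole site and any other compliant bot would skip /admin/, but only because each one decides to honor the file.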
  4. Vic Sage

    Vic Sage Jr. VIP Jr. VIP

    Joined:
    Sep 5, 2010
    Messages:
    1,716
    Likes Received:
    2,110
    Gender:
    Male
    Occupation:
    Scientist Performing Marketing Experiments
    Better do it via .htaccess.
     
    • Thanks x 1
  5. seeplusplus

    seeplusplus Power Member

    Joined:
    Aug 18, 2008
    Messages:
    511
    Likes Received:
    163
    Yep, they don't even look to see if you have one, or maybe some do, to see where you DON'T want them to go, such as an admin directory or something.
     
  6. innozemec

    innozemec Jr. VIP Jr. VIP

    Joined:
    Aug 19, 2011
    Messages:
    5,290
    Likes Received:
    1,799
    Location:
    www.Indexification.com
  7. gullsinn

    gullsinn Jr. VIP Jr. VIP Premium Member

    Joined:
    Dec 24, 2009
    Messages:
    2,429
    Likes Received:
    2,210
    Gender:
    Male
    Occupation:
    Jobless :D
    Location:
    Graveyard
    No mate, that's not true!
     
  8. innozemec

    innozemec Jr. VIP Jr. VIP

    Joined:
    Aug 19, 2011
    Messages:
    5,290
    Likes Received:
    1,799
    Location:
    www.Indexification.com
    Yep, mate, it is true :)

    robots.txt tells Google not to index the pages, but doesn't stop it from crawling them, and the OP is sharing the file as a "stop crawling" measure.
     
  9. natorob

    natorob Junior Member

    Joined:
    Jul 7, 2011
    Messages:
    189
    Likes Received:
    63
    Occupation:
    Job?!?!?
    Location:
    Denver CO
    I saw a WP plugin that actually does this better than a robots.txt file. I haven't bought it yet, but it's on my list. I believe the name of the plugin is SpyderSpanker (or similar). It lets you block the BS crawlers like Yandex, etc. from crawling your site. It may be the solution you are looking for, but I can't vouch for it since I haven't bought or used it yet.
     
  10. TZ2011

    TZ2011 Senior Member

    Joined:
    Jun 26, 2011
    Messages:
    832
    Likes Received:
    864
    Occupation:
    Cleaning servers
    You don't need to pay for such basic stuff. Just doing a Google search for .htaccess protection, blocking bots, and even hack protection (5G firewall list) will do a better job than any plugin.
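    The idea behind the .htaccess suggestions in this thread is that the server refuses the request itself instead of asking the bot to stay away. The sketch below shows that idea as a small Python WSGI middleware, not as actual .htaccess rules. The blocklist entries and the helper names (`block_bad_bots`, `hello_app`, `status_for`) are hypothetical. Matching on the User-Agent header only catches bots that identify themselves honestly, since the header is easy to fake.

```python
from wsgiref.util import setup_testing_defaults

# Hypothetical blocklist of lowercase User-Agent substrings.
BLOCKED_AGENT_SUBSTRINGS = ("badbot", "evilscraper")


def block_bad_bots(app):
    """Wrap a WSGI app so requests from listed user agents get a 403."""
    def middleware(environ, start_response):
        agent = environ.get("HTTP_USER_AGENT", "").lower()
        if any(name in agent for name in BLOCKED_AGENT_SUBSTRINGS):
            start_response("403 Forbidden", [("Content-Type", "text/plain")])
            return [b"Forbidden\n"]
        return app(environ, start_response)
    return middleware


def hello_app(environ, start_response):
    """Stand-in for the real site."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"Hello\n"]


def status_for(user_agent: str) -> str:
    """Send one fake request through the middleware and return its status."""
    environ = {}
    setup_testing_defaults(environ)
    environ["HTTP_USER_AGENT"] = user_agent
    captured = []

    def start_response(status, headers):
        captured.append(status)

    block_bad_bots(hello_app)(environ, start_response)
    return captured[0]


if __name__ == "__main__":
    print(status_for("Mozilla/5.0 (compatible; BadBot/1.0)"))
    print(status_for("Mozilla/5.0 Firefox"))
```

    Unlike robots.txt, the refusal here happens on the server, so it does not depend on the bot cooperating.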
     
  11. ddederick

    ddederick Jr. VIP Jr. VIP Premium Member

    Joined:
    Nov 22, 2012
    Messages:
    584
    Likes Received:
    432
    Location:
    The Source Code
    Of course it can't stop all bots. But it will stop all the bots you can find in this txt file. I guess it can help a bit at least :) Never mind
     
  12. WizGizmo

    WizGizmo Super Moderator Staff Member Premium Member

    Joined:
    Mar 28, 2008
    Messages:
    3,845
    Likes Received:
    55,442
    Since this is not an Internet Marketing product,
    I have moved this thread to the Web Design section.

    "Wiz"
     
  13. ShabbySquire

    ShabbySquire Power Member

    Joined:
    Nov 30, 2011
    Messages:
    574
    Likes Received:
    122
    Location:
    UK
    I use password-protected directories in cPanel. Unless SE crawlers have the capacity to brute-force a password, nothing is shown.
     
  14. sm754

    sm754 Registered Member

    Joined:
    Mar 21, 2012
    Messages:
    93
    Likes Received:
    38
    Occupation:
    Farmer
    Location:
    Azerbaijan
    If you want to be a real hindrance to bots, there's software out there that will simulate a tree of pages with hundreds of fake e-mail addresses; hitting one of those things can really mess up a spammer database.
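    A rough sketch of that kind of trap, assuming nothing about how any real product does it: each trap URL produces a page of fake addresses plus links to more trap pages, so a harvester that follows the links keeps collecting junk. The function names and the `/trap` path are hypothetical. The addresses use the reserved `.invalid` top-level domain, so they can never belong to a real person.

```python
import hashlib

FAKE_DOMAIN = "example.invalid"  # reserved TLD, so no address is ever real


def _token(seed: str, index: int) -> str:
    """Derive a stable pseudo-random token from a path and an index."""
    return hashlib.sha256(f"{seed}:{index}".encode()).hexdigest()[:10]


def fake_page(path: str, emails: int = 5, links: int = 3) -> str:
    """Build the HTML for one trap page.

    The same path always yields the same page, so nothing has to be stored,
    and every page links to further trap pages beneath it.
    """
    base = path.rstrip("/")
    addresses = [f"{_token(path, i)}@{FAKE_DOMAIN}" for i in range(emails)]
    children = [f"{base}/{_token(path, emails + j)}" for j in range(links)]

    lines = ["<html><body>"]
    lines += [f'<a href="mailto:{a}">{a}</a><br>' for a in addresses]
    lines += [f'<a href="{c}">more</a><br>' for c in children]
    lines.append("</body></html>")
    return "\n".join(lines)


if __name__ == "__main__":
    print(fake_page("/trap"))
```

    Raising `emails` into the hundreds gives the effect described above. Keeping the trap path disallowed in robots.txt should keep well-behaved crawlers out of it.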