
How to edit robots.txt & .htaccess to prevent robots from crawling/indexing images on your site

Discussion in 'Black Hat SEO' started by workplay, Jan 30, 2012.

  1. workplay

    workplay Junior Member

    Joined:
    Dec 2, 2011
    Messages:
    122
    Likes Received:
    27
    Hey guys,

    I'm trying to use non-copyrighted images on my site, but as an extra precaution I want to edit my robots.txt and .htaccess files to prevent bots from indexing my images in search engines and so forth.

    Anyone know how to edit the files to prevent the images showing up in search engines?
     
  2. workplay

    workplay Junior Member

    Anyone?
     
  3. kokoloko75

    kokoloko75 Elite Member

    Joined:
    Jan 1, 2011
    Messages:
    1,628
    Likes Received:
    1,936
    Occupation:
    Design director
    Location:
    Paris (France)
    robots.txt:
    Code:
    # Applies to every crawler that honors robots.txt
    User-agent: *
    Disallow: /private_images/
    
    # Explicit group for Google's image crawler
    User-agent: Googlebot-Image
    Disallow: /private_images/
    Beny
     
  4. workplay

    workplay Junior Member

    Does that mean the images have to be in a folder called "private_images"?

    Also, is there no need to edit .htaccess?
     
  5. kokoloko75

    kokoloko75 Elite Member

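    In my example, no files or images in the "private_images" folder will be crawled by search engines or Google Images, which normally keeps them out of the index (provided the crawler obeys robots.txt). The folder name is only an example; use the path where your images actually live.
    The .htaccess file is more for denying access to a file or folder outright (to both search engines and users).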
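
    One way to sanity-check rules like the ones above before uploading them is Python's standard-library urllib.robotparser. This is only a rough sketch: the example.com URLs and the "ExampleImageBot" user agent are made up for illustration, and it only shows how a crawler that honors robots.txt would read the rules, not what any particular search engine actually does.

```python
from urllib.robotparser import RobotFileParser

# The rules from the example, pasted in as a string so nothing is fetched
# over the network.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private_images/

User-agent: Googlebot-Image
Disallow: /private_images/
"""


def is_allowed(user_agent: str, url: str) -> bool:
    """Return True if a well-behaved crawler may fetch `url` under ROBOTS_TXT."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # "ExampleImageBot" is a hypothetical crawler name, not a real one.
    agents = ("Googlebot-Image", "ExampleImageBot")
    urls = (
        "https://example.com/private_images/photo.jpg",
        "https://example.com/public/logo.png",
    )
    for agent in agents:
        for url in urls:
            print(agent, url, is_allowed(agent, url))
```

    Because the wildcard group applies to any agent that has no group of its own, the made-up bot is kept out of /private_images/ in the same way as Googlebot-Image, while paths outside that folder stay crawlable.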

    Beny
     
  6. workplay

    workplay Junior Member

    User-Agent: *
    Disallow: /private_images/

    So, would that also block crawlers from other image search engines (e.g. TinEye)?
     
  7. Quazpolter

    Quazpolter Junior Member

    Joined:
    Dec 29, 2011
    Messages:
    114
    Likes Received:
    23
    I'm doing something like this.

    Isn't
    User-Agent: *
    Disallow: /private_images/

    enough on its own?
     
  8. kokoloko75

    kokoloko75 Elite Member

    Yes, and you can find more information about the robots.txt structure here:
    Code:
    http://en.wikipedia.org/wiki/Robots_exclusion_standard
    Beny
     
  9. orlandolongwood

    orlandolongwood Junior Member

    Joined:
    Aug 16, 2009
    Messages:
    137
    Likes Received:
    85
    Occupation:
    failed novelist
    Location:
    Austin, TX
    Where are you getting "copyright-free" images? Outside of some (not all) images from the US Government, these are extremely rare. Even Creative Commons images are not "copyright-free"; in many cases they are just fee-free. Some "free" images from royalty-free, fee-free sites still require attribution.

    Copyscape does not respect robots.txt. Neither does Getty's private crawler.
     
  10. workplay

    workplay Junior Member

    I meant fee-free with an author attribution requirement. I guess there's no easy way out: either pay up or give an OBL (outbound link) to the author.