[Help] How can I protect my site from HTTrack?

Discussion in 'Black Hat SEO Tools' started by andykim, Jul 21, 2017.

  1. andykim

    andykim Newbie

    Joined:
    Apr 28, 2014
    Messages:
    26
    Likes Received:
    0
    Hi everyone!

    I have a problem: many clone sites are using HTTrack to copy my site. Does anyone know how to stop it?

    Thanks.
     
  2. WestLex

    WestLex Newbie

    Joined:
    Jul 21, 2017
    Messages:
    2
    Likes Received:
    0
    Gender:
    Male
    Cloudflare, Incapsula, and similar services will slow down scraper bots.

    A Blocked.com script or something similar may help as well.
     
  3. andykim

    andykim Newbie

    Joined:
    Apr 28, 2014
    Messages:
    26
    Likes Received:
    0
    Thanks bro
     
  4. prot0

    prot0 Registered Member

    Joined:
    Jul 11, 2017
    Messages:
    84
    Likes Received:
    26
    Location:
    localhost
    You can't stop them in any way. Cloudflare and similar services are just CDNs; they might help you hide your server IP, but that won't stop anyone from copying your website.
     
  5. rickydzine

    rickydzine Regular Member

    Joined:
    Jun 6, 2011
    Messages:
    456
    Likes Received:
    235
    You can't stop someone from stealing the graphical representation of your website, since it must be rendered in their browser for them to view it. You don't even need HTTrack: just right-click, view source, and boom, there you have an entire webpage. What you can do is put a lot of sneaky tactics into play, and incorporate a bit of PHP, to make it hard for a novice to rip your site.
     
  6. prot0

    prot0 Registered Member

    Joined:
    Jul 11, 2017
    Messages:
    84
    Likes Received:
    26
    Location:
    localhost
    Also, if you're specifically targeting HTTrack, you can check whether it requests the site's contents using its particular User-Agent and block it, but, as @rickydzine said, it is absolutely not necessary to use HTTrack to copy a website.
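    The User-Agent check described above can be sketched in a couple of .htaccess lines (assuming Apache with mod_rewrite enabled; HTTrack lets the user change its User-Agent string, so this only catches default configurations):

```apache
# Return 403 Forbidden to any request whose User-Agent contains "HTTrack"
RewriteEngine On
RewriteCond %{HTTP_USER_AGENT} HTTrack [NC]
RewriteRule ^ - [F,L]
```

    The [NC] flag makes the match case-insensitive, so "httrack" and "HTTrack" are both caught.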
     
  7. zuix33

    zuix33 Junior Member

    Joined:
    Jan 3, 2016
    Messages:
    194
    Likes Received:
    65
    Gender:
    Male
    With a small modification to your .htaccess file, you can block a lot of website copiers. They can still copy manually if they want, though.
    Add the following code to your .htaccess file:

    RewriteEngine On
    RewriteCond %{HTTP_USER_AGENT} ^BlackWidow [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Bot\ mailto:[email protected] [OR]
    RewriteCond %{HTTP_USER_AGENT} ^ChinaClaw [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Custo [OR]
    RewriteCond %{HTTP_USER_AGENT} ^DISCo [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Download\ Demon [OR]
    RewriteCond %{HTTP_USER_AGENT} ^eCatch [OR]
    RewriteCond %{HTTP_USER_AGENT} ^EirGrabber [OR]
    RewriteCond %{HTTP_USER_AGENT} ^EmailSiphon [OR]
    RewriteCond %{HTTP_USER_AGENT} ^EmailWolf [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Express\ WebPictures [OR]
    RewriteCond %{HTTP_USER_AGENT} ^ExtractorPro [OR]
    RewriteCond %{HTTP_USER_AGENT} ^EyeNetIE [OR]
    RewriteCond %{HTTP_USER_AGENT} ^FlashGet [OR]
    RewriteCond %{HTTP_USER_AGENT} ^GetRight [OR]
    RewriteCond %{HTTP_USER_AGENT} ^GetWeb! [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Go!Zilla [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Go-Ahead-Got-It [OR]
    RewriteCond %{HTTP_USER_AGENT} ^GrabNet [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Grafula [OR]
    RewriteCond %{HTTP_USER_AGENT} ^HMView [OR]
    RewriteCond %{HTTP_USER_AGENT} HTTrack [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^Image\ Stripper [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Image\ Sucker [OR]
    RewriteCond %{HTTP_USER_AGENT} Indy\ Library [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^InterGET [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Internet\ Ninja [OR]
    RewriteCond %{HTTP_USER_AGENT} ^JetCar [OR]
    RewriteCond %{HTTP_USER_AGENT} ^JOC\ Web\ Spider [OR]
    RewriteCond %{HTTP_USER_AGENT} ^larbin [OR]
    RewriteCond %{HTTP_USER_AGENT} ^LeechFTP [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Mass\ Downloader [OR]
    RewriteCond %{HTTP_USER_AGENT} ^MIDown\ tool [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Mister\ PiX [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Navroad [OR]
    RewriteCond %{HTTP_USER_AGENT} ^NearSite [OR]
    RewriteCond %{HTTP_USER_AGENT} ^NetAnts [OR]
    RewriteCond %{HTTP_USER_AGENT} ^NetSpider [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Net\ Vampire [OR]
    RewriteCond %{HTTP_USER_AGENT} ^NetZIP [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Octopus [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Offline\ Explorer [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Offline\ Navigator [OR]
    RewriteCond %{HTTP_USER_AGENT} ^PageGrabber [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Papa\ Foto [OR]
    RewriteCond %{HTTP_USER_AGENT} ^pavuk [OR]
    RewriteCond %{HTTP_USER_AGENT} ^pcBrowser [OR]
    RewriteCond %{HTTP_USER_AGENT} ^RealDownload [OR]
    RewriteCond %{HTTP_USER_AGENT} ^ReGet [OR]
    RewriteCond %{HTTP_USER_AGENT} ^SiteSnagger [OR]
    RewriteCond %{HTTP_USER_AGENT} ^SmartDownload [OR]
    RewriteCond %{HTTP_USER_AGENT} ^SuperBot [OR]
    RewriteCond %{HTTP_USER_AGENT} ^SuperHTTP [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Surfbot [OR]
    RewriteCond %{HTTP_USER_AGENT} ^tAkeOut [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Teleport\ Pro [OR]
    RewriteCond %{HTTP_USER_AGENT} ^VoidEYE [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Web\ Image\ Collector [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Web\ Sucker [OR]
    RewriteCond %{HTTP_USER_AGENT} ^WebAuto [OR]
    RewriteCond %{HTTP_USER_AGENT} ^WebCopier [OR]
    RewriteCond %{HTTP_USER_AGENT} ^WebFetch [OR]
    RewriteCond %{HTTP_USER_AGENT} ^WebGo\ IS [OR]
    RewriteCond %{HTTP_USER_AGENT} ^WebLeacher [OR]
    RewriteCond %{HTTP_USER_AGENT} ^WebReaper [OR]
    RewriteCond %{HTTP_USER_AGENT} ^WebSauger [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Website\ eXtractor [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Website\ Quester [OR]
    RewriteCond %{HTTP_USER_AGENT} ^WebStripper [OR]
    RewriteCond %{HTTP_USER_AGENT} ^WebWhacker [OR]
    RewriteCond %{HTTP_USER_AGENT} ^WebZIP [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Wget [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Widow [OR]
    RewriteCond %{HTTP_USER_AGENT} ^WWWOFFLE [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Xaldon\ WebSpider [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Zeus
    RewriteRule ^.* - [F,L]
     
  8. JasonS

    JasonS Jr. VIP Jr. VIP

    Joined:
    Sep 15, 2012
    Messages:
    3,034
    Likes Received:
    929
    Hire a coder and have him write some URL-detection code: if your website's address isn't in the URL bar, redirect to your website or show a specific message. Also obfuscate the original code. Your site can still be ripped off, but it's strong protection against tool kiddies.
     
  9. andykim

    andykim Newbie

    Joined:
    Apr 28, 2014
    Messages:
    26
    Likes Received:
    0
    Thanks, bro! I wrote a JS script that detects the domain and redirects to my site if it's not my domain.
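    A minimal sketch of that kind of check (the domain example.com and the isClone helper are hypothetical stand-ins for your own names; anyone ripping the page can simply delete this script, so treat it as a speed bump, not real protection):

```javascript
// Hypothetical canonical domain; replace with your own.
var MY_DOMAIN = "example.com";

// Returns true when the page is served from somewhere other than
// the canonical domain or one of its subdomains (i.e. a ripped copy).
function isClone(hostname) {
  return hostname !== MY_DOMAIN && !hostname.endsWith("." + MY_DOMAIN);
}

// In the browser: send visitors of a cloned copy back to the original site.
if (typeof window !== "undefined" && isClone(window.location.hostname)) {
  window.location.replace("https://" + MY_DOMAIN + "/");
}
```

    Obfuscating this script, as suggested above, makes it slightly harder for a novice ripper to find and strip out.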
     
  10. redarrow

    redarrow Elite Member

    Joined:
    Apr 1, 2013
    Messages:
    5,154
    Likes Received:
    1,168
    There's no real way. If it bothers you that much, ask users to make an account to log into your website.

    There's no other way.

    Just make sure all images are watermarked with your website URL.