[TUT] Quick Method to Scrape/Convert Web 2.0 Properties

Discussion in 'Black Hat SEO' started by Null0x00, Aug 29, 2012.

    This is a quick method to grab a site and convert it to standard HTML/CSS so you can reuse the theme. This is especially good for making standalone sites from Web 2.0 properties.

    It's a Linux one-liner for you techy folks:

    Code:
    wget --recursive --no-clobber --page-requisites --html-extension --convert-links --restrict-file-names=windows --domains <domain> -e robots=off --no-parent <site url>
    
    Make sure to replace <domain> with the actual domain name and <site url> with the full site URL (drop the angle brackets). This will recursively follow internal links without climbing above the starting directory, grab the page resources (CSS, JavaScript, images), append .html to pages saved with other extensions (.php, .aspx, .jsp, etc.), rewrite links so they work locally, and ignore robots.txt in case it's blocking access. This is a beauty if you want to quickly snag a site for reuse.
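
If you run this often, the one-liner can be wrapped in a small shell function so you only pass the URL. This is just a sketch: `mirror_site` and the `DRY_RUN` variable are names made up for this example (they are not part of wget), and it assumes the domain you want to stay on is simply the host part of the URL.

```shell
#!/bin/sh
# Hypothetical helper around the wget one-liner; mirror_site and DRY_RUN
# are illustrative names, not wget features.
mirror_site() {
    url=$1

    # Derive the host from the URL: strip the scheme, then the path,
    # then any :port suffix. Assumes no user:password@ part in the URL.
    domain=${url#*://}
    domain=${domain%%/*}
    domain=${domain%%:*}

    # With DRY_RUN set to a non-empty value, the command is printed
    # instead of executed, so you can check it before hitting the site.
    ${DRY_RUN:+echo} wget --recursive --no-clobber --page-requisites \
        --html-extension --convert-links --restrict-file-names=windows \
        --domains "$domain" -e robots=off --no-parent "$url"
}
```

For instance, `DRY_RUN=1 mirror_site https://example.com/blog/` prints the command it would run, with `--domains example.com` filled in; drop `DRY_RUN=1` to actually start the download.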

    HTH!


    -0x00
     