Web Page Clone

Discussion in 'Black Hat SEO' started by billyh7, Oct 31, 2009.

  1. billyh7

    billyh7 Junior Member

    Joined:
    Dec 18, 2008
    Messages:
    162
    Likes Received:
    2
    I'm not saying I would ever copy somebody's web page, but I'm curious.

    Is there a way to copy a page from another web site, specifically just the content that the user updates? The page gets updated 15-20 times a day, and I'd rather not watch it all day and copy and paste each change into my site. Is there a tool, or some code that could be written, to go out and pull that content into my site?
     
  2. 195471

    195471 Regular Member

    Joined:
    Oct 11, 2008
    Messages:
    417
    Likes Received:
    260
    For this, you'd probably want to use a scraper, which would need to be coded specifically for the site that you want to scrape. You could then develop a tool that automatically scrapes the site at set intervals. Or, if the site has an RSS feed, you could simply add the feed to your reader so that you're notified whenever the site is updated, and then run your scraper on it.
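The "scrape at set intervals" idea could look roughly like the Python sketch below. It is only an illustration, not a finished tool: the function names (`fetch`, `fingerprint`, `has_changed`, `poll`), the one-hour default interval, and the choice to detect updates by hashing the whole page are all assumptions made for the example.

```python
import hashlib
import time
import urllib.request
from typing import Callable, Optional, Tuple


def fetch(url: str, timeout: float = 10.0) -> str:
    """Download a page and return its body as text."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        charset = resp.headers.get_content_charset() or "utf-8"
        return resp.read().decode(charset, errors="replace")


def fingerprint(text: str) -> str:
    """Return a SHA-256 hex digest that changes whenever the text changes."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def has_changed(previous: Optional[str], text: str) -> Tuple[bool, str]:
    """Compare the page against the previous fingerprint (None on first run)."""
    current = fingerprint(text)
    return current != previous, current


def poll(
    url: str,
    on_change: Callable[[str], None],
    interval_seconds: float = 3600.0,   # assumed default: check once an hour
    fetcher: Callable[[str], str] = fetch,
    max_checks: Optional[int] = None,   # None means run until interrupted
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    """Fetch the page repeatedly and call on_change when its content differs."""
    last: Optional[str] = None
    checks = 0
    while max_checks is None or checks < max_checks:
        text = fetcher(url)
        changed, last = has_changed(last, text)
        if changed:
            on_change(text)
        checks += 1
        if max_checks is None or checks < max_checks:
            sleep(interval_seconds)
```

A call such as `poll("http://example.com/page", print, interval_seconds=1800)` would check every half hour (the URL is a placeholder). Hashing the whole page also fires on changes to ads or timestamps, so in practice you would probably hash only the extracted content. It is also worth checking the site's terms of use and robots.txt before pointing anything like this at it.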
     
  3. billyh7

    billyh7 Junior Member

    The site has no RSS feed. Do you know of any good scraper tools?
     
  4. 195471

    195471 Regular Member

    A web scraper needs to be coded specifically for the site that you want to scrape. If you don't code it yourself, prepare to pay a premium for either scraper tools or services.

    Examples:

    Scrapegoat --> From their FAQ page: "We generally don't accept any projects below about $750."

    VelocityScape Web Scraper Plus+ --> "The Most Powerful Web Data Extraction Platform Under $50k... and it's only $749"

    screen-scraper.com --> "Let us do the work for you! Get a free quote/estimate on your project" (Or, "If you have to ask, you can't afford it.")

    Now there is a free tool called XScrape, but you have to be skilled at regular expressions in order to use it. Any free scraper is going to make you do the bulk of the work. Furthermore, if the web site that you want to scrape changes its HTML code, you may need to update your scraper to accommodate the changes.
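To give a feel for the regular-expression work involved, and for why a change in the site's HTML can break a scraper, here is a minimal Python sketch. It does not show how XScrape or any of the tools above work. The markup it expects (`<div class="entry">...</div>`) and the names `ENTRY_PATTERN`, `TAG_PATTERN`, and `extract_entries` are made up for illustration; a real scraper would need a pattern written against the target page's actual HTML.

```python
import re
from html import unescape

# Hypothetical markup: each update is assumed to sit in <div class="entry">...</div>.
ENTRY_PATTERN = re.compile(
    r'<div class="entry">(.*?)</div>', re.DOTALL | re.IGNORECASE
)
# Matches any remaining HTML tag so it can be stripped from the entry text.
TAG_PATTERN = re.compile(r"<[^>]+>")


def extract_entries(html: str) -> list:
    """Return the plain text of every entry block found in the page."""
    entries = []
    for raw in ENTRY_PATTERN.findall(html):
        text = unescape(TAG_PATTERN.sub("", raw))   # drop tags, decode &amp; etc.
        entries.append(" ".join(text.split()))      # collapse whitespace
    return entries


if __name__ == "__main__":
    page = (
        '<div class="entry">First <b>post</b> &amp; more</div>\n'
        '<div class="entry">\n  Second\n</div>'
    )
    print(extract_entries(page))

    # After a redesign the same pattern silently finds nothing.
    redesigned = '<article class="post">First post</article>'
    print(extract_entries(redesigned))
```

The second call returns an empty list rather than raising an error, which is why it helps to have a scraper warn you when it suddenly extracts nothing. The non-greedy pattern also stops at the first `</div>`, so entries containing nested `<div>` tags would be cut short; an HTML parser is usually more robust than regular expressions for that case.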

    If you decide that you want to roll your own, here's a good book to check out:

    Code:
    http://nostarch.com/webbots.htm