
How do I scrape these URLs?

Discussion in 'Black Hat SEO' started by Inflow, Feb 6, 2010.

  1. Inflow

    Inflow Newbie

    Joined:
    Dec 20, 2009
    Messages:
    26
    Likes Received:
    42
    Hey, I've been trying to use some scrapers on the following page:
    http://www.imdb.com/genre/comedy

    but every time I try to grab the URLs from it, I get all of the site's other URLs as well. I only want the URLs of the movies listed there.
     
  2. Kid Shaleen

    Kid Shaleen Regular Member

    Joined:
    Oct 29, 2009
    Messages:
    250
    Likes Received:
    63
    Try something like:

    Open View / Page Source (in Firefox).
    Select and copy the part of the source you want.
    Paste it into a text editor that can handle column cuts, such as the free portable PSPad.
    Do a series of search-and-replace passes, e.g.
    search for "href=" and replace it with a line feed, so each link starts on its own line.
    Finally, save the file.
    Then run a unix-style filter (such as grep) on it to keep only the lines with "http" in them.
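
    If you'd rather script it than do the search and replace by hand, the same steps can be sketched in Python using only the standard library. This is a rough sketch, not a tested scraper: it assumes you have already saved the page source to a file, and it assumes the movie links follow a /title/tt<digits>/ pattern, so check the actual source and adjust the pattern if needed. The names LinkCollector and extract_movie_urls are made up for this example.

```python
import re
from html.parser import HTMLParser
from urllib.parse import urljoin, urlsplit


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag in the page source."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def extract_movie_urls(html,
                       base_url="http://www.imdb.com/genre/comedy",
                       path_pattern=r"^/title/tt\d+/?$"):
    """Return unique absolute URLs whose path matches path_pattern.

    path_pattern is an assumption about how movie links look;
    change it to match what you see in the page source.
    """
    collector = LinkCollector()
    collector.feed(html)

    seen = set()
    urls = []
    for href in collector.hrefs:
        # Turn relative links such as "/title/tt.../" into full URLs.
        parts = urlsplit(urljoin(base_url, href))
        if not re.search(path_pattern, parts.path):
            continue  # skip navigation, help pages, and other site links
        # Drop any query string or fragment so duplicates collapse.
        clean = f"{parts.scheme}://{parts.netloc}{parts.path}"
        if clean not in seen:
            seen.add(clean)
            urls.append(clean)
    return urls
```

    To use it, read the saved source (for example open("comedy.html", encoding="utf-8").read(), where comedy.html is whatever you named the file) and pass the text to extract_movie_urls, then print or save the returned list.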
     