Legit news scraping sources

Discussion in 'Blogging' started by jimboj, Nov 24, 2008.

  1. jimboj

    jimboj Newbie

    Nov 24, 2008
    First post here so hello to everyone :)

    So I'm planning to build a bunch of autoblogs soon and I'm trying to keep it as whitehat as possible.

    What I mean is: I want to scrape only feed excerpts, not full posts; I want to link back with a "read the rest here..." link; and, this is the big one, I want to scrape only from sites that actually allow syndication.

    So, let's say I build a blog on "cellulite", what I want to do is scrape news topics on everything cellulite related but with the conditions listed above.
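    A minimal sketch of those three conditions, assuming the feed is RSS 2.0 and declares its license through the Creative Commons RSS module (many feeds don't, and a license tag is only a hint, so the site's terms still need checking by hand). The sample feed, the example.com URLs, and the names feed_license, make_excerpt, and build_posts are all made up for illustration.

```python
import html
import re
import xml.etree.ElementTree as ET

# Namespace of the Creative Commons RSS module (assumes the feed uses it).
CC_NS = "{http://backend.userland.com/creativeCommonsRssModule}"

# Hypothetical sample feed, used only to show the idea.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0" xmlns:creativeCommons="http://backend.userland.com/creativeCommonsRssModule">
  <channel>
    <title>Example Health News</title>
    <link>http://example.com/</link>
    <creativeCommons:license>http://creativecommons.org/licenses/by/3.0/</creativeCommons:license>
    <item>
      <title>New cellulite study published</title>
      <link>http://example.com/cellulite-study</link>
      <description>&lt;p&gt;Researchers looked at several treatments over twelve weeks and compared the results.&lt;/p&gt;</description>
    </item>
  </channel>
</rss>"""


def feed_license(channel):
    """Return the license URL declared on the channel, or None if absent."""
    node = channel.find(CC_NS + "license")
    return node.text.strip() if node is not None and node.text else None


def make_excerpt(raw_html, max_words=10):
    """Strip tags, unescape entities, and keep only the first max_words words."""
    text = re.sub(r"<[^>]+>", " ", raw_html)
    words = html.unescape(text).split()
    excerpt = " ".join(words[:max_words])
    return excerpt + ("..." if len(words) > max_words else "")


def build_posts(feed_xml, allowed_licenses, max_words=10):
    """Return excerpt posts with a link back, or [] if no allowed license is declared."""
    channel = ET.fromstring(feed_xml).find("channel")
    if channel is None or feed_license(channel) not in allowed_licenses:
        return []
    posts = []
    for item in channel.findall("item"):
        link = item.findtext("link", default="").strip()
        posts.append({
            "title": item.findtext("title", default="").strip(),
            "body": make_excerpt(item.findtext("description", default=""), max_words),
            "read_more": '<a href="%s">Read the rest here...</a>'
                         % html.escape(link, quote=True),
        })
    return posts


if __name__ == "__main__":
    allowed = {"http://creativecommons.org/licenses/by/3.0/"}
    for post in build_posts(SAMPLE_FEED, allowed):
        print(post["title"], "|", post["body"], "|", post["read_more"])
```

    Feeds with no declared license are skipped rather than guessed at, which fits the "don't want to have to remove content in the future" goal.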

    Anyone know what I mean here?

    Where would I find feeds that won't mind me syndicating their content?

    The best I've found so far is Google News, but they don't allow syndication on third-party sites, and I don't want to have to remove content in the future.

  2. tekg0r

    tekg0r Junior Member

    Nov 9, 2008
    You want to parse feeds and strip anything that matches keyword filters?

    If this is the case, you're in for some custom coding. Though if someone has a tool to do this, it would be of great interest to many of us, I'm sure.
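    For what it's worth, the keyword-filter part can be sketched with nothing but the Python standard library. This assumes a plain RSS 2.0 feed and reads "filter" as "keep items that hit an include keyword and drop items that hit an exclude keyword"; the sample feed and the names matches_any and filter_items are hypothetical.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical sample feed, used only to show the idea.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example Health News</title>
    <item>
      <title>Cellulite cream review</title>
      <description>A look at a new cellulite cream.</description>
    </item>
    <item>
      <title>Marathon training tips</title>
      <description>How to prepare for race day.</description>
    </item>
    <item>
      <title>Cellulite miracle cure, buy now</title>
      <description>Sponsored post.</description>
    </item>
  </channel>
</rss>"""


def matches_any(text, keywords):
    """True if any keyword appears in text as a whole word, ignoring case."""
    return any(
        re.search(r"\b%s\b" % re.escape(kw), text, re.IGNORECASE)
        for kw in keywords
    )


def filter_items(feed_xml, include, exclude=()):
    """Return titles of items matching an include keyword and no exclude keyword."""
    kept = []
    for item in ET.fromstring(feed_xml).iter("item"):
        title = item.findtext("title", default="")
        text = title + " " + item.findtext("description", default="")
        if matches_any(text, include) and not matches_any(text, exclude):
            kept.append(title)
    return kept


if __name__ == "__main__":
    print(filter_items(SAMPLE_FEED, include=["cellulite"], exclude=["sponsored"]))
```

    Whole-word matching keeps a short keyword from firing inside a longer word; real feeds would need fetching and error handling on top of this.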
  3. bl4ck1ce

    bl4ck1ce Regular Member

    Oct 28, 2008
    Web Design & Marketing
    British Columbia, Canada
    Armand Morin talked about this in his IME course, but he didn't give many details about where to find syndicatable (is that a word?) feeds, so I emailed him to ask. It's been about 9 months with no response so far. :(

    I too would like to know where I could scrape news feeds from without having to worry that I'll get caught. So far, all of my autoblogs use PLR content, not RSS.