How to get full content with RSS feeds

Discussion in 'Blogging' started by back2black, Feb 2, 2009.

  1. back2black

    back2black Junior Member

    Joined:
    Dec 20, 2008
    Messages:
    115
    Likes Received:
    28
    Hi Guys,

    Does anyone know how to get the full content when you create WordPress posts from RSS feeds?

    I'm using AutoBlogged at the moment. I've tried changing the tags (i.e. %post%, %description%, %content%, etc.), but the feed itself doesn't have the full content. Is there a workaround for this? (FYI, my RSS is from bbc.c0.uk.)

    Cheers all!
     
  2. 195471

    195471 Regular Member

    Joined:
    Oct 11, 2008
    Messages:
    417
    Likes Received:
    260
    If you're pulling data from another site's RSS feed, you're limited by what their feed provides. If it doesn't give the full article, then you can't force it to do so.
     
  3. almir

    almir Power Member

    Joined:
    Jul 11, 2008
    Messages:
    728
    Likes Received:
    229
    Actually, I once saw a script that pulled the whole content. It was online for testing at the time, but if I remember correctly, the author took it down and retired it.
     
  4. 195471

    195471 Regular Member

    Joined:
    Oct 11, 2008
    Messages:
    417
    Likes Received:
    260
    If it was pulling full content, then it was probably using the RSS feed to get the URL of the article so that it could scrape the content from there. If the feed's creator chooses not to display the full content (by structuring the RSS feed, which is an XML file, in such a way that only part of the article is given), then you won't be able to retrieve this information from the XML file itself.
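
    To see which situation a given feed is in, one option is to inspect each item for a full-content element and note the link you would have to scrape otherwise. The following is only a minimal sketch: the sample feed and the example.com URLs are made up, and it assumes an RSS 2.0 feed that, when it does carry full text, uses the `content:encoded` element.

```python
import xml.etree.ElementTree as ET

# Namespace of the RSS "content" module, which defines <content:encoded>.
CONTENT_NS = "{http://purl.org/rss/1.0/modules/content/}"

# Made-up sample feed: one truncated item, one item carrying the full article.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0" xmlns:content="http://purl.org/rss/1.0/modules/content/">
  <channel>
    <title>Example feed</title>
    <item>
      <title>Truncated story</title>
      <link>http://example.com/story-1</link>
      <description>First sentence only...</description>
    </item>
    <item>
      <title>Full story</title>
      <link>http://example.com/story-2</link>
      <description>Short summary.</description>
      <content:encoded><![CDATA[<p>The whole article text.</p>]]></content:encoded>
    </item>
  </channel>
</rss>"""


def inspect_feed(xml_text):
    """Return (link, has_full_content) for every <item> in an RSS 2.0 feed."""
    root = ET.fromstring(xml_text)
    results = []
    for item in root.iter("item"):
        link = item.findtext("link", default="").strip()
        full = item.find(CONTENT_NS + "encoded")
        has_full = full is not None and bool((full.text or "").strip())
        results.append((link, has_full))
    return results


report = inspect_feed(SAMPLE_FEED)
for link, has_full in report:
    print(link, "full content in feed" if has_full else "needs scraping")
```

    Items flagged as needing scraping are the ones where the only route to the full text is the article URL itself.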
     
  5. AKOEAJA

    AKOEAJA Newbie

    Joined:
    Mar 12, 2008
    Messages:
    34
    Likes Received:
    48
    Location:
    Jakarta, Indonesia
    Try dapper.net: use it to turn their pages into an RSS feed, then feed that to your blog.

    Cheers
     
    • Thanks x 3
  6. blackhaze

    blackhaze Power Member

    Joined:
    Jan 11, 2008
    Messages:
    661
    Likes Received:
    167
    Occupation:
    self made millionaire
    Location:
    in the matrix
    Home Page:
    You would need to have (or write) a scraper, but it's not that easy.

    You fetch the feed, then follow each item's URL to get to the actual page. Now you need to parse the page (and every page could be built in a different way) to extract the actual content.

    The difficulty is that a scraper that works with one feed/source might not work with another, because you always need to know the patterns: where and what to extract to get the actual content.
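
    A rough illustration of that per-source problem: each site needs its own extraction rule, and a page from a site without a rule (or with a changed layout) yields nothing. The hosts and markers below are hypothetical placeholders; real ones have to be found by reading each site's page source.

```python
from urllib.parse import urlparse

# Hypothetical per-site rules: host -> (start marker, end marker).
SITE_RULES = {
    "news.example.com": ('<div class="story">', "</div>"),
    "blog.example.org": ("<article>", "</article>"),
}


def extract_content(url, html):
    """Return the text between the markers registered for the URL's host.

    Returns None when no rule exists for the host or a marker is missing.
    """
    rule = SITE_RULES.get(urlparse(url).hostname)
    if rule is None:
        return None
    start_marker, end_marker = rule
    start = html.find(start_marker)
    if start == -1:
        return None
    start += len(start_marker)
    end = html.find(end_marker, start)
    if end == -1:
        return None
    return html[start:end].strip()


page = '<html><body><div class="story">Hello world</div></body></html>'
print(extract_content("http://news.example.com/a", page))   # rule matches
print(extract_content("http://blog.example.org/a", page))   # wrong pattern
print(extract_content("http://other.example.net/a", page))  # no rule at all
```

    Plain string markers are the simplest possible rule; a real scraper would more likely need an HTML parser, but the one-rule-per-source limitation stays the same.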
     
  7. simplybebop

    simplybebop Regular Member

    Joined:
    Oct 24, 2008
    Messages:
    371
    Likes Received:
    177
    Location:
    Greensboro, NC
    I have a copy of that script. But it is too powerful to be released anywhere.

    Plus that would hurt my earnings.
     
  8. zozor

    zozor Junior Member

    Joined:
    Dec 24, 2008
    Messages:
    113
    Likes Received:
    70
    If you want the full content of an RSS feed, this is easily doable with Yahoo Pipes.
    If you want the full content of blogsearch, Technorati, etc., this is impossible.
     
  9. nikhil88

    nikhil88 Regular Member

    Joined:
    Jun 12, 2008
    Messages:
    208
    Likes Received:
    97
    Occupation:
    E&TC Engineering Student
    Location:
    Pune,India
    If you are using only bbc.c0.uk RSS feeds, then you could write a scraper script. But this may not work on any other feed sources...
     
  10. freller

    freller Regular Member

    Joined:
    Sep 26, 2008
    Messages:
    210
    Likes Received:
    65
    If it's not in the RSS XML feed in the first place, there's nothing (including Yahoo Pipes) that's going to put it there. However, the individual URL links in the XML feed take you back to the original article which is where software can help.
     
  11. zozor

    zozor Junior Member

    Joined:
    Dec 24, 2008
    Messages:
    113
    Likes Received:
    70
    Freller, sorry, but you are wrong.
    Here's how I do it with Yahoo Pipes:

    Fetch feed url :
    For each item in feed
    Fetch item.link
    Cut content from "" to ""
    Assign to item.description
    Feed output

    And voila, you've got the full feed. It takes 5 minutes of work and works with AC, EzineArticles, and any blog.
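
    Outside of Pipes, the same steps could be sketched in Python roughly as follows. This is an illustration under assumptions, not the Pipes implementation: the function names are made up, the start and end markers are parameters because the ones in the steps above did not survive in the post, and it assumes a plain RSS 2.0 feed served as UTF-8.

```python
import xml.etree.ElementTree as ET
from urllib.request import urlopen


def fetch(url):
    """Download a URL and return its body as text (assumes UTF-8)."""
    with urlopen(url) as response:
        return response.read().decode("utf-8", errors="replace")


def cut(text, start_marker, end_marker):
    """Return the text between the two markers, or None if either is missing."""
    start = text.find(start_marker)
    if start == -1:
        return None
    start += len(start_marker)
    end = text.find(end_marker, start)
    return None if end == -1 else text[start:end]


def full_feed(feed_url, start_marker, end_marker, fetch=fetch):
    """Return the feed XML with each description replaced by the scraped body."""
    root = ET.fromstring(fetch(feed_url))            # Fetch feed URL
    for item in root.iter("item"):                   # For each item in feed
        link = item.findtext("link", default="").strip()
        if not link:
            continue
        page = fetch(link)                           # Fetch item.link
        body = cut(page, start_marker, end_marker)   # Cut content between markers
        if body is None:
            continue                                 # Keep the original description
        desc = item.find("description")
        if desc is None:
            desc = ET.SubElement(item, "description")
        desc.text = body.strip()                     # Assign to item.description
    return ET.tostring(root, encoding="unicode")     # Feed output
```

    The `fetch` parameter is there so the pipeline can be tried against canned pages before pointing it at a live site, for example `full_feed("feed", "<!--start-->", "<!--end-->", fetch=pages.__getitem__)` with `pages` being a dict of URL to text.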
     
    • Thanks x 2
  12. LazyAffiliate

    LazyAffiliate BANNED BANNED

    Joined:
    Jan 25, 2008
    Messages:
    59
    Likes Received:
    66
    ContentSolution.com scrapes from EzineArticles.com and a few other article sites and exports those articles in .TXT .HTML or RSS.
     
  13. 195471

    195471 Regular Member

    Joined:
    Oct 11, 2008
    Messages:
    417
    Likes Received:
    260
    What freller and others, including myself, have said is correct. Yahoo! Pipes can get the full feed, but it doesn't have some magical way of extracting data out of thin air. It finds an entry's URL in the site's feed, and then it goes to that URL and grabs the full content. Here's a brief explanation of how Yahoo! Pipes pulls the full feed:

    Code:
    Fetch Full Feed
    Replaces abbreviated description with full content fetched from webpage using link entity. Uses regex to strip out just the meaningful content.
    
    Source: http://pipes.yahoo.com/feedjournal/fetchfullfeed
    In other words, Pipes is a scraper, as are Dapper and other applications that claim to get the full content from a feed that offers just the item descriptions or truncated articles.
     
    • Thanks x 4
  14. back2black

    back2black Junior Member

    Joined:
    Dec 20, 2008
    Messages:
    115
    Likes Received:
    28
    Guys, thank you so much for your help. Honestly, this community is great. I'm going to check out Pipes now; it sounds like a good content solution for autoblogging.
     
  15. blackhaze

    blackhaze Power Member

    Joined:
    Jan 11, 2008
    Messages:
    661
    Likes Received:
    167
    Occupation:
    self made millionaire
    Location:
    in the matrix
    Home Page:
    Well, I know about Yahoo Pipes and Dapper, but aren't all of those banned by Google? E.g. for scraping Google News and similar...
     
  16. 195471

    195471 Regular Member

    Joined:
    Oct 11, 2008
    Messages:
    417
    Likes Received:
    260
    Yeah, Google was/is blocking Pipes from scraping Google News:

    Code:
    http://www.google.com/#hl=en&q=google+blocking+pipes&btnG=Google+Search&aq=f&oq=google+blocking+pipes&fp=3WTwdsC3GPc
     
  17. blackhaze

    blackhaze Power Member

    Joined:
    Jan 11, 2008
    Messages:
    661
    Likes Received:
    167
    Occupation:
    self made millionaire
    Location:
    in the matrix
    Home Page:
    Well, I'm leaning more and more toward the view that IF you scrape, you should scrape content which is NOT indexed.

    So scraping Google is about the worst thing you can do.

    There ARE ways to get content which is not indexed if you have sources with very OLD content. I am also looking into the option of scraping the Google Usenet/Groups archive. I don't think that messages from 1985 are still in their active index?

    The other idea is to use something like Yahoo Groups or forums and scrape content as soon as it appears in the forum, before the search engines have even had a chance to index it.

    Yahoo mailing lists/groups are also very good, since the content never makes it into Google's index.
     
  18. doug

    doug Newbie

    Joined:
    Nov 4, 2008
    Messages:
    34
    Likes Received:
    103
    Fresh content is where it's at. With more and more places getting into ghostwriting services, the price per article is dropping lower and lower.
     
  19. back2black

    back2black Junior Member

    Joined:
    Dec 20, 2008
    Messages:
    115
    Likes Received:
    28
    Ah! Now this is interesting. I've just set up a blog the complete opposite way to what I'm used to: I'm doing it the whitehat way. I've been paying for original content and I'm going to try to really make a go of it, for 3 reasons:

    1) It will be easier to rank in Google, as it's not copied content.
    2) With half-decent content I hope to build up a community who actually want to come back to my site to hear what I have to say.
    3) It's easier on my conscience! :D

    (Well, this is very WH for a BH forum, I guess!)

    I'm paying $3 per article. Do you think this is reasonable? How much are you paying?
     
  20. lostboy_1

    lostboy_1 Registered Member

    Joined:
    Jan 22, 2008
    Messages:
    55
    Likes Received:
    49
    articlepros.c0m gives you full-text feeds.