Scraping Full posts?

Discussion in 'Blogging' started by havoc21392, Dec 5, 2008.

  1. havoc21392

    havoc21392 Newbie

    Joined:
    Nov 14, 2007
    Messages:
    49
    Likes Received:
    2
    I'm looking for an RSS scraping tool that can post FULL posts, not only excerpts. Don't worry: in my opinion, the niche I am in does not care much, or even at all, about duplicate content.

    If someone knows of an auto-posting WordPress plugin that gets full posts, I would really appreciate it. (The only ones I have found, like smartrss, take excerpts.)

    Thank you for your time.
     
  2. undeterminederror

    undeterminederror BANNED BANNED

    Joined:
    Mar 31, 2008
    Messages:
    630
    Likes Received:
    457
    I'm working on this type of script right now. :D
     
  3. simplybebop

    simplybebop Regular Member

    Joined:
    Oct 24, 2008
    Messages:
    371
    Likes Received:
    177
    Location:
    Greensboro, NC
    It's because the person you are taking the content from isn't publishing full articles.
     
  4. istarapol

    istarapol Junior Member

    Joined:
    Jun 3, 2008
    Messages:
    110
    Likes Received:
    228
    Occupation:
    Graphic Designer
    Location:
    Under Your Bed
    Home Page:
    simplybebop is right. Even if you run the RSS grabber at a lunatic pace, if the target blog decides to publish partial content, that's all you're getting.
     
  5. havoc21392

    havoc21392 Newbie

    Joined:
    Nov 14, 2007
    Messages:
    49
    Likes Received:
    2
    Oh, so the excerpt depends on the website's RSS feed and not on the plugin itself. Is that it?
     
  6. neta1o

    neta1o Regular Member

    Joined:
    Sep 29, 2008
    Messages:
    388
    Likes Received:
    318
    Home Page:
    Yep, some RSS feeds only publish excerpts so you have to go to the original site link to get the full article content.
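
    A quick way to see which kind of feed you are dealing with is to look at what each item actually carries. The sketch below is only a heuristic and rests on an assumption: that full-text feeds put the post body in a `content:encoded` element, which is common but not guaranteed. The function name `classify_items` and the sample feed are made up for illustration.

    ```python
    import xml.etree.ElementTree as ET

    # Qualified tag name for <content:encoded> from the RSS "content" module.
    CONTENT_ENCODED = "{http://purl.org/rss/1.0/modules/content/}encoded"


    def classify_items(rss_xml):
        """Return (title, kind) pairs; kind is 'full', 'excerpt', or 'title-only'.

        Heuristic (an assumption, not a rule): a non-empty <content:encoded>
        means the feed ships the whole post; a lone <description> means excerpt.
        """
        results = []
        for item in ET.fromstring(rss_xml).iter("item"):
            title = (item.findtext("title") or "").strip()
            full_body = (item.findtext(CONTENT_ENCODED) or "").strip()
            summary = (item.findtext("description") or "").strip()
            if full_body:
                kind = "full"
            elif summary:
                kind = "excerpt"
            else:
                kind = "title-only"
            results.append((title, kind))
        return results


    # Hypothetical sample feed with one item of each kind.
    SAMPLE_FEED = """<rss version="2.0"
      xmlns:content="http://purl.org/rss/1.0/modules/content/"><channel>
    <item><title>A</title><description>short</description>
      <content:encoded><![CDATA[<p>Whole post</p>]]></content:encoded></item>
    <item><title>B</title><description>short</description></item>
    <item><title>C</title></item>
    </channel></rss>"""

    for title, kind in classify_items(SAMPLE_FEED):
        print(title, kind)
    ```

    If everything comes back as 'excerpt' or 'title-only', no plugin setting will change that; the full text simply is not in the feed.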
     
  7. yeti_racer

    yeti_racer Junior Member Premium Member

    Joined:
    Dec 3, 2008
    Messages:
    192
    Likes Received:
    87
    Location:
    Hick Ville
    Right. And many sites, I think, have gotten tired of people just scraping their full RSS feeds, so they switched to short excerpts. Hell, one of the feeds I used to use switched to putting only post titles in its feed about 2 weeks ago.
     
  8. istarapol

    istarapol Junior Member

    Joined:
    Jun 3, 2008
    Messages:
    110
    Likes Received:
    228
    Occupation:
    Graphic Designer
    Location:
    Under Your Bed
    Home Page:
    Yup.
    Basically, in WordPress, if they selected the summary option for their feed, you'll only be able to scrape partial content from their site.
     
  9. teague

    teague Junior Member

    Joined:
    Nov 5, 2008
    Messages:
    119
    Likes Received:
    48
    Occupation:
    don't have one
    Location:
    USA
    There's also a WordPress plug-in called RSS Footer. I use it for my blogs. With the plug-in, you can position a sentence either before or after your RSS feed content (I state 'original content is from....')

    AND the plug-in has a toggle to include, or not, your backlink in your RSS feed.

    Scraping RSS content from a WordPress blog using this plug-in, I think, would be fruitless.
     
  10. garfield

    garfield Registered Member

    Joined:
    Aug 27, 2007
    Messages:
    70
    Likes Received:
    63
    RSSMagician pulls FULL articles from the article directories based on YOUR chosen keywords. You can then manipulate these articles ANY way you want. NO OTHER PROGRAM AROUND DOES THIS (this feature is HUGE).

    hxxp://3w.rssmagician.cOm/ :)



     
  11. undeterminederror

    undeterminederror BANNED BANNED

    Joined:
    Mar 31, 2008
    Messages:
    630
    Likes Received:
    457
    You can scrape full content from 90% of the sites that offer excerpts only, for example Associated Content ... Con: I used spinners, synonymizers, and rewriters, but Google still isn't in love with my content. Pro: I'm still indexed, it works automated, and it is slowly but still growing in the SERPs. If someone gives me a good idea for making the content unique, I'll show how to get full-text RSS from almost every blog.
     
  12. emgxxg

    emgxxg Registered Member

    Joined:
    Nov 3, 2008
    Messages:
    93
    Likes Received:
    523
    You can definitely extract full RSS posts, and there are a number of ways to get the job done. I have not done this in a while, so I will need to focus a bit more on explaining; I will come back and give some options later. In the meantime, you can check out this option:
    Code:
    hxxp://www.devtrench.c0m/how-to-scrape-an-entire-wordpress-blog/
    Along with this, you can use one of the free scrapers to convert the resulting HTML to an RSS feed. For example:
    Code:
    hxxp://www.feedyes.c0m/
    Also, a little while ago I made a PHP script to automate a task for a client (scraping certain text from his own blog). It can definitely be adapted to scrape content from other blogs too; I will see if I have it somewhere in my archives (they are all over the place). If I do find it, you will need to make basic changes to it depending on the theme the blog being scraped is using. Nothing too much!!

    Cheers!!
     
  13. nikhil88

    nikhil88 Regular Member

    Joined:
    Jun 12, 2008
    Messages:
    208
    Likes Received:
    97
    Occupation:
    E&TC Engineering Student
    Location:
    Pune,India
    I'm not sure about WordPress plugins, but you can use this site to convert partial feeds to full ones:
    HTML:
    http://labs.echoditto.com/fulltextrss
     
  14. emgxxg

    emgxxg Registered Member

    Joined:
    Nov 3, 2008
    Messages:
    93
    Likes Received:
    523
    I checked it out... it works great!! You guys could actually combine multiple feeds into one to make the process quicker, using services similar to:

    Code:
    hxxp://www.rssmix.com/
    Great find nikhil!!
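
    For anyone curious what combining feeds amounts to, here is a rough sketch. It is not how rssmix or any other service actually works; it just assumes plain RSS 2.0 documents whose items have a `pubDate` with an explicit timezone offset. The function name `merge_feeds` and the sample feeds are made up for illustration.

    ```python
    import xml.etree.ElementTree as ET
    from email.utils import parsedate_to_datetime


    def merge_feeds(feed_xmls):
        """Merge items from several RSS 2.0 documents into one list, newest first.

        Each entry is (published, title, link). Items without a pubDate are
        skipped, since there is nothing to sort them by.
        """
        merged = []
        for xml_text in feed_xmls:
            for item in ET.fromstring(xml_text).iter("item"):
                pub_date = item.findtext("pubDate")
                if not pub_date:
                    continue
                published = parsedate_to_datetime(pub_date.strip())
                merged.append((published, item.findtext("title"), item.findtext("link")))
        # Sort on the timestamp only, so titles and links are never compared.
        merged.sort(key=lambda entry: entry[0], reverse=True)
        return merged


    # Two hypothetical single-item feeds.
    FEED_A = """<rss version="2.0"><channel><item><title>Old</title>
    <link>http://example.com/old</link>
    <pubDate>Mon, 01 Dec 2008 10:00:00 +0000</pubDate></item></channel></rss>"""
    FEED_B = """<rss version="2.0"><channel><item><title>New</title>
    <link>http://example.com/new</link>
    <pubDate>Fri, 05 Dec 2008 09:00:00 +0000</pubDate></item></channel></rss>"""

    for published, title, link in merge_feeds([FEED_A, FEED_B]):
        print(published.isoformat(), title, link)
    ```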
     
    Last edited: Dec 12, 2008
  15. undeterminederror

    undeterminederror BANNED BANNED

    Joined:
    Mar 31, 2008
    Messages:
    630
    Likes Received:
    457
    I use something like echoditto, but better :D However, this topic is becoming more and more interesting.
    OK, how do we make it unique?
     
  16. battman323

    battman323 Regular Member

    Joined:
    Aug 10, 2007
    Messages:
    449
    Likes Received:
    395
    Occupation:
    Extortionist
    Location:
    None of your damn business
    Scraping full posts can be done with expensive tools like RSS2Blog, but you can do it with free plugins like WP-O-Matic too; you just have to find the right feed. You can't just pick any feed. It has to be one that syndicates the entire post, and with that kind of feed WP-O-Matic will post the entire posts that appear in it.
     
  17. nikhil88

    nikhil88 Regular Member

    Joined:
    Jun 12, 2008
    Messages:
    208
    Likes Received:
    97
    Occupation:
    E&TC Engineering Student
    Location:
    Pune,India
    I completely agree... it's easy with just a little coding....
    The problem, though, lies in making the content unique... Using translations is one way but, as has been discussed many times before, the content does become a little unreadable.
     
  18. simplybebop

    simplybebop Regular Member

    Joined:
    Oct 24, 2008
    Messages:
    371
    Likes Received:
    177
    Location:
    Greensboro, NC
    Shouldn't have shared that; we need to keep the tools sorta hidden so noobs don't mess them up.
     
  19. emgxxg

    emgxxg Registered Member

    Joined:
    Nov 3, 2008
    Messages:
    93
    Likes Received:
    523
    While I would agree with you that some tools should be kept hidden from noobs, unfortunately this tool has already been mentioned on BHW at least 3 times.

    Cheers!!
     
  20. nikhil88

    nikhil88 Regular Member

    Joined:
    Jun 12, 2008
    Messages:
    208
    Likes Received:
    97
    Occupation:
    E&TC Engineering Student
    Location:
    Pune,India
    My bad... was just trying to help...
    Though, like emgxxg said, it's already been shared here.
    Anyway, I will be more careful next time.