Tutorial - How to extract CL emails with Yahoo Pipes

blazed

Junior Member
Joined
Aug 15, 2008
Messages
178
Reaction score
122
I just shared this on another forum I post on because I was getting lots of questions about it, and I decided I can't forget about BHW, so I'm posting it for you guys as well. Hope someone finds it useful!

Sample Output:
80854583ic3.jpg


Source Code:
67i7ep.jpg


Explanation:
1) Fetch the RSS feed of the section you want to scrape.
2) Follow each link in the feed and extract the mailto: tag from the linked page.
3) Copy the tag into both item.title and item.description so it's readable in the pipe output.
4) Output the pipe.
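
For anyone who thinks better in code than in pipe modules, here is a rough Python sketch of the same four steps. It is not the pipe itself, just an approximation under some assumptions: the feed is plain RSS 2.0, the mailto: link uses a double-quoted href, and the sample feed, sample pages, and function names (feed_links, extract_mailto_tag, run_pipe) are all made up for illustration. Pages are supplied by a caller-provided fetch function, so the sketch runs on canned data.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical sample data standing in for a real feed and its linked pages.
SAMPLE_FEED = """<rss version="2.0"><channel>
<item><title>Bike for sale</title><link>http://example.org/post/1</link></item>
<item><title>Desk, like new</title><link>http://example.org/post/2</link></item>
</channel></rss>"""

SAMPLE_PAGES = {
    "http://example.org/post/1":
        '<p>Reply to: <a href="mailto:sale-1@example.org">sale-1@example.org</a></p>',
    "http://example.org/post/2": "<p>No contact link on this page.</p>",
}

# Assumption: the anchor uses a double-quoted href that starts with mailto:
MAILTO_TAG = re.compile(
    r'<a\s[^>]*href="mailto:[^"]*"[^>]*>.*?</a>', re.IGNORECASE | re.DOTALL
)


def feed_links(feed_xml):
    """Step 1: return the <link> of every <item> in an RSS 2.0 feed."""
    root = ET.fromstring(feed_xml)
    links = []
    for item in root.iter("item"):
        link = item.findtext("link")
        if link:
            links.append(link.strip())
    return links


def extract_mailto_tag(page_html):
    """Step 2: return the first mailto: anchor tag in the page, or None."""
    match = MAILTO_TAG.search(page_html)
    return match.group(0) if match else None


def run_pipe(feed_xml, fetch_page):
    """Steps 2-4: follow each link and keep the tag as title and description."""
    items = []
    for link in feed_links(feed_xml):
        tag = extract_mailto_tag(fetch_page(link))
        if tag is not None:
            items.append({"link": link, "title": tag, "description": tag})
    return items


# The dict lookup stands in for whatever actually fetches the page.
items = run_pipe(SAMPLE_FEED, SAMPLE_PAGES.get)
for item in items:
    print(item["title"])
```

Items whose page has no mailto: tag are simply dropped, which is one possible choice; keeping them with an empty title would work too.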

Ideas for improvement & better usage:
-decode the encoded link to make the data more workable
-output the pipe in a format other than RSS to make the data more workable (HTML, CSV, JSON, XML, etc.)
-add support for multiple inputs
-add more regex-fu to extract titles & categories into their own fields
-subscribe to the feed with a feed reader and automatically build your list
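
To make the first two ideas a bit more concrete, here is a small Python sketch of what decoding and re-exporting could look like outside of Pipes. It assumes the encoding is ordinary HTML entities and/or percent-encoding inside the href, which may not match what you actually see in your output; the sample items and the names decode_address and to_csv are invented for the example.

```python
import csv
import io
import json
import re
from html import unescape
from urllib.parse import unquote

# Capture the address part of the href, stopping before any ?subject=... query.
HREF = re.compile(r'href="mailto:([^"?]*)', re.IGNORECASE)


def decode_address(tag):
    """Pull the address out of a mailto: tag and undo entity/percent encoding."""
    match = HREF.search(tag)
    if match is None:
        return None
    # HTML entities are decoded first, then URL percent-escapes.
    return unquote(unescape(match.group(1)))


def to_csv(rows):
    """Serialize rows (dicts with 'link' and 'address') as CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["link", "address"])
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical pipe output: the title field holds the raw tag.
sample_items = [
    {"link": "http://example.org/post/1",
     "title": '<a href="mailto:sale-1%40example.org?subject=Bike">reply</a>'},
    {"link": "http://example.org/post/2",
     "title": '<a href="mailto:desk&#64;example.org">reply</a>'},
]

rows = [
    {"link": item["link"], "address": decode_address(item["title"])}
    for item in sample_items
]
print(to_csv(rows))
print(json.dumps(rows, indent=2))
```

The same rows list feeds both the CSV and the JSON output, so adding another format later only means adding another serializer.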
 

blackjack

Regular Member
Joined
Nov 23, 2007
Messages
276
Reaction score
53
Hi Blazed

This looks great. Is there any way I can learn more about Yahoo Pipes?

Do you suggest any book on this? I have a few ideas: one like yours, but taking an item & price from Craigslist and then getting the price from eBay for the same or a similar item.

Any help would be appreciated very much.

Thanks
 

blazed

pipes.yahoo.com has everything you need to get started. I think they even have a video tutorial as an intro. Just dive right in and experiment.

There is also a really good tutorial about scraping with Yahoo Pipes here:

http://www.daybarr.com/blog/2007/12...g-the-fetch-page-module-to-make-a-web-scraper

There is also a great service called dapper.net that lets you scrape things in a point-and-click way, which is even easier. Because it's so easy, though, you have less control and can't write your own regexes, for example. However, you can always create a feed with Dapper and use it as an input to Yahoo Pipes so that you can get up close and personal with the data.


 

mrewen

Newbie
Joined
Feb 16, 2008
Messages
42
Reaction score
5
Wow, this is a hell of a share...
Thanks mate... this will save a lot of time...
Any suggestions on how I would defeat CL's relay system? I keep getting a "You have sent many email, we will stop relaying" email...
I used about 5 accounts today. Any better suggestions?

thanks :)
 

Arthas

BANNED
Joined
Jan 5, 2009
Messages
637
Reaction score
329
check this out.. my upcoming tool: http://screencast.com/t/zJjJPXdx
 

nikhil88

Regular Member
Joined
Jun 12, 2008
Messages
208
Reaction score
98
Hey blazed,
could you tell me how I can extract full articles from ezinearticles? I've tried a lot but can't seem to get it right :p
 

blazed

I will look into this and update this thread if I get anywhere :)


 

nikhil88

Thanks blazed, but I just managed to do it... cheers
 

bourder

Junior Member
Joined
Nov 21, 2008
Messages
118
Reaction score
41
Wow, this seems like a great tool. I would love to know whether it is possible, or whether anyone has used this, to scrape e-mails off of Facebook profiles or from forums. If anyone knows, I would love to hear the answer!
 

ashenjr

BANNED
Joined
Oct 17, 2008
Messages
49
Reaction score
16
How do you get the code to turn a Yahoo pipe into an RSS feed? I can create the pipe and get to the point where it's ready, but I don't know how to get it out as an RSS feed. Maybe I am missing something? Thanks
 

blazed


After you save the pipe, go to "My Pipes", choose the one you just saved, and click on "Run Pipe". There will be an RSS icon that you can click on to get the RSS URL.
 

blazed

bourder said: "Wow, this seems like a great tool. I would love to know whether it is possible, or whether anyone has used this, to scrape e-mails off of Facebook profiles or from forums. If anyone knows, I would love to hear the answer!"

You can definitely use Yahoo Pipes for this... I haven't tried to scrape any sites that require authentication yet, but I can't imagine it being much harder. You can always go to dapper.net, which easily handles authentication, and then just fetch the output of that dap with a pipe and do what you have to do.
 

coxi999

Junior Member
Joined
Aug 14, 2007
Messages
170
Reaction score
53
Great post, man. I will give this a go and report back.
 

scratbandit

Newbie
Joined
Feb 13, 2008
Messages
40
Reaction score
12
Thank you very much for the tutorial. It is the first one I have tried to get going and actually got to work. If only I knew a way to decode the email address from the results.

Thanks again for the great tutorial...
 

spikeanut

Registered Member
Joined
Jan 2, 2009
Messages
99
Reaction score
31
Age
41
So why wouldn't you use the On demand craigslist scrapper? It automates everything. The program will scrape Craigslist emails from whatever city/category you want. Unlimited uses; it's just a matter of figuring out how to beat Craigslist's system with their crappy email addresses.

(link removed by admin - software belongs to another member)
 

goldminer

Junior Member
Joined
May 28, 2008
Messages
129
Reaction score
84

I got an error when I ran your program.
 

goldminer

Is there an easy tutorial for noobs about this? Thanks in advance.
 

spikeanut

Wait, is that link for sale? Sorry MODS if it was.
 

blazed

scratbandit said: "Thank you very much for the tutorial. It is the first one I have tried to get going and actually got to work. If only I knew a way to decode the email address from the results. Thanks again for the great tutorial..."

Well, the best part is that you don't really need to decode them, because you are pulling the HTML tag that actually displays the email on the CL page. If you want, you can include only the item.title field and render the feed as HTML. Then you just have a list of links that you can copy, and there is no need to decode.
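
If you ever want to do that rendering step yourself instead of letting Pipes do it, it could look roughly like this in Python. This is only a sketch: it assumes each item's title already holds the complete link tag, and the sample items and the render_html name are made up.

```python
def render_html(items):
    """Render each item's title (already an <a> tag) as one list entry."""
    lines = ["<ul>"]
    lines += [f"  <li>{item['title']}</li>" for item in items]
    lines.append("</ul>")
    return "\n".join(lines)


# Hypothetical pipe output where the title field holds the raw tag.
sample_items = [
    {"title": '<a href="mailto:sale-1@example.org">sale-1@example.org</a>'},
    {"title": '<a href="mailto:desk@example.org">desk@example.org</a>'},
]

print(render_html(sample_items))
```

Because the tags are written out untouched, the browser does the decoding when it displays the page.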
 