didn't give many results. I need a method to extract all business links.
For Harvester, check "Custom Footprint", enter something like "Hair Dresser in Boston" in the box, and click "Start Harvester". After that completes, go to the "Grab" menu and select "Grab Emails From Harvester List", and you're done.
Try something similar to the above, but after you get your initial results you need to extract other pages, as the page you get back from Google may not have the email address on it.
So you would want to trim the URLs to root and remove duplicate domains. Then load them back into the keyword harvester box and do a site: search for each one, for example:
site:http://www.domain.com
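If you ever want to do the trim-to-root and dedupe step outside ScrapeBox, here is a rough Python sketch of the idea. The function names (`trim_to_root`, `unique_roots`, `site_queries`) are made up for illustration and are not part of ScrapeBox. It also assumes that "duplicate domain" means the same hostname, so www.domain.com and domain.com count as different.

```python
from urllib.parse import urlsplit

def trim_to_root(url):
    """Return scheme://host/ for a URL, or None if it has no scheme or host."""
    parts = urlsplit(url.strip())
    if not parts.scheme or not parts.netloc:
        return None
    return f"{parts.scheme}://{parts.netloc.lower()}/"

def unique_roots(urls):
    """Trim every URL to its root and keep the first root seen per hostname."""
    seen = set()
    roots = []
    for url in urls:
        root = trim_to_root(url)
        if root is None:
            continue  # skip lines that are not URLs
        host = urlsplit(root).hostname
        if host not in seen:
            seen.add(host)
            roots.append(root)
    return roots

def site_queries(roots):
    """Build one site: keyword per root, in the same form as the example above."""
    return [f"site:{root.rstrip('/')}" for root in roots]

harvested = [
    "http://www.domain.com/about",
    "http://www.domain.com/services?id=3",
    "https://Example.org/contact",
]
for query in site_queries(unique_roots(harvested)):
    print(query)
```

The printed lines are what you would paste into the keyword box.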
Alternatively, or in addition, load them into the link extractor addon and extract internal links. Then load the results back in 2 or 3 times, or however many you like, and extract internal links again.
Then load them into the URLs harvested section and remove duplicate URLs. Then export and randomize. Then use "import and add to" to import them, and grab emails.
It's important that you use "import and add to" and not "import and replace" on that last step. "Import and replace" sorts the list in alphabetical order, so you end up hitting the same domain with multiple simultaneous connections. That can result in IP bans or the site taking too long to respond, and then you lose that email.
Bringing them back in randomized lets you grab emails and spread out your hits among many domains at once.
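To see why the order matters, here is a small Python sketch (my own illustration, not how ScrapeBox does it internally). `longest_same_host_run` and `randomized` are hypothetical helper names. The first measures how many consecutive URLs in a list share a host, and the second returns a shuffled copy.

```python
import random
from urllib.parse import urlsplit

def longest_same_host_run(urls):
    """Length of the longest stretch of consecutive URLs on the same host."""
    longest = current = 0
    previous = None
    for url in urls:
        host = urlsplit(url).hostname
        current = current + 1 if host == previous else 1
        longest = max(longest, current)
        previous = host
    return longest

def randomized(urls, seed=None):
    """Return a shuffled copy of the list; the original is left untouched."""
    shuffled = list(urls)
    random.Random(seed).shuffle(shuffled)
    return shuffled

hosts = ["a.example", "b.example", "c.example"]
urls = sorted(f"http://{h}/page{i}" for h in hosts for i in range(4))

print(longest_same_host_run(urls))              # alphabetical order clusters each host
print(longest_same_host_run(randomized(urls)))  # usually shorter after shuffling
```

With an alphabetical list, every page of a domain sits next to the others, so a multi-connection run hits that one site all at once. A shuffle does not guarantee the pages are spread out, but on a big list it usually breaks up most of the clusters.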
Thanks for the suggestion, but I tried this earlier and ended up with fewer than 50 emails. The problem is finding a footprint that gets good results initially. It's not getting me far, and I know there should be at least 1000.
Well, can you give some examples of pages you do want? I mean, with the info you've given, I'd search
hair dresser
I mean, that's about the extent of it based on the info provided.
If you have a particular CMS that these hair dressers use then we can target that, or if there are commonalities among the pages you're looking for we can make a footprint, but you will have to provide several examples of pages you are looking for. More detail would also be needed.
Making footprints is an art, and it varies dramatically from one application to the next. So the more detail you can give, the better.
The one thing is that if there are no commonalities between the pages you are looking for, then you're just dealing with sheer numbers using a basic footprint, such as
hair dressers
However, you can do things like
"hair dressers" "a"
"hair dressers" "b"
"hair dressers" "c"
"hair dressers" "1"
"hair dressers" "2"
"hair dressers" "3"
"hair dressers" "dallas texas"
etc. This gives you a larger volume of results and finer control over a geographic area or whatever you're looking for.
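Typing those combinations by hand gets old fast, so here is a rough Python sketch for generating them. `expand_footprints` is a hypothetical helper name, not a ScrapeBox feature. It quotes each term and builds every combination of the groups you pass in.

```python
from itertools import product

def expand_footprints(*term_groups):
    """Return one quoted keyword line for every combination of the given groups."""
    return [
        " ".join(f'"{term}"' for term in combo)
        for combo in product(*term_groups)
    ]

base = ["hair dressers"]
modifiers = ["a", "b", "c", "1", "2", "3", "dallas texas"]

for line in expand_footprints(base, modifiers):
    print(line)
```

You can pass a third group, such as a list of cities, to multiply the list further. Save the output to a text file and import it as your keyword list.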
Also check your settings in ScrapeBox, because messed-up settings could totally skew the entire result set. Specifically, make sure your results are set to 1000, that you are searching google.com and don't have a custom Google set, and then go to settings >> use multi threaded harvester, and uncheck it. Then watch the status column as you harvest to make sure you aren't getting a bunch of 302 blocked responses from Google.
Also, when you're done, make sure all keywords completed in green; otherwise export the red ones and rerun them. I'd use the single threaded harvester (unchecking the multi threaded harvester) in your situation anyway, as the single threaded harvester is built for accuracy and has stuff going on in the back end that the multi threaded harvester does not have. Multi is built for volume and errs towards volume, sacrificing accuracy if need be, which is how it should be. Single threaded errs towards accuracy, sacrificing speed, which is also how it should be.
Edit:
After thinking on this a bit longer, perhaps you are also talking about using something like @hotmail.com, so you could do things like:
"hair dressers" "a" "@hotmail.com"
"hair dressers" "b" "@gmail.com"
"hair dressers" "c" "email"
"hair dressers" "1" "e-mail"
"hair dressers" "2" "mail"
"hair dressers" "3" "contact"
"hair dressers" "dallas texas" "contact us"
etc...
I think I would start by scraping for something like
"dental keywords" "Florida city" "email"
"dental keywords" "Florida city" "contact"
Then I'd scrape out whatever emails you can and save them off.
Then I'd load the rest up in XRumer and turn on self learning. Create a comment like: hey, I would like more info about dental services for my family.
Then comment the daylights out of the forms and put in a legit email where you can get responses.
Then use the self learning to work on any forms that it didn't recognize, filter out the successful so you aren't hitting them 10 times in a row. Then keep working on it.
Use ScrapeBox to scrape; Hrefer is poor for this type of scraping application, and for most scraping in general IMHO. Use XRumer to post.
Then wait a week until you get responses. Then capture all the email addresses that respond, and that's your gold.
Also, don't run the XRumer blacklist against your scrapes; it's built to weed out contact forms. I know I have hit an awful lot of contact forms like this. Also don't put any legit info into the campaign other than a throwaway email addy. Planning ahead of time how you're going to scrape the email addys from the emails you receive would be a good idea too, so get your throwaway email from somewhere that doesn't make you copy and paste each response addy by hand.
ScrapeBox is awesome for scraping, with great flexibility, and it's great for posting via the learning mode addon on lots of different platforms. But the learning mode addon isn't built to quickly adapt to multiple different forms in one run, so XRumer would be a better choice in this instance for posting to the forms.
My 2 cents.