
Best eMail Harvesting Tool Recommendation

Discussion in 'Black Hat SEO Tools' started by Darren Amato, Oct 10, 2019.

  1. Darren Amato

    Darren Amato Newbie

    Joined:
    Oct 10, 2019
    Messages:
    5
    Likes Received:
    0
    Gender:
    Male
    I have ScrapeBox and I'm demoing Atomic eMail Hunter. Ideally, what I'm trying to accomplish is a more targeted approach to list building than just pure spam. I'll give an example of what I want to do.

    Let's say I want to build a list of contacts that are part of an organization that has a website listing those contacts. But they may list only the websites of the contacts. And to see all of the organizations, there are the 1, 2, 3, 4, NEXT links. I need a tool that is smart enough to click through all of those links and harvest the websites of those organizations, and then, either in the same task or a later one, crawl those websites and pull the emails.

    I don't believe ScrapeBox is good enough to do this, nor is Atomic eMail Hunter. Does anyone have any other suggestions?

    I don't mind a one-time payment or an annual payment; I just don't need another monthly fee.
     
  2. MaestroDelWeb

    MaestroDelWeb Senior Member

    Joined:
    Nov 5, 2007
    Messages:
    925
    Likes Received:
    924
    Occupation:
    Jack of all trades.
    Location:
    USA
    ScrapeBox can do that but you have to set it to go at least a couple of levels deep and it doesn't work on every site.
     
  3. xrahat2011

    xrahat2011 Newbie

    Joined:
    Oct 10, 2019
    Messages:
    18
    Likes Received:
    1
    Gender:
    Male
    From where are you trying to scrape?
     
  4. xrahat2011

    xrahat2011 Newbie

    Joined:
    Oct 10, 2019
    Messages:
    18
    Likes Received:
    1
    Gender:
    Male
    A quick python script would be your way to go!
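    For what it's worth, the core of such a script is small. Here is a minimal sketch of the email-extraction half (the regex and sample HTML are illustrative; a real script would fetch each paginated URL, e.g. with urllib.request, and run this over the response body):

```python
import re

# Simple email pattern; good enough for harvesting, not full RFC 5322.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

def extract_emails(html: str) -> set[str]:
    """Return the unique email addresses found in a blob of HTML."""
    return set(EMAIL_RE.findall(html))

# Hypothetical page fragment standing in for a fetched response body.
sample = '<a href="mailto:director@example.org">Contact</a> or info@example.org'
print(sorted(extract_emails(sample)))
# ['director@example.org', 'info@example.org']
```

    The fetching loop around this is the easy part; the work is in keeping the crawl contained to the sites you actually care about, which is exactly the problem discussed later in this thread.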
     
  5. Darren Amato

    Darren Amato Newbie

    Joined:
    Oct 10, 2019
    Messages:
    5
    Likes Received:
    0
    Gender:
    Male
    That's the hard part. We are a T-shirt printer, so our customers are all over the board, because anyone can use a T-shirt. One day we may want to target marching band directors, another day we may want to target high school baseball coaches. I need something flexible that will allow me to essentially run a very detailed Google search and then crawl those pages and get the names/emails of the people that I want.
     
  6. loopline

    loopline Jr. VIP Jr. VIP

    Joined:
    Jan 25, 2009
    Messages:
    5,073
    Likes Received:
    2,730
    Gender:
    Male
    Home Page:
    With respect, ScrapeBox is good enough and can do this multiple ways. The question is: "Is the approach you're using with ScrapeBox good enough to maximize what the tool can do?"

    Again with respect.

    So the answer is that you can use the crawl function in ScrapeBox, in either the basic email scraper or the Email Scraper Premium plugin. It will crawl through the next links, and then you can use it to crawl the sites as well. You could do it in at least three different ways and accomplish what you want.

    However, probably the easiest is to simply look at the URL structure of all those pages with the next links. In the vast majority of cases, each page of links simply has an incrementally increasing number, like

    http://www.domain.com/results_page_number=1
    http://www.domain.com/results_page_number=2
    http://www.domain.com/results_page_number=3
    http://www.domain.com/results_page_number=4
    http://www.domain.com/results_page_number=5

    The numbers might increase by 1 or 10 or 14 or something, but it's typically consistent; just look at the URLs of the first few pages.

    Then, in the email scraper plugin, there is a Generate URLs tab. Just use that and you can fabricate 10 million pages' worth of URLs, if you want. Or maybe just click the last page, see how many there are, and go from there. ;)

    Then from there it varies depending on how your source domain is set up, but you probably want to toss them in the Link Extractor addon and extract external links, assuming the external links are listed directly on the pages you're loading (the URLs you fabricated, I mean).

    Then once you have that external link you can load those in the email scraper.



    If the websites are not listed directly on the fabricated URLs, you may need to use the Link Extractor to extract internal pages of that source domain, then extract external links, and then scrape the emails. Again, it's a little dependent on the source site, but once you have a basic understanding of the tools, you can make it work for any site.
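    The pipeline described above (fabricate the paginated URLs, pull the external links off each page, then feed those into the email scraper) can also be sketched in plain Python; the URL pattern and domains below are hypothetical placeholders:

```python
from html.parser import HTMLParser
from urllib.parse import urlparse

def generate_urls(pattern: str, start: int, stop: int) -> list[str]:
    """Fabricate paginated URLs by filling a numeric placeholder."""
    return [pattern.format(n) for n in range(start, stop + 1)]

class ExternalLinkParser(HTMLParser):
    """Collect href targets that point at a different domain than the source."""
    def __init__(self, source_domain: str):
        super().__init__()
        self.source_domain = source_domain
        self.external: set[str] = set()

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href") or ""
        host = urlparse(href).netloc
        if host and host != self.source_domain:
            self.external.add(href)

# Step 1: fabricate the result pages (pattern matches the example URLs above).
pages = generate_urls("http://www.domain.com/results_page_number={}", 1, 5)

# Step 2: extract external links from a fetched page body (sample HTML here).
parser = ExternalLinkParser("www.domain.com")
parser.feed('<a href="/about">internal</a>'
            '<a href="http://membersite.example/">member site</a>')
print(pages[0])
print(parser.external)
```

    Step 3 would then be running an email extractor over each collected external site, which is what the email scraper handles in ScrapeBox itself.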


    But if you can't sort it, post or PM me specific examples and I can give you specific directions to help you learn.

    There are other ways too, to achieve this in ScrapeBox; that's the beauty of having such a diverse set of tools. Most of the time it's not a question of whether it can be done, it's a question of how many different ways it can be achieved and what makes the most sense or is the shortest route to the end goal.

    Cheers!
     
  7. Darren Amato

    Darren Amato Newbie

    Joined:
    Oct 10, 2019
    Messages:
    5
    Likes Received:
    0
    Gender:
    Male

    Thank you for the detailed response, and I take no offense to your points. I'll readily admit that I've struggled with ScrapeBox because it is so capable that I don't fully understand the tool and how to use it. On top of that, it regularly crashes when compiling large runs, and I end up losing work, so I always assumed it was not stable. That said, the developer has been uber helpful and thinks I have a damaged system DLL, and I've been reluctant to rebuild my whole PC; we've all been there, right? That said, here is my specific question.

    I am looking for contacts associated with Color Guard organisations. The site is here: https://wgi.org/color-guard/search-groups/

    This site is structured in a way where you have to click the "Next" button to get to the next screen. Some of the screens have contacts, others have web site addresses and others have Facebook Pages.

    Ideally, I would like ScrapeBox to be able to just collect any emails associated with those links.

    In the past, I haven't been able to contain ScrapeBox to just the finite set of sites associated with the launch site — in this case, the website noted above and any links from there to other sites. ScrapeBox seems to want to go everywhere. Even if I choose 2 deep, it seems to not work the way I expect.

    If I could start with https://wgi.org/color-guard/search-groups/ and capture any email associated with the URL for that email, that would serve my needs.

    Also, beyond the plentiful videos on ScrapeBox from the company itself, if there are other learning resources that may be more basic and simple, I would like to know about them. I feel like ScrapeBox is the Swiss Army knife of black and white hat SEO, but it's so capable and so complex that I don't understand how to use it.
     
  8. loopline

    loopline Jr. VIP Jr. VIP

    Joined:
    Jan 25, 2009
    Messages:
    5,073
    Likes Received:
    2,730
    Gender:
    Male
    Home Page:
    For the crashing: if you can mail me more specific data, I can tell you why it crashes and give you direction on how to stop it.

    Are you on mac or windows?

    I hate rebuilding my machine; it's weeks of getting things back to where they were. I take a daily, weekly and monthly image of my machine to a local drive just in case. That's nice, as it's a 10-minute restore of everything. I also literally have backups of my backups of my important files. Had too much loss in the past. It's actually been about 2 years since I had to rebuild it, and I hope it lasts till I get my next new laptop, but I'm not looking forward to that!

    When you choose 2 levels deep, it will literally follow any link it finds. Basically, a level may not be what you think it is, but it's built in a way to totally optimize and crawl every link.

    I assume you're talking about the videos here:
    https://www.youtube.com/user/looplinescrapebox/videos

    ?

    If so thats pretty much it.
     
  9. loopline

    loopline Jr. VIP Jr. VIP

    Joined:
    Jan 25, 2009
    Messages:
    5,073
    Likes Received:
    2,730
    Gender:
    Male
    Home Page:
    On your URL, see how the pages have a structure:

    https://wgi.org/color-guard/search-groups/?cpage=2
    https://wgi.org/color-guard/search-groups/?cpage=3

    so you can just fabricate all the urls with the numbers on the end.

    like

    https://wgi.org/color-guard/search-groups/?cpage=4
    https://wgi.org/color-guard/search-groups/?cpage=5
    https://wgi.org/color-guard/search-groups/?cpage=6
    etc. The Generate tab in the email scraper plugin can do this.



    Then I would generate all those URLs. It goes up to 397.

    Then after that I would use the Link Extractor and extract all external links. Then load those external links into the email scraper and you should be good.

    Also, you could put the 397 fabricated URLs into the email scraper and have it scrape only those exact pages and not crawl.
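    In plain Python, fabricating those page URLs is a one-liner. The count of 397 comes from the post above, so verify it against the live site before running anything:

```python
# Fabricate the paginated URLs for the wgi.org group search.
# 397 is the last-page number quoted above; confirm it on the live site.
base = "https://wgi.org/color-guard/search-groups/?cpage={}"
urls = [base.format(n) for n in range(1, 398)]

print(len(urls))   # 397
print(urls[0])     # https://wgi.org/color-guard/search-groups/?cpage=1
```

    That list can then be pasted into the email scraper with crawling disabled, exactly as described above.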
     
  10. Darren Amato

    Darren Amato Newbie

    Joined:
    Oct 10, 2019
    Messages:
    5
    Likes Received:
    0
    Gender:
    Male
    100% of the time the crash occurs, it happens when I'm harvesting emails from URLs. I get an error message that says:

    "Access violation at address 75FFBC10 in module 'gdi32full.dll'. Read of address 0FE90000."
     
  11. loopline

    loopline Jr. VIP Jr. VIP

    Joined:
    Jan 25, 2009
    Messages:
    5,073
    Likes Received:
    2,730
    Gender:
    Male
    Home Page:
    If it's the normal email scraper, there is a bugreport.txt file in your ScrapeBox folder.

    If it's the email scraper plugin, the bug report file is in the ScrapeBox folder \ plugins \ email scraper folder.

    Send that to support; I'll PM you the email.