
Scraping Question

Discussion in 'Black Hat SEO Tools' started by quickavenue, Feb 26, 2016.

  1. quickavenue

    quickavenue Newbie

    Joined:
    May 29, 2015
    Messages:
    7
    Likes Received:
    0
    I have a list of 1000+ domains.

    I'd like to scrape all the email addresses associated with each domain.

    How would I go about doing this without crawling every page of each site, harvesting all the emails, and then filtering for the addresses that contain each domain?
     
  2. Seowarlords

    Seowarlords Junior Member

    Joined:
    Oct 12, 2015
    Messages:
    101
    Likes Received:
    12
    You can run them through Scrapebox's WHOIS scraper or Link Assistant. Maybe these two would help you, but I'm only guessing.
     
  3. quickavenue

    quickavenue Newbie

    Joined:
    May 29, 2015
    Messages:
    7
    Likes Received:
    0
    seowarlords, yeah, I'm not interested in scraping the WHOIS data. I'm looking for search operators or some other method that uses Google to hunt out the email addresses to harvest. The other alternative is GSA Email Spider to scrape the WHOLE domain, but some of these domains have hundreds of thousands of pages, so it would take me ages.
     
  4. extremeboy

    extremeboy Jr. VIP Jr. VIP

    Joined:
    Jul 8, 2010
    Messages:
    3,185
    Likes Received:
    667
    Occupation:
    World Best RANK Tracker SERPCloud.com
    Better to try a service like Email Hunter.
     
  5. quickavenue

    quickavenue Newbie

    Joined:
    May 29, 2015
    Messages:
    7
    Likes Received:
    0
    That's what I'm thinking too. How do you think they're acquiring their data? A LOT of scraping? I could get the same results if I had 10x the computers scraping the internet all day :D
     
  6. ChanzGrande

    ChanzGrande Elite Member

    Joined:
    Feb 16, 2008
    Messages:
    2,484
    Likes Received:
    1,172
    Occupation:
    Accountant
    Location:
    Northern Woods Counting Money
    Well, are we dealing with hypotheticals, or do you really want the data? Obviously you don't have 10x computers scraping. For something like this, perhaps you can scrape enough of the data with short-term solutions to generate enough leads converting to cash, so you can buy either more expensive tools or the other equipment you want to mine this data. Of course, it's always an option to simply get whatever you can from the information brokers.
     
  7. FutureProofSeo

    FutureProofSeo Senior Member

    Joined:
    Jul 3, 2013
    Messages:
    934
    Likes Received:
    459
    Gender:
    Male
    Occupation:
    Google Warrior
    Location:
    UK
    You could filter the pages down to ones that might mention emails, using footprints such as "@domain.com", "@gmail.com", or "@outlook.com".
     
  8. Repulsor

    Repulsor Power Member

    Joined:
    Jun 11, 2013
    Messages:
    766
    Likes Received:
    275
    Location:
    PHP Scripting ;)
    I don't know of any tool that can do the custom function you are looking for. If you have some budget for this, you can easily have a tool built that crawls each domain a single level deep, scrapes all the emails, and keeps only those that match the respective domain.

    Well, if you want to go multiple levels down, the time you would need will go up exponentially.
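
The "scrape all emails and filter by domain" step described above can be sketched roughly as follows. This is only a minimal illustration: it assumes the page text has already been fetched, and the function name `emails_for_domain`, the regular expression, and the sample text are made up for the example rather than taken from any tool mentioned in the thread.

```python
import re

# Deliberately simple pattern (an assumption for this sketch);
# it will not cover every valid address form.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def emails_for_domain(text, domain):
    """Return the unique addresses in `text` whose host is `domain` or one of its subdomains."""
    domain = domain.lower()
    found = set()
    for match in EMAIL_RE.findall(text):
        address = match.lower()
        host = address.rsplit("@", 1)[1]
        # Keep exact matches and subdomains, but not look-alikes such as notexample.com.
        if host == domain or host.endswith("." + domain):
            found.add(address)
    return sorted(found)


# Hypothetical page text, used only to show the call.
sample = "Contact sales@example.com or Bob@mail.example.com, not someone@gmail.com."
print(emails_for_domain(sample, "example.com"))
```

Matching on the host part after the "@" rather than on a plain substring avoids counting addresses at unrelated domains that merely end with the same letters.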
     
  9. HelloInsomnia

    HelloInsomnia Jr. Executive VIP Jr. VIP

    Joined:
    Mar 1, 2009
    Messages:
    1,825
    Likes Received:
    2,936
    Well, you mentioned search operators, and you can try these. They will likely get you most of the information, but they will not guarantee a 100% complete data set. You also mentioned all of the emails associated with the domain, and this will only get you contact emails, I suppose. Sometimes sites also use a contact form instead of an email address.

    That being said, try these to start:

    Code:
    site:domain.com inurl:contact
    
    Code:
    site:domain.com inurl:about
    
    You can easily do this in Scrapebox and then harvest the emails from Scrapebox as well.
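
For a list of 1000+ domains, typing these operators by hand is impractical, so the query strings could be generated first and then pasted into whatever harvester is used. The sketch below only builds the strings and does not contact any search engine. The names `OPERATOR_TEMPLATES` and `build_queries` are hypothetical, and the third template is an assumption that borrows the "@domain.com" footprint suggested earlier in the thread.

```python
# Templates for the operators from this thread; the last one is the
# "@domain.com" footprint idea, added here as an assumption.
OPERATOR_TEMPLATES = [
    "site:{domain} inurl:contact",
    "site:{domain} inurl:about",
    'site:{domain} "@{domain}"',
]


def build_queries(domains, templates=OPERATOR_TEMPLATES):
    """Return one query string per (domain, template) pair, skipping blank entries."""
    queries = []
    for raw in domains:
        domain = raw.strip().lower()
        if not domain:
            continue
        for template in templates:
            queries.append(template.format(domain=domain))
    return queries


# Hypothetical input list, used only to show the call.
for query in build_queries(["Example.com", "  ", "example.org"]):
    print(query)
```

Keeping the operators in a list makes it easy to add or drop a footprint without touching the loop.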
     
  10. extremeboy

    extremeboy Jr. VIP Jr. VIP

    Joined:
    Jul 8, 2010
    Messages:
    3,185
    Likes Received:
    667
    Occupation:
    World Best RANK Tracker SERPCloud.com
    Agreed with the above, mate @davers: instead of setting up your own system, use an already-built system to get your work done.

    If you are going to need this type of solution for the very long term and know you will get an ROI on it, then go ahead. Otherwise, it's not worth it.
     
  11. sms2indo

    sms2indo Registered Member

    Joined:
    Jul 16, 2014
    Messages:
    99
    Likes Received:
    12
    Occupation:
    http://sms2indo.com
    Location:
    http://sms2indo.com
    Why not use GOOGLE / BING to help with your EMAIL scraping?