ScrapeBox Whois - is 17% success rate the best I can hope for?

Discussion in 'Black Hat SEO Tools' started by PickleYeast, Dec 28, 2016.

  1. PickleYeast

    PickleYeast Registered Member

    Joined:
    Dec 19, 2016
    Messages:
    51
    Likes Received:
    4
    Gender:
    Female
    Ok, I've now spent about 20 hours working with the ScrapeBox WHOIS addon. I've played with large sets of proxies harvested by ScrapeBox itself, I've used paid premium proxies, and I've tried various delay and retry settings. I'm only looking up .COM, and my internet connection is good: 12 Mbps up / 115 Mbps down.

    Conclusions:
    1) No matter what the settings are, with good proxies the success rate is always near 17%. The rest are "completed, no data", whatever that means.
    2) If you don't close and re-start the WHOIS addon every few hours, it will crash and lose your collected data.
    3) There is a bug that prevents it from opening URL files at times, claiming every file you try to open is "in use by another program" when it's not. Close the WHOIS addon and restart it, and the bug goes away.

    Has anyone found a way to improve upon 17% success?
     
  2. loopline

    loopline Jr. VIP Jr. VIP

    Joined:
    Jan 25, 2009
    Messages:
    3,726
    Likes Received:
    1,993
    Gender:
    Male
    Home Page:
    "Completed, no data" means there is no data that ScrapeBox can scrape. You can log the responses for a few of these domains and view them.

    It's either that there is actually no data, or the WHOIS server's format has changed and ScrapeBox can't find the markers it needs. In that case you can edit it yourself and/or pass it to support so they can update it for everyone.
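
    If you want to eyeball a raw response yourself, something like this quick Python sketch works (this is just my illustration, not ScrapeBox's actual code; whois.verisign-grs.com is the registry WHOIS server for .COM):

    ```python
    # Query a domain's WHOIS record directly over port 43 and dump the raw
    # response, so you can see whether the server really returned no data
    # or just changed its markers.
    import socket

    def raw_whois(domain, server="whois.verisign-grs.com", port=43, timeout=10):
        """Send a WHOIS query and return the raw response text."""
        with socket.create_connection((server, port), timeout=timeout) as sock:
            sock.sendall((domain + "\r\n").encode("ascii"))
            chunks = []
            while True:
                data = sock.recv(4096)
                if not data:
                    break
                chunks.append(data)
        return b"".join(chunks).decode("utf-8", errors="replace")

    if __name__ == "__main__":
        print(raw_whois("example.com"))
    ```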

    I get near 100% success rates, at least by what I define as success, which is completion.

    I can tell you scraped public proxies will be somewhere between terrible and completely terribly awful, so don't even bother with those.

    Paid proxies would be better.

    I don't have any issue with files being in use. I've actually never encountered that; it could be something on your machine not releasing the files. Try restarting your entire machine.

    How big are the URL sets you're working with? And are you using the 64-bit version or the 32-bit?

    If you go into your ScrapeBox folder >> addons >> whois addon >> bugreport.txt file, and send that to scrapeboxhelp (at] gmail *dot) com,

    they can have a look and tell you what is causing the crash.
     
  3. PickleYeast

    PickleYeast Registered Member

    Joined:
    Dec 19, 2016
    Messages:
    51
    Likes Received:
    4
    Gender:
    Female
    First of all, THANK YOU for the detailed reply!!

    I'm using the 64 bit version. I will start sending my bugreport files as you suggest - thanks for the tip!

    I pretty much randomly picked the company I bought the paid SOCKS proxies from -- I paid $12 for 7 days of access, which ends tomorrow. Perhaps that's just too cheap for good proxies? Could you suggest a supplier, in PM if you prefer? As I mentioned, my success rate seems to be the same with ScrapeBox-harvested or paid proxies.

    As I mentioned, closing and restarting the WHOIS addon stops the bug, so I know it's not an issue with my machine. I can reproduce the error 100% of the time by trying to re-import the same URL file. You'll get the error on that file, and on EVERY file you try to open after that, until you close the program.

    I do have it save the entire response for WHOIS scrapes, and nearly all of the 'completed, no data' files are zero length. That happens for over 80% of the domains in the file. I keep running the file over and over to get more responses, and I always eventually get the data, so I'm not sure if it's an issue with the WHOIS server, the proxy, or ScrapeBox. That's why I was saying in the other thread that I wish there were an option to re-process just the domains that failed, since I have to do that manually and close the program each time to avoid the file-open bug (a sketch of what I do by hand is below).
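
    For what it's worth, the retry pass I keep doing manually looks roughly like this in Python (just a sketch; it assumes each response was saved as <domain>.txt, which may not match how your copy names the files):

    ```python
    # Scan the folder of saved WHOIS responses, treat zero-length files as
    # failures, and write those domains back out as a list to re-run.
    import os

    def build_retry_list(response_dir, retry_file="retry_domains.txt"):
        failed = []
        for name in os.listdir(response_dir):
            path = os.path.join(response_dir, name)
            if name.endswith(".txt") and os.path.getsize(path) == 0:
                failed.append(name[:-len(".txt")])   # recover the domain name
        with open(retry_file, "w") as f:
            f.write("\n".join(failed))
        return failed

    failed = build_retry_list("whois_responses")
    print(f"{len(failed)} domains need another pass")
    ```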
     
  4. living2xl

    living2xl Jr. VIP Jr. VIP

    Joined:
    Dec 9, 2011
    Messages:
    1,596
    Likes Received:
    306
    Occupation:
    Sippin dat juice - Shout it louder!
    Location:
    Not sleeping!
    Home Page:
    git gud proxies boi
     
  5. loopline

    loopline Jr. VIP Jr. VIP

    Joined:
    Jan 25, 2009
    Messages:
    3,726
    Likes Received:
    1,993
    Gender:
    Male
    Home Page:
    You're welcome for the response. :)

    On proxies, it's possible they are not ideal for the WHOIS addon. You also no longer need SOCKS proxies for the WHOIS addon. These are the places I use/have used and recommend:
    http://scrapeboxfaq.com/scrapebox-proxies

    Does the file opening issue happen when you load the same list or any list?

    If the response for a given URL is empty and shows no data, but you rerun that URL as a test and get the data, then something has stopped the connection from completing with all the data. It might be that some security software is hijacking some of the threads, or that the proxies are failing, etc.

    So make sure you whitelist the addon in all anti-virus, malware checkers and firewalls. Note that disabling security software often only stops new rules from forming; existing rules still fire. So you need to fully whitelist. The WHOIS addon does use port 43, so outbound connections on that port must be allowed (quick check below).
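
    A quick way to confirm outbound port 43 isn't being blocked (my own sketch, not an official ScrapeBox tool):

    ```python
    # Try opening a plain TCP connection to a WHOIS server on port 43.
    # If this fails, a firewall or security rule is likely the culprit.
    import socket

    def can_reach_whois(server="whois.verisign-grs.com", port=43, timeout=5):
        try:
            with socket.create_connection((server, port), timeout=timeout):
                return True
        except OSError:
            return False

    print("port 43 reachable" if can_reach_whois()
          else "port 43 blocked - check firewall rules")
    ```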
     
  6. PickleYeast

    PickleYeast Registered Member

    Joined:
    Dec 19, 2016
    Messages:
    51
    Likes Received:
    4
    Gender:
    Female
    This deserves a follow-up, since I spoke ill of a product, and I was wrong.

    I took loopline's advice and bought good proxies, and ScrapeBox has proven to be extremely reliable, with errors less than 1% of the time; I think many of those are due to problems on the server side. It does still crash outright on occasion, but that's relatively rare.

    I've given up trying to configure the WHOIS addon to save more data, so I just have it save the entire response, and I wrote my own WHOIS parser to read those files and load the database. That was no easy task given the total lack of a standard format and lots of garbage characters (including carriage returns) in the data itself (a stripped-down sketch is below).
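
    A stripped-down sketch of the kind of parser I mean (illustrative Python only; the real one is messier, and "example.com.txt" is just a placeholder filename):

    ```python
    # Normalize line endings, strip non-printable junk, and pull out
    # "Key: Value" pairs despite the lack of a standard WHOIS format.
    import re

    def parse_whois(raw):
        # Collapse stray carriage returns and drop garbage characters.
        text = raw.replace("\r", "\n")
        text = re.sub(r"[^\x20-\x7e\n]", "", text)
        record = {}
        for line in text.split("\n"):
            # Most servers use "Key: Value", but spacing and casing vary wildly.
            match = re.match(r"\s*([A-Za-z][A-Za-z0-9 /._-]*?)\s*:\s*(.+)", line)
            if match:
                key = match.group(1).strip().lower()
                record.setdefault(key, match.group(2).strip())
        return record

    with open("example.com.txt", encoding="utf-8", errors="replace") as f:
        print(parse_whois(f.read()).get("registrar"))
    ```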

    I would like to use ScrapeBox to simply visit pages (using my proxies) and save each page as an HTML file (as the WHOIS process does), but there seems to be no way to get it to do that. Now that I've written a parser, I want to do my own parsing on other types of pages. Has anyone done that before?
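
    In Python with the third-party `requests` library, the idea would look something like this (the proxy address and URL are placeholders):

    ```python
    # Visit a URL through a proxy and save the raw HTML to disk, the way
    # the WHOIS addon saves its raw responses.
    import requests

    PROXIES = {"http": "http://user:pass@1.2.3.4:8080",
               "https": "http://user:pass@1.2.3.4:8080"}

    def save_page(url, out_path, timeout=15):
        resp = requests.get(url, proxies=PROXIES, timeout=timeout,
                            headers={"User-Agent": "Mozilla/5.0"})
        with open(out_path, "w", encoding="utf-8") as f:
            f.write(resp.text)

    save_page("http://example.com", "example_com.html")
    ```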
     
    • Thanks x 1
  7. loopline

    loopline Jr. VIP Jr. VIP

    Joined:
    Jan 25, 2009
    Messages:
    3,726
    Likes Received:
    1,993
    Gender:
    Male
    Home Page:
    HTTrack will do what you want, I think. It will save entire pages, anyway, but no, ScrapeBox will not do it. Glad that some good proxies helped out.

    Yes, the WHOIS addon was built to save only enough info to do contact marketing, if you will, and yes, there is no standard formatting, which is a pain. The ScrapeBox team spent many a day building the WHOIS addon, making sure it doesn't exceed the limits set by each server, building the parsing, etc... Glad you got what you wanted though.