
Scrapebox connections failure

Discussion in 'Black Hat SEO' started by hillbilly181, Oct 20, 2011.

  1. hillbilly181

    hillbilly181 Newbie

    Joined:
    Oct 3, 2011
    Messages:
    35
    Likes Received:
    3
    How do I get scrapebox to not drop to 0 connections before it hits even 50k results?

    I have private proxies - no go.
    I subscribed to a public proxy list, 2000 working a day and $60 a month - still no go.

    No matter what I do, what I set connections to, timeouts to, proxies I use, I get it down to 0 connections real quick.

    The thing is, though, that there's no way scrapebox failed on all 2k working proxies. I even go into yahoo and type in searches with one of those proxies and it works fine. But scrapebox fails and drops connections to 0, even when there are still working proxies to be used (many, many).

    Is there a setting somewhere in there that I'm missing? An addon? I don't know how you can harvest urls without scrapebox connections dying out quick. It's frustrating.

    On yahoo I get error 999, but it hasn't even tried 90% of the damn proxies I have.
     
    Last edited: Oct 20, 2011
  2. xbox360gurl70s

    xbox360gurl70s Elite Member

    Joined:
    Sep 28, 2008
    Messages:
    1,532
    Likes Received:
    349
    Location:
    In your wet dreams
    It could be your internet connection capping your scrape. All the more reason to go for a VPS.
     
  3. mrpega

    mrpega Regular Member

    Joined:
    Sep 19, 2008
    Messages:
    352
    Likes Received:
    88
    Is it me, or has VPS recently gained much of the spotlight when discussing scrapebox? :)
     
  4. hillbilly181

    hillbilly181 Newbie

    Joined:
    Oct 3, 2011
    Messages:
    35
    Likes Received:
    3

    Really? Why would the internet care if the bandwidth is going towards a scrape?
     
  5. Moayediyan

    Moayediyan Regular Member

    Joined:
    Oct 29, 2008
    Messages:
    239
    Likes Received:
    88
    It's because you don't have to deal with shit like this. I run mine also on a VPS. Couldn't be any happier
     
  6. hillbilly181

    hillbilly181 Newbie

    Joined:
    Oct 3, 2011
    Messages:
    35
    Likes Received:
    3
    Would a VPS definitely solve my problem? I'll get one in a heartbeat if so. I'm just sick of spending money on pointless things. I bought that dumb proxy list for $60 thinking the problem was my proxies, like everyone else was saying. Go me for being a naive noob.
     
  7. hillbilly181

    hillbilly181 Newbie

    Joined:
    Oct 3, 2011
    Messages:
    35
    Likes Received:
    3
    Just went out and bought 8gb of ram, still a no go. I swear it has nothing to do with my computer. Are there any settings I should tick? I've adjusted my connections and my timeouts, tried everything, and scrapebox won't go more than 5k results without freezing up.
     
  8. Cnotey

    Cnotey Power Member

    Joined:
    Jun 25, 2010
    Messages:
    707
    Likes Received:
    912
    Location:
    Seattle
    I just posted a thread about this the other day. I was having the same problem as well, but I fixed it. I guarantee you it's not your proxies. I turned off multi-threading harvesting, and just watched the results to see what the problem was. Problem solved.

    I just stopped using Google to scrape URL's and started using Bing, and now it works like a charm. Scraped 5,000,000 urls so far today.
     
  9. hillbilly181

    hillbilly181 Newbie

    Joined:
    Oct 3, 2011
    Messages:
    35
    Likes Received:
    3
    Bing! I gotta try that. I actually have a few times but I got lazy because of the API stuff it makes you do. I'm gonna try it now. Thanks for the awesome tip.
     
  10. dowser

    dowser Power Member

    Joined:
    Jun 5, 2011
    Messages:
    685
    Likes Received:
    122
    Location:
    canada
    I think you missed the boat somewhere! I scrape only g-oog-le (to get it indexed right away), use on average fewer than 200 public proxies, and get 20-50k urls per session (I like smaller runs as I have only 1gb ram on my vps).

    I suspect it has something to do with the keywords and footprints you use.
     
  11. hillbilly181

    hillbilly181 Newbie

    Joined:
    Oct 3, 2011
    Messages:
    35
    Likes Received:
    3
    Bing is the fix guys. (went to 37k w/out proxies)
    Turn off multi-thread harvesting and hit up Bing. You'll be golden.
     
    Last edited: Oct 21, 2011
  12. loopline

    loopline Jr. VIP

    Joined:
    Jan 25, 2009
    Messages:
    3,383
    Likes Received:
    1,801
    Gender:
    Male
    No, a VPS would not solve your issue, or at least it's not likely to, and while it's better to have more RAM than less, 8GB is not needed.


    There are many things to consider, basically. Only using Bing is an option, but you're basically avoiding the issue and skirting around it.

    You can go to settings >> adjust multi-threaded harvester proxy retries. Run it up to 20. What is most likely happening is that your connections are turned up so high that Google is banning your IP. As for the proxy list you bought, many other people are using those same IPs and getting them banned quickly as well.

    Running up the proxy retries controls how many proxies scrapebox will retry before skipping the keyword. You wouldn't want it to default to trying all 2000 proxies on each keyword or it would take a century to get anything done. The point of the multi threaded harvester is to rip as many results from the engines as quickly as possible. If you are after accuracy then go for the single threaded harvester.
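As a rough illustration of what a retry cap like this does, here is a minimal Python sketch. It is not how scrapebox is actually implemented; `harvest_keyword`, `fetch`, and `max_retries` are hypothetical names made up for the example, and `fetch` is assumed to return `None` when a proxy fails.

```python
import random
from typing import Callable, Optional, Sequence


def harvest_keyword(
    keyword: str,
    proxies: Sequence[str],
    fetch: Callable[[str, str], Optional[list]],
    max_retries: int = 20,
) -> Optional[list]:
    """Try at most max_retries different proxies, then skip the keyword."""
    # Sample without replacement so no proxy is tried twice for one keyword.
    attempts = min(max_retries, len(proxies))
    for proxy in random.sample(list(proxies), attempts):
        results = fetch(keyword, proxy)  # hypothetical: None means the proxy failed
        if results is not None:
            return results
    # Retry budget used up: give up on this keyword instead of
    # walking through the whole proxy list.
    return None
```

With 2000 proxies and a cap of 20, a keyword that keeps failing costs 20 attempts rather than 2000, which is the trade-off between speed and completeness described above.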

    As for troubleshooting the problem, as mentioned above, go to settings >> use multi threaded harvester and uncheck that. Then you get a status column that gives you error codes, and you can troubleshoot from there. If you watch it, I would bet that 9 times out of 10, when things start going down to 0 connections, it says "IP blocked" for Google, i.e. the proxies' IPs are banned.

    The 999 yahoo error means IP Blocked by yahoo.
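If you copy the status column out by hand, a small tally makes it obvious whether blocks dominate. This is only a sketch in Python: the `(engine, status)` row format and the `summarize_statuses` name are assumptions for illustration, and the only facts taken from above are the "IP blocked" status and Yahoo's 999 code.

```python
from collections import Counter


def summarize_statuses(rows):
    """Count statuses per engine; rows is an iterable of (engine, status) pairs."""
    counts = Counter()
    for engine, status in rows:
        engine = engine.lower()
        status = str(status).strip()
        # Yahoo's 999 means the IP is blocked, so count it in the same bucket.
        if engine == "yahoo" and status == "999":
            status = "IP blocked"
        counts[(engine, status)] += 1
    return counts
```

Calling `summarize_statuses(rows).most_common(3)` on a few hundred rows should show quickly whether "IP blocked" is the main reason connections are dropping.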

    You can scrape Google just fine with private proxies as well, just set your connections to 20% of your proxy count as a rule. So if you have 10 private proxies, set it to 2 connections. Private proxies are so fast you will still get results quickly, and there will be enough of a delay that in general you won't get your IPs banned. Yahoo seems to be a bit stricter, but the rule of thumb still works well for them too for the most part.
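The 20% rule of thumb is simple enough to write down as a tiny Python helper. The function name and the floor of 1 connection are assumptions added for the example; only the 20% figure and the "10 proxies, 2 connections" case come from the rule above.

```python
def recommended_connections(proxy_count: int, percent: int = 20) -> int:
    """Return a connection count equal to `percent` of the proxy count."""
    # Integer math avoids floating-point rounding surprises.
    connections = proxy_count * percent // 100
    # Assumption: never go below one connection, or nothing would run.
    return max(1, connections)


print(recommended_connections(10))  # 10 private proxies -> 2 connections
```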

    So there is nothing wrong with the scrapebox scraper, and it is not just dropping to 0 connections as I can scrape millions of urls from Google in 1 run, all with private proxies. I do it all the time. You just have to understand how it all works, and slow down enough to fly under the radar.

    It's great that Bing is working for you, but it's not the "solution" as it were. Google and Yahoo work fine, you're just going too fast. Worst case, turn private proxies on and use the single threaded harvester with Google and Yahoo and you will probably do great as well.
     
  13. Cnotey

    Cnotey Power Member

    Joined:
    Jun 25, 2010
    Messages:
    707
    Likes Received:
    912
    Location:
    Seattle
    Confirmed this works as well. I should mention though that I've never had a problem with using 300+ connections for URL harvesting for the last year. It's only the last month or two I've had this issue.

    Dial your connections down to 10 for the Google scraper and you will be golden. I will mention, however, that I get about 20 URLs/sec less with Google than with Bing. At that rate, that's roughly an extra 72,000 URLs an hour using Bing (20 x 3,600 seconds).