Hi loopline. Thanks for your answer.
The company I'm setting SB up for sends 4 million voice mails daily.
We have a machine with 24 GB of RAM and a 3.70 GHz CPU (2 processors).
What would be the maximum number of scraping threads we could achieve with this?
What proxies would you use? Rotating ones?
Could you recommend a provider and their package?
If you were to use Scrapebox to its maximum:
Where would you get the VM?
How many proxies would you use and from where?
To how many threads would you set it to?
Thanks a lot loopline.
Mihailo
I can't say for sure what the max scraping threads would be, but you could probably get near 2000, give or take. Windows can handle more than 200, sometimes, but sometimes it starts to have issues when you exceed that. If you want to scale, it's ideal to do as Google does: many smaller machines rather than fewer larger ones.
If you have the machine already, max it out and then go from there. But if you are adding more, you may find it better to run multiple smaller VPSs with 8 GB to 12 GB of RAM each.
It's not really possible to say for sure how many threads you can get, but it's easy enough to find out: you just keep pushing the limit until it breaks, back it off some, and then you know. You can run an unlimited number of instances of Scrapebox on a single machine in any case, so no worries there.
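If it helps to see that "push until it breaks, then back off" routine written down, here is a rough Python sketch. Everything in it is an assumption for illustration: `is_stable` is a hypothetical check you would supply yourself (for example, run a short test scrape at that thread count and report whether it finished cleanly), and the step size, ceiling, and 80% back-off are arbitrary starting values, not Scrapebox settings.

```python
from typing import Callable


def find_thread_limit(is_stable: Callable[[int], bool],
                      start: int = 100,
                      step: int = 100,
                      ceiling: int = 5000,
                      backoff_percent: int = 80) -> int:
    """Raise the thread count until a test run fails, then back off.

    is_stable is a hypothetical callback: it should return True when a
    test scrape at the given thread count ran without problems.
    """
    last_good = 0
    threads = start
    # Keep pushing the limit while the test run still succeeds.
    while threads <= ceiling and is_stable(threads):
        last_good = threads
        threads += step
    # Run somewhat below the highest count that worked, for headroom.
    return last_good * backoff_percent // 100


if __name__ == "__main__":
    # Stand-in for a real test: pretend the machine chokes above 2000 threads.
    print(find_thread_limit(lambda n: n <= 2000))
```

In practice the check is usually you watching the machine rather than a function, but the loop is the same idea.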
Rotating proxies are generally slow, as they are being used by multiple people, and the source could be a generic public proxy that's in the pool, or other slow sources. If you want speed, I would just use private proxies or shared private proxies. That way it's data center based, so you get speed and still have lots of IPs.
Unless you're hitting a LOT of URLs from a few domains, the quantity of proxies will only matter in relation to total volume/speed, and not the quantity of IPs. Google is not going to see the websites you're scraping from anyway.
I assume you already have a list of URLs/domains and you just need to scrape the phone numbers from them?
Provider: I would use my private proxy. I'll PM you a discount code and a list of providers.
The machine I would get from Solid SEO VPS; that's what I have used for years. But any place can work, it's just that they are aware of Scrapebox, so all is good.
How many proxies I would use, that's a good question. Ideally you want to use 10 connections per proxy. Sometimes you can use more, but it's a good place to start. So 2000 connections is 200 proxies, etc. So I would pick a number, say 150, 200, whatever, and try it. Then if all is well you can always add more proxies.
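The proxy math above is just a division, but here it is as a small Python sketch in case you want to plug in your own numbers. The function names are made up for this example, and the 10 connections per proxy is only the starting rule of thumb from above, not a hard limit.

```python
def proxies_needed(total_connections: int, connections_per_proxy: int = 10) -> int:
    """Proxies required to carry the given number of connections (rounded up)."""
    # Integer ceiling division, so 2001 connections would need 201 proxies.
    return -(-total_connections // connections_per_proxy)


def max_connections(proxy_count: int, connections_per_proxy: int = 10) -> int:
    """Connections a given pool of proxies can carry at the rule-of-thumb rate."""
    return proxy_count * connections_per_proxy


if __name__ == "__main__":
    print(proxies_needed(2000))   # 200
    print(max_connections(150))   # 1500
```

So if you start with 150 proxies, plan on roughly 1500 connections and add proxies as you raise the thread count.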
I would aim to get to 2000 threads, but you will need multiple instances.
At the end of the day, the name of the game here is to push the envelope of the server and see where it chokes, then back off and you're good. That's how I have always done it anyway. It's kind of like push every button, see what happens, and then you know how stuff works.
Mind you, that breaks stuff sometimes, and sometimes you have to clean up a mess, LOL. But I am a limit pusher, so it's what I do.