Powerful dedicated server, but ScrapeBox harvests only 143 URLs/second? What the??

Discussion in 'Black Hat SEO' started by xrumerexpert, Sep 16, 2010.

  1. xrumerexpert

    xrumerexpert Newbie

    Joined:
    Sep 2, 2010
    Messages:
    25
    Likes Received:
    0
    Hi,

    I've got ScrapeBox installed on a very powerful dedicated server with a dedicated 100 Mbps connection...

    When I set threads to 500 and use the 415 proxies harvested by ScrapeBox, the harvesting speed is 143 URLs/second. That's slower than my PC...

    I thought I'd get something like 3,000 URLs/second at least?

    Please help:)

    Thanks in advance:))
     
  2. hitman247

    hitman247 Executive VIP Premium Member

    Joined:
    Oct 12, 2008
    Messages:
    739
    Likes Received:
    1,851
    Occupation:
    Full Time IM
    Location:
    Your Six O'clock
    I don't see anything wrong with 143 URLs/sec... especially on shared public proxies.
     
  3. CyrusVirus

    CyrusVirus BANNED BANNED Premium Member

    Joined:
    Aug 20, 2009
    Messages:
    1,110
    Likes Received:
    686
    It's only going to be as powerful as your proxies, meaning you need to get some private proxies.
     
  4. lewi

    lewi Jr. VIP Jr. VIP Premium Member

    Joined:
    Aug 5, 2008
    Messages:
    2,309
    Likes Received:
    818
    I get around the same speed using the public proxies!

    The way to get around it is to run more than one instance of ScrapeBox (make sure to name them so you don't get confused).

    I know some people who run 10+ at a time!

    Personally I only run a max of 3!

    Lewi
     
  5. magpie2419

    magpie2419 Regular Member

    Joined:
    Mar 27, 2009
    Messages:
    313
    Likes Received:
    343
    Occupation:
    Millionaire
    Location:
    Newcastle UK Or Morocco
    If you have a dedicated box, didn't it come with IPs already? Mine came with 6 IPs. I think 143 URLs per second is good: that's more than half a million in an hour.
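To put the rates mentioned in this thread on the same scale, here is a small sketch that converts URLs per second into URLs per hour. The function name `urls_per_hour` is made up for illustration; the rates are the ones quoted in the posts.

```python
def urls_per_hour(urls_per_second: float) -> int:
    """Convert a harvesting rate from URLs/second to URLs/hour."""
    return round(urls_per_second * 3600)  # 3600 seconds in an hour


# Rates quoted in the thread: the OP's current speed and the speed the OP hoped for.
for rate in (143, 3000):
    print(f"{rate} URLs/s is {urls_per_hour(rate):,} URLs/hour")
```

At 143 URLs/second this works out to 514,800 URLs per hour, which matches the "more than half a million" figure.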
     
  6. xrumerexpert

    xrumerexpert Newbie

    Joined:
    Sep 2, 2010
    Messages:
    25
    Likes Received:
    0
    Yes, it came with 5 IPs... but how do I use them?

    Thanks:))

     
  7. crazyflx

    crazyflx Elite Member

    Joined:
    Nov 9, 2009
    Messages:
    1,674
    Likes Received:
    4,825
    Location:
    http://CRAZYFLX.COM
    There is no point in using 500 threads with so few public proxies. If I read your OP correctly, you're using 415 public proxies.

    I GUARANTEE you that, of those 415 public proxies, not all of them are going to be working, and at best about one tenth of them are going to have a latency of under 1000ms.

    Let's say that you're incredibly lucky and 400 of those 415 public proxies are working. Now let's say you're SUPER incredibly lucky and those 400 proxies have a latency of 3000ms (that latency is tested using a 1kb packet).

    First, let's take a look at connections:

    You're running 500 connections over 400 public proxies (if all are working... which I can assure you they aren't). That means that 100 of those public proxies are being forced to take two connections at the same time.
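The connection arithmetic above can be sketched in a few lines. This assumes connections are spread as evenly as possible across the proxies, which is only a simplifying assumption; how ScrapeBox actually assigns threads to proxies isn't stated in this thread. The function name `connections_per_proxy` is hypothetical.

```python
from typing import Dict


def connections_per_proxy(threads: int, proxies: int) -> Dict[int, int]:
    """Spread `threads` connections as evenly as possible over `proxies`.

    Returns {connections_on_one_proxy: how_many_proxies_carry_that_load}.
    Assumes an even spread; a real tool may pick proxies differently.
    """
    if proxies <= 0:
        raise ValueError("need at least one proxy")
    base, extra = divmod(threads, proxies)
    load: Dict[int, int] = {}
    if extra:
        load[base + 1] = extra            # these proxies take one extra connection
    if proxies - extra:
        load[base] = proxies - extra      # the rest carry the base load
    return load


print(connections_per_proxy(500, 400))  # the OP's settings, if 400 proxies work
print(connections_per_proxy(15, 50))    # a filtered list with few connections
```

With 500 threads and 400 working proxies, 100 proxies end up carrying two connections each. With 15 threads and 50 proxies, no proxy is ever doubled up.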

    Now, latency. Using a 1kb packet, your proxies have a latency of 3000ms. Now you're trying to gather ALLLLLLL this data...you're REALLY putting a strain on these proxies...imagine how many other people are using them.

    What's likely happening is that SB is spending more time looking for a viable proxy to use than it is actually scraping URLs.

    I can scrape at 1,500 URLs a second on my dedi using fewer proxies & fewer connections... less than a fifth of what you're using... know how?

    By using private proxies... and lots of them.

    Now, if you don't want to have to put out that kind of cash but want to scrape at much higher speeds, you need to make sure you're using only the best of the best public proxies.

    You do this by repeat filtering...lots of repeat filtering.

    Go to SB's proxy tester, remove duplicates and test your proxies. Remove the dead ones. Now, remove proxies that have a latency of over 10 seconds (10,000ms).

    Test them again. Remove dead ones. Remove latency of over 10 seconds.

    Test them again. Remove dead ones. Now remove latency of over 5 seconds.

    Test them again. Remove dead ones. Remove latency of over 2 seconds.

    Test them again. Remove dead ones.

    Test them again. Remove dead ones.

    Do this until you get ZERO dead proxies (it will happen....eventually).

    Now, remove proxies that have a latency of over 2000ms.
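For anyone who wants to script the repeat filtering instead of clicking through it, here is a minimal sketch of the same routine. It is not ScrapeBox's proxy tester: the `probe` callback is a hypothetical stand-in for whatever you use to test a proxy, and it is assumed to return the latency in milliseconds, or `None` for a dead proxy. The cutoffs are the ones from the steps above.

```python
from typing import Callable, Dict, Iterable, List, Optional, Tuple

# Hypothetical probe: returns latency in ms, or None if the proxy is dead.
Probe = Callable[[str], Optional[float]]


def run_test(pool: Iterable[str], probe: Probe) -> Tuple[Dict[str, float], int]:
    """Test every proxy once; return ({proxy: latency_ms} for live ones, dead count)."""
    alive: Dict[str, float] = {}
    dead = 0
    for proxy in pool:
        latency = probe(proxy)
        if latency is None:
            dead += 1
        else:
            alive[proxy] = latency
    return alive, dead


def repeat_filter(proxies: Iterable[str], probe: Probe,
                  cutoffs_ms=(10_000, 10_000, 5_000, 2_000),
                  final_cutoff_ms: float = 2_000,
                  max_rounds: int = 50) -> List[str]:
    pool = list(dict.fromkeys(proxies))           # remove duplicates, keep order
    for cutoff in cutoffs_ms:                     # test, drop dead, drop slow
        alive, _ = run_test(pool, probe)
        pool = [p for p, ms in alive.items() if ms <= cutoff]
    rounds = 0
    while True:                                   # retest until a pass has zero dead
        alive, dead = run_test(pool, probe)
        pool = list(alive)
        rounds += 1
        if dead == 0 or rounds >= max_rounds:     # max_rounds is a safety stop
            break
    return [p for p in pool if alive[p] <= final_cutoff_ms]


# Demo with made-up latencies instead of real network tests.
fake_latency = {"10.0.0.1:8080": 500, "10.0.0.2:8080": 12_000,
                "10.0.0.3:8080": None, "10.0.0.4:8080": 3_000,
                "10.0.0.5:8080": 1_500}
raw_list = list(fake_latency) + ["10.0.0.1:8080"]  # includes one duplicate
print(repeat_filter(raw_list, fake_latency.get))
```

The `max_rounds` limit is an addition of this sketch so the loop can't run forever on a flaky list; the steps above simply repeat until a pass comes back with zero dead proxies.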

    If you're only getting your public proxies from SB, then you're going to have about 50 proxies left after all of this.

    Using 15 connections and those 50 proxies, you'll harvest at almost 300 URLs per second.

    That's twice the speed you're currently getting, with only a fraction of your current settings (15 connections instead of 500, about 50 proxies instead of 415).

    You can't make a Chevy Cavalier perform like a Ferrari, no matter how hard you push on the gas pedal.

    You also can't make SB perform faster & better just by cranking up the settings. You have to make sure you're giving it the best "parts" before you even think about laying on the "gas".
     
  8. xrumerexpert

    xrumerexpert Newbie

    Joined:
    Sep 2, 2010
    Messages:
    25
    Likes Received:
    0
    Excellent Tutorial:))

    Really a great thanks:)



     
  9. bleach

    bleach Senior Member

    Joined:
    Oct 12, 2008
    Messages:
    934
    Likes Received:
    82
    Location:
    New York
    LOL
    I'm a fan of crazyflx :pcguru:
     
  10. xiphre

    xiphre Regular Member

    Joined:
    Jun 9, 2007
    Messages:
    290
    Likes Received:
    84
    Location:
    EU
    It could possibly help if you set up your own custom nameservers as well. Who are you hosting with, and where is their DC located? There really is no guarantee a dedicated server will be faster than your home PC, considering the location, their bandwidth providers and so on.