SB harvesting question

Discussion in 'Black Hat SEO Tools' started by newestguy, Sep 3, 2012.

  1. newestguy

    newestguy Registered Member

    Joined:
    Aug 13, 2012
    Messages:
    66
    Likes Received:
    3
    This seems to be happening a lot to me lately: I scrape fresh proxies, they all test fine, I start harvesting, and it pulls from Google for maybe a minute or two and then hangs. Yahoo and the rest keep harvesting, but the proxy that was apparently banned by Google just sits there. Is there a setting I can check that will make it switch to a different proxy when it hangs up like that?
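
    In code terms, the behaviour being asked about is a hard per-request timeout plus rotation to the next proxy when one hangs. A minimal sketch outside ScrapeBox, in Python; the proxy addresses, URL and timeout value below are placeholders for illustration, not recommendations:

    Code:
    import requests

    proxies = ["1.2.3.4:8080", "5.6.7.8:3128"]  # placeholder public proxies

    def fetch_with_rotation(url, proxy_list, timeout=10):
        """Try each proxy in turn; move on if a request hangs, errors or is blocked."""
        for proxy in proxy_list:
            try:
                resp = requests.get(
                    url,
                    proxies={"http": f"http://{proxy}", "https": f"http://{proxy}"},
                    timeout=timeout,  # the hard timeout is what stops the "just sits there" hang
                )
                if resp.status_code == 200:
                    return resp.text
            except requests.RequestException:
                continue  # banned, dead or timed-out proxy -- try the next one
        return None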
     
  2. jschmidli

    jschmidli Junior Member Premium Member

    Joined:
    Jun 24, 2010
    Messages:
    107
    Likes Received:
    40
    What's your thread-to-proxy ratio? Your thread count should only be a small percentage of your total proxies, so you are not hitting the same proxy IP too often. Also, Google has tightened its limits on scraping lately, and if one IP does too much they will block the whole subnet for a few hours.

    Best bet is to get some dedicated proxies and treat them nicely so they will last.
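
    To put that ratio idea into code: far fewer worker threads than proxies, with each query paired to a proxy round-robin so no single IP gets hammered. A rough Python sketch; the proxy list, footprints and the 20% figure here are only illustrative:

    Code:
    import itertools
    from concurrent.futures import ThreadPoolExecutor

    import requests

    proxies = [f"10.0.0.{i}:8080" for i in range(1, 51)]  # 50 placeholder proxies
    queries = ["site:example.com", "powered by vbulletin"]  # placeholder footprints
    threads = max(1, len(proxies) // 5)  # roughly a 20% thread-to-proxy ratio

    def harvest(job):
        """Run one query through the proxy it was paired with."""
        query, proxy = job
        try:
            r = requests.get(
                "https://www.google.com/search",
                params={"q": query},
                proxies={"http": f"http://{proxy}", "https": f"http://{proxy}"},
                timeout=10,
            )
            return query, r.status_code
        except requests.RequestException:
            return query, None  # dead or blocked proxy

    # Pair each query with a proxy up front, cycling through the pool.
    jobs = zip(queries, itertools.cycle(proxies))
    with ThreadPoolExecutor(max_workers=threads) as pool:
        results = list(pool.map(harvest, jobs))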
     
  3. proxygo

    proxygo Jr. VIP Jr. VIP Premium Member

    Joined:
    Nov 2, 2008
    Messages:
    10,262
    Likes Received:
    8,710
    yup it's true, Google has cut the number of requests
    a proxy can make before they block it, they're tightening the reins
    there was a time when, if you chose to scrape with private proxies,
    10 was enough - not anymore - Google is getting way too strict
    you either need quite a few shared proxies
    or a bunch of public proxies shared between a few sources
    so they last longer - those are your choices
     
    Last edited: Sep 3, 2012
  4. newestguy

    newestguy Registered Member

    Joined:
    Aug 13, 2012
    Messages:
    66
    Likes Received:
    3
    So if I scrape up between 100 and 200 public proxies, how many threads should I set the limit at?

    I use private proxies for posting, but I was under the impression that if I tried to use my small number of private proxies for scraping, they would quickly get banned.
     
  5. jschmidli

    jschmidli Junior Member Premium Member

    Joined:
    Jun 24, 2010
    Messages:
    107
    Likes Received:
    40
    I would suggest running no more than 20% as many threads as proxies. So if you had 10 proxies, use 2 threads.

    Also, if you use any advanced operators such as inurl: you will get banned a lot quicker.
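
    Quick back-of-the-envelope for that guideline (the 20% figure is this rule of thumb, not an official ScrapeBox setting):

    Code:
    def max_threads(live_proxies, ratio=0.2):
        """Thread cap as a fraction of the proxies that actually work."""
        return max(1, int(live_proxies * ratio))

    print(max_threads(10))   # 2
    print(max_threads(100))  # 20, e.g. ~200 scraped public proxies with half of them dead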
     
  6. newestguy

    newestguy Registered Member

    Joined:
    Aug 13, 2012
    Messages:
    66
    Likes Received:
    3
    Holy crap... I am going to need a lot more proxies. I have come up with some decent scraping sources, but I am really topping out around 200, and about half of them die quickly. So I guess I should be limiting the threads to 20. Let's just say that was probably my problem :(
     
  7. jb2008

    jb2008 Senior Member

    Joined:
    Jul 15, 2010
    Messages:
    1,158
    Likes Received:
    972
    Occupation:
    Scraping, Harvesting in the Corn Fields
    Location:
    On my VPS servers
    These days it's impossible to harvest in any respectable numbers with Scrapebox. Check out hrefer instead.