Proxies needed to scrape PageRank?

Discussion in 'Proxies' started by el_gring0, Sep 5, 2011.

  1. el_gring0

    el_gring0 Newbie

    Joined:
    Jun 22, 2011
    Messages:
    3
    Likes Received:
    0
    Hello,

    I'm finding public proxies terribly slow for scraping PageRank (using ScrapeBox). I'd like to look up somewhere over 1,000 URLs per minute (I can do this without proxies, but I quickly get timed out).

    Is that sort of speed possible with public proxies? What is the cheapest, lowest-effort way to get proxies capable of this?

    Thanks
     
  2. darshan1994

    darshan1994 BANNED BANNED

    Joined:
    Oct 9, 2009
    Messages:
    654
    Likes Received:
    318
    Find Google-passed proxies. Many providers on this forum will give you a few hundred Google-clear proxies a day. If not, go with 10 private proxies, harvest around 20-30k PR values before the temp ban hits, and repeat.
     
  3. el_gring0

    el_gring0 Newbie

    Joined:
    Jun 22, 2011
    Messages:
    3
    Likes Received:
    0
    Also, how does having more proxies help ScrapeBox look up PR faster? As far as I understand, it can only use one proxy per PageRank request, so I don't see why more would necessarily be better.
     
  4. el_gring0

    el_gring0 Newbie

    Joined:
    Jun 22, 2011
    Messages:
    3
    Likes Received:
    0
    Thanks. Can you recommend a good source for private proxies? I find plenty by Googling but a lot of the sites look pretty sketchy.
     
  5. gbomb45

    gbomb45 Newbie

    Joined:
    Jun 5, 2011
    Messages:
    39
    Likes Received:
    2
    Type "proxy list" into Google or something.
     
  6. Fathom

    Fathom Power Member

    Joined:
    Jul 1, 2011
    Messages:
    518
    Likes Received:
    282
    Location:
    Hertfordshire
    If you are adding a huge number of URLs to ScrapeBox, it will be slow. Break them down into 10k chunks and the speed should be just fine.
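
The chunking advice above can be done outside the tool before importing a list. This is only a minimal sketch: the `chunked` helper and the placeholder URLs are made up for illustration and are not part of ScrapeBox.

```python
from typing import Iterator, List


def chunked(urls: List[str], size: int = 10_000) -> Iterator[List[str]]:
    """Yield successive slices of `urls`, each at most `size` items long."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(urls), size):
        yield urls[start:start + size]


# Hypothetical input: 25,000 placeholder URLs.
urls = [f"http://example.com/page{i}" for i in range(25_000)]
batches = list(chunked(urls))
batch_sizes = [len(batch) for batch in batches]
print(batch_sizes)  # [10000, 10000, 5000]
```

Each batch could then be written to its own text file and loaded one at a time.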
     
  7. Autumn

    Autumn Elite Member

    Joined:
    Nov 18, 2010
    Messages:
    2,197
    Likes Received:
    3,041
    Occupation:
    I figure out ways to make money online and then au
    Location:
    Spamville
    I'm not an SB user, but generally if you're scraping anything using proxies then you can multithread and use one proxy per thread. So if you have 50 proxies, you can run 50 threads at a time. If you bombard whatever you're trying to scrape with 50 threads from the same IP, most services are going to block you pretty quickly.

    With my own scraping system, I have about 50 threads running 24/7 and I'll spike it up to 500-1000 if I need to do a big scrape in a hurry. However, slow and steady generally wins the race when it comes to SERP scraping.
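
A rough sketch of the one-proxy-per-thread idea, assuming a thread pool sized to the proxy list and a queue that hands each proxy to only one request at a time. The function names, the proxy addresses, and `fake_fetch` (a stand-in for a real HTTP request) are all hypothetical; this is not how ScrapeBox or the poster's own system is known to work.

```python
import queue
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List


def scrape_with_proxies(
    urls: List[str],
    proxies: List[str],
    fetch: Callable[[str, str], str],
) -> Dict[str, str]:
    """Run one worker thread per proxy; each proxy serves one request at a time."""
    if not proxies:
        raise ValueError("need at least one proxy")
    free_proxies = queue.Queue()
    for proxy in proxies:
        free_proxies.put(proxy)

    def task(url: str) -> str:
        proxy = free_proxies.get()      # blocks until a proxy is free
        try:
            return fetch(url, proxy)
        finally:
            free_proxies.put(proxy)     # hand the proxy back for the next URL

    with ThreadPoolExecutor(max_workers=len(proxies)) as executor:
        results = list(executor.map(task, urls))
    return dict(zip(urls, results))


def fake_fetch(url: str, proxy: str) -> str:
    # Stand-in for a real HTTP request routed through `proxy`.
    return f"{url} via {proxy}"


proxies = [f"10.0.0.{i}:8080" for i in range(1, 6)]    # made-up addresses
urls = [f"http://example.com/{i}" for i in range(20)]
results = scrape_with_proxies(urls, proxies, fake_fetch)
```

With this layout, adding proxies raises the number of requests in flight without raising the load coming from any single IP.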
     
  8. NIXMY

    NIXMY Regular Member Premium Member

    Joined:
    Sep 26, 2010
    Messages:
    481
    Likes Received:
    321
    Location:
    myproxylists.com
    This is the reason I work on Linux. It's typical of Windows tools that when you feed them more than the usual amount of data, they crash, slow down, and/or become unreliable.

    Does anyone have an idea how many PR requests you can make from one IP before it gets temporarily banned?
     
  9. madoctopus

    madoctopus Supreme Member

    Joined:
    Apr 4, 2010
    Messages:
    1,249
    Likes Received:
    3,498
    Occupation:
    Full time IM
    I use shared proxies from http://buyproxies.org/ for this, then fall back on their dedicated proxies, then on public proxies, and finally on a direct connection. I fall back a lot, as you see :)

    They work well and are cheap but obviously are not free.
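
The fallback order described above (shared, dedicated, public, direct) could be sketched roughly like this. Everything here is an assumption for illustration: the tier names, the proxy addresses, the `fetch_with_fallback` helper, and `fake_fetch`, which simulates blocked proxies instead of making real requests. `None` stands for a direct connection.

```python
from typing import Callable, List, Optional, Sequence, Tuple


def fetch_with_fallback(
    url: str,
    tiers: Sequence[Tuple[str, List[Optional[str]]]],
    fetch: Callable[[str, Optional[str]], str],
) -> Tuple[str, str]:
    """Try each tier in order; return (tier_name, result) from the first success."""
    last_error: Optional[Exception] = None
    for name, proxies in tiers:
        for proxy in proxies:
            try:
                return name, fetch(url, proxy)
            except Exception as exc:    # a real scraper would catch narrower errors
                last_error = exc
    raise RuntimeError(f"all tiers failed for {url}") from last_error


DEAD = {"shared-1:8080", "dedicated-1:8080"}    # pretend these are blocked


def fake_fetch(url: str, proxy: Optional[str]) -> str:
    # Stand-in for a real request; fails for proxies listed in DEAD.
    if proxy in DEAD:
        raise ConnectionError(f"{proxy} is blocked")
    return f"{url} via {proxy or 'direct'}"


tiers = [
    ("shared", ["shared-1:8080"]),
    ("dedicated", ["dedicated-1:8080"]),
    ("public", ["public-1:8080"]),
    ("direct", [None]),
]
tier_used, body = fetch_with_fallback("http://example.com", tiers, fake_fetch)
print(tier_used)  # public
```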
     
  10. mmBob

    mmBob Newbie

    Joined:
    Feb 16, 2009
    Messages:
    31
    Likes Received:
    6
    Alternatively, you could use The Easy API (theeasyapi dot com). They have a PR scraping service.
     
  11. NIXMY

    NIXMY Regular Member Premium Member

    Joined:
    Sep 26, 2010
    Messages:
    481
    Likes Received:
    321
    Location:
    myproxylists.com
    Thanks for the idea. Looks like I need to do some testing on PR scraping to see how fast a single IP gets banned.