
PHP CURL with proxies being banned

Discussion in 'General Programming Chat' started by tecto, Mar 5, 2014.

  1. tecto

    tecto Newbie

    Joined:
    Oct 27, 2013
    Messages:
    4
    Likes Received:
    0
    Hi folks,

    I'm developing a free SEO tool that checks PageRank for thousands of domains with PHP cURL. The problem is that after a while I get banned.

    The proxies I'm using (Google-passed HTTP ones) work great with Scrapebox: I can check over 90,000 domains with over 50 proxies in 3 hours. But I don't know how Scrapebox hides these public proxies.

    Any experts in this area of knowledge?
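    For context, the requests go through cURL's proxy options, roughly like this (a simplified sketch; $proxy and $url are placeholders and error handling is stripped):

    PHP:
    <?php
    // Fetch a URL through an HTTP proxy with PHP cURL.
    // $proxy is "ip:port" (e.g. "1.2.3.4:8080"); $url is the page to fetch.
    function fetch_via_proxy($url, $proxy, $timeout = 15) {
        $ch = curl_init($url);
        curl_setopt_array($ch, array(
            CURLOPT_RETURNTRANSFER => true,            // return the body instead of printing it
            CURLOPT_PROXY          => $proxy,          // route the request through the proxy
            CURLOPT_PROXYTYPE      => CURLPROXY_HTTP,  // plain HTTP proxies
            CURLOPT_CONNECTTIMEOUT => $timeout,
            CURLOPT_TIMEOUT        => $timeout,
            CURLOPT_FOLLOWLOCATION => true,
        ));
        $body = curl_exec($ch);
        $code = curl_getinfo($ch, CURLINFO_HTTP_CODE);
        curl_close($ch);
        return array($code, $body);                    // false body means the request failed
    }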
     
  2. innozemec

    innozemec Jr. VIP Jr. VIP

    Joined:
    Aug 19, 2011
    Messages:
    5,290
    Likes Received:
    1,799
    Location:
    www.Indexification.com
    Home Page:
    Yes, it is normal to get banned. You are probably overusing the proxies in a short time, so Google bans the IP.

    You need to add a sleep between multiple checks from the same proxy and come up with a safe algorithm to rotate proxies across the checks.
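    Something like this as a starting point (a rough, untested sketch; the proxy list, domain list and delay are placeholders you'd tune yourself):

    PHP:
    <?php
    // Round-robin proxy rotation with a cooldown before the same proxy is reused.
    $proxies = array('1.2.3.4:8080', '5.6.7.8:3128');   // your proxy list
    $domains = array('example.com', 'example.org');     // domains to check
    $delay   = 5;                                       // seconds between checks on the same proxy
    $lastUse = array();                                 // proxy => unix timestamp of last use

    foreach ($domains as $i => $domain) {
        $proxy = $proxies[$i % count($proxies)];        // simple round-robin rotation
        if (isset($lastUse[$proxy]) && (time() - $lastUse[$proxy]) < $delay) {
            sleep($delay - (time() - $lastUse[$proxy]));  // enforce the per-proxy cooldown
        }
        $lastUse[$proxy] = time();

        $ch = curl_init('http://' . $domain . '/');
        curl_setopt_array($ch, array(
            CURLOPT_RETURNTRANSFER => true,
            CURLOPT_PROXY          => $proxy,
            CURLOPT_TIMEOUT        => 15,
        ));
        $body = curl_exec($ch);
        curl_close($ch);
        // ... parse $body and record the result here
    }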
     
  3. netbull2007

    netbull2007 Newbie

    Joined:
    Mar 7, 2014
    Messages:
    8
    Likes Received:
    6
    And use some elite proxies. Public ones don't hide you...
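    A quick way to weed out transparent proxies is to request a header-echo page through each one and see whether your real IP (or a Via header) leaks; httpbin.org/headers is assumed here as the echo endpoint, and $myRealIp is a placeholder:

    PHP:
    <?php
    // Returns true if the proxy does not reveal the origin IP or announce itself.
    function proxy_looks_anonymous($proxy, $myRealIp) {
        $ch = curl_init('http://httpbin.org/headers');  // echoes the request headers as JSON
        curl_setopt_array($ch, array(
            CURLOPT_RETURNTRANSFER => true,
            CURLOPT_PROXY          => $proxy,
            CURLOPT_TIMEOUT        => 15,
        ));
        $body = curl_exec($ch);
        curl_close($ch);
        if ($body === false) {
            return false;  // dead proxy, not usable anyway
        }
        // Transparent proxies tend to add X-Forwarded-For with your IP or a Via header.
        return strpos($body, $myRealIp) === false && stripos($body, '"via"') === false;
    }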
     
  4. divok

    divok Senior Member

    Joined:
    Jul 21, 2010
    Messages:
    1,015
    Likes Received:
    634
    Location:
    http://twitter.com/divok
    I hope you are using user agents, because some hosts ban IPs if they don't see common user agents.
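    For example, a common browser string set via CURLOPT_USERAGENT (the exact string below is just an example):

    PHP:
    <?php
    // Send a normal browser User-Agent instead of cURL's default (often empty) one.
    $ua = 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/33.0.1750.117 Safari/537.36';

    $ch = curl_init('http://example.com/');
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_USERAGENT, $ua);  // the important part
    $body = curl_exec($ch);
    curl_close($ch);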
     
  5. gtownfunk

    gtownfunk Registered Member

    Joined:
    Jan 26, 2011
    Messages:
    99
    Likes Received:
    26
    Occupation:
    Software Developer
    Location:
    Austin, TX
    Home Page:
    Rotating user agents is important too. I know there are some higher-end subscriptions people can get that let them block certain 'signatures' and 'fingerprints' or whatever. I had a custom User-Agent once that was scraping about 30,000 websites a day, and one day about 2% of them started failing (random, unrelated sites spread all over the US geographically)... It sounds like a small number, but that was around 500 websites that started blocking it. Voila, a User-Agent change fixed it.
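    A simple way to rotate is to keep a pool of agent strings and pick one per request, roughly like this (the strings are just examples, swap in whatever current browsers you like):

    PHP:
    <?php
    // Pick a random User-Agent from a pool for each request.
    $userAgents = array(
        'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/33.0.1750.117 Safari/537.36',
        'Mozilla/5.0 (Windows NT 6.1; rv:27.0) Gecko/20100101 Firefox/27.0',
        'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_1) AppleWebKit/537.73.11 (KHTML, like Gecko) Version/7.0.1 Safari/537.73.11',
    );

    function fetch_with_random_ua($url, array $userAgents) {
        $ch = curl_init($url);
        curl_setopt_array($ch, array(
            CURLOPT_RETURNTRANSFER => true,
            CURLOPT_USERAGENT      => $userAgents[array_rand($userAgents)],  // rotate per request
            CURLOPT_TIMEOUT        => 15,
        ));
        $body = curl_exec($ch);
        curl_close($ch);
        return $body;
    }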

    gtownfunk