Are you having problems scraping Google?

Discussion in 'Proxies' started by playboy, Nov 24, 2011.

  1. playboy

    playboy Regular Member

    Joined:
    Mar 7, 2010
    Messages:
    266
    Likes Received:
    55
    Location:
    Bangkok
    Maybe it is just me, but do you have problems scraping Google with proxies?

    I am using proxies, but they don't seem to be working very well anymore (and they are private proxies). After a certain amount of usage (usually a few minutes) they stop working. It used to be that I could scrape a lot more on Google and get much better results.

    Anyone having the same problems? Or is it just me?
     
  2. Jrim_Software

    Jrim_Software Power Member

    Joined:
    Aug 1, 2011
    Messages:
    775
    Likes Received:
    180
    Home Page:
    Google may have recently changed its search engine to recognize certain search queries as 'robotic'. Google will then block specific, robotic-looking requests from your proxy, on the rationale that the search query is too advanced for a human to come up with.
     
  3. sknight

    sknight Newbie

    Joined:
    Nov 26, 2010
    Messages:
    32
    Likes Received:
    11
    I'm having this problem too.
     
  4. RandomName

    RandomName Junior Member

    Joined:
    Apr 14, 2009
    Messages:
    105
    Likes Received:
    74
    My search queries are pretty advanced (compared to a "normal" googler) and I'm having no issues scraping today, without proxies, so I'd guess your problem comes from either bad proxies or scraping too quickly.

    RN
     
  5. proxygo

    proxygo Jr. VIP Jr. VIP Premium Member

    Joined:
    Nov 2, 2008
    Messages:
    10,598
    Likes Received:
    8,818
    1 - If you scrape with your own IP, you're likely to
    get banned from Google.

    2 - The reason your private proxies are getting blocked
    from scraping is an easy one: after so many requests from
    the same proxy, Google blocks it. It's that simple.

    Anyone who is having bad results with public proxies
    either has bad sources or a bad supplier using bad sources.
    I've supplied public proxies for Google scraping for 2 years here
    on BHW, with 450 signups and 0 refunds.

    Using public proxies to scrape isn't about what you have;
    it's about how many others have it as well, which defines
    how well it works. Simple.
     
  6. -Jericho-

    -Jericho- Jr. Executive VIP

    Joined:
    Jan 10, 2010
    Messages:
    2,849
    Likes Received:
    1,704
    Location:
    Stalking My Ex-Wife
    Don't use public proxies, as mentioned above. Also, slow down your scraping. I'm noticing I get blocked quite a bit more now due to too many requests.
     
  7. proxygo

    proxygo Jr. VIP Jr. VIP Premium Member

    Joined:
    Nov 2, 2008
    Messages:
    10,598
    Likes Received:
    8,818
    Then explain, Jericho, why I can support 440 users of
    BHW (mostly donors/VIPs/execs, 4 mods, 1 admin) and never
    get a complaint. The simple answer is that my sources are
    better than what you're getting. I get approx. 2-3k Google
    proxies a day, and do you want to know why?
    Because there are only so many public proxies you can scrape,
    but there are thousands of public proxies you can PORT SCAN
    that scraping won't find. That's why I can support 450+
    users: I guarantee I'm getting blocks of public proxies
    via port scanning that you can't scrape. FACT.
    The 2k+ Google-passed proxies posted for my subs have speeds of 1-3000 ms.
     
  8. sknight

    sknight Newbie

    Joined:
    Nov 26, 2010
    Messages:
    32
    Likes Received:
    11
    I'm working on this problem this morning. My code/bots now scrape at high speed: 10 proxies, 200 tries, then a 15-minute delay, and then do it again. Google seems to be banning after 20 tries on the same proxy.
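
The schedule described above (10 proxies, 200 tries, a 15-minute pause, and a block after roughly 20 tries per proxy) can be sketched as a small planner. This is only an illustration under stated assumptions: the function name `plan_requests`, the parameter names, and the 1-second gap between requests are hypothetical and not taken from the poster's code, and the block threshold is an observation from this thread, not a documented limit. The sketch makes no network calls; it only works out which proxy would be used and when.

```python
def plan_requests(proxies, total_requests, per_proxy_limit=20,
                  cooldown_s=900.0, request_gap_s=1.0):
    """Return a list of (time_in_seconds, proxy) assignments.

    Each proxy serves at most `per_proxy_limit` requests, then rests for
    `cooldown_s` seconds (900 s = the 15-minute delay from the post).
    `request_gap_s` is an assumed spacing between consecutive requests.
    """
    if not proxies:
        raise ValueError("at least one proxy is required")

    schedule = []
    used = {p: 0 for p in proxies}        # requests since the last rest
    ready_at = {p: 0.0 for p in proxies}  # time when each proxy may be used again
    now = 0.0

    while len(schedule) < total_requests:
        available = [p for p in proxies if ready_at[p] <= now]
        if not available:
            # Every proxy is resting: jump ahead to the earliest wake-up time.
            now = min(ready_at.values())
            continue

        # Pick the least-used available proxy, which gives round-robin rotation.
        proxy = min(available, key=lambda p: used[p])
        schedule.append((now, proxy))
        used[proxy] += 1

        if used[proxy] >= per_proxy_limit:
            # Budget spent: rest this proxy and reset its counter.
            used[proxy] = 0
            ready_at[proxy] = now + cooldown_s

        now += request_gap_s

    return schedule


if __name__ == "__main__":
    pool = [f"p{i}" for i in range(10)]
    plan = plan_requests(pool, 201)
    print(plan[199])  # last request of the first burst
    print(plan[200])  # first request after the cooldown
```

With 10 proxies and a limit of 20, the first 200 requests use every proxy exactly 20 times, and request 201 has to wait until the first proxy finishes its rest.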
     
  9. proxygo

    proxygo Jr. VIP Jr. VIP Premium Member

    Joined:
    Nov 2, 2008
    Messages:
    10,598
    Likes Received:
    8,818
    Google will ban repeat scrape searches on the same proxy
    after so many tries. This is why using private proxies won't
    work, as most people only have 10. Public proxies will work,
    provided the sources you use aren't pooled by thousands of
    other people, making them worthless.
     
  10. jkwilson78

    jkwilson78 Regular Member Premium Member

    Joined:
    Jun 24, 2010
    Messages:
    224
    Likes Received:
    311
    I've experienced the same thing with my private proxies. It seems as of late Google has gotten a bit more aggressive blocking some of the more advanced queries.

    As others have said, if you make too many requests in too short an amount of time Google will block you. You'll either need to slow down the number of requests, add more private proxies or find quality sources of public proxies.

    In my case, I just beefed up the number of private proxies but I use a lot of them (500) which I doubt is a normal number for most people.
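
The trade-off above (fewer requests or more proxies) comes down to simple arithmetic, sketched here. The helper names `proxies_needed` and `pool_capacity` are hypothetical, and the per-proxy limit of 20 is the figure another poster reported in this thread, not a known Google threshold; a later post reports about 260, so treat the limit as a parameter to measure yourself.

```python
import math


def proxies_needed(queries_per_window, per_proxy_limit):
    """Smallest pool size that keeps every proxy at or under its limit."""
    if per_proxy_limit <= 0:
        raise ValueError("per_proxy_limit must be positive")
    return math.ceil(queries_per_window / per_proxy_limit)


def pool_capacity(num_proxies, per_proxy_limit):
    """Total queries the pool can send before every proxy hits its limit."""
    return num_proxies * per_proxy_limit


if __name__ == "__main__":
    # Assumed limit of 20 queries per proxy per window (reported in this thread).
    print(pool_capacity(10, 20))      # a typical 10-proxy private package
    print(pool_capacity(500, 20))     # a 500-proxy pool
    print(proxies_needed(10000, 20))  # pool size for 10,000 queries per window
```

Under that assumed limit, a 10-proxy package covers 200 queries per window, while 500 proxies cover 10,000.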
     
  11. Sauronws6

    Sauronws6 Newbie

    Joined:
    Nov 9, 2011
    Messages:
    3
    Likes Received:
    0
    I am also having issues with scraping Google. I'm running 500 pretty good proxies, and have connections set to 25. Google has 300 results, while Bing and Yahoo are up to 250k results each, also with 25 connections. It's still going, though... hasn't stopped harvesting yet.
     
  12. HFlame7

    HFlame7 Regular Member

    Joined:
    Jun 20, 2011
    Messages:
    277
    Likes Received:
    156
    The big G is getting better at detecting people/proxies that are scraping, and will usually temporarily ban the IP address(es).

    I use private proxies to scrape, and once those are temp banned, I use my own IP. Eventually all of those IPs are banned for a few hours, but I couldn't care less because I usually get what I need the first time around.
     
  13. ankesen

    ankesen Newbie

    Joined:
    Dec 6, 2011
    Messages:
    15
    Likes Received:
    4
    How many queries are you able to perform from one IP/proxy before you get banned? In my tests I always got banned after about 260 queries.