Are you having problems scraping Google?

playboy

Maybe it is just me, but do you have problems scraping Google with proxies?

I am using proxies, but they don't seem to be working very well anymore (and they are private proxies). After a certain amount of usage (usually a few minutes) they stop working. I used to be able to scrape a lot more on Google and get much better results.

Anyone having the same problems? Or is it just me?
 
Google may have recently changed its search engine to recognize certain search queries as 'robotic'. Google will then block specific, robotic-looking requests from your proxy, on the rationale that the search query is too advanced for a human to come up with.
 
My search queries are pretty advanced (compared to a "normal" googler) and I'm having no issues scraping today, without proxies, so I'd guess your problem comes from either bad proxies or scraping too quickly.

RN
 
1 - If you scrape with your own IP, you're likely to
get banned from Google.

2 - The reason your private proxies are getting blocked
from scraping is an easy one: after so many requests from
the same proxy, Google blocks it. It's that simple.

Anyone who is having bad results with public proxies
either has bad sources or a bad supplier using bad sources.
I've supplied public proxies for Google scraping for 2 years here
on BHW, with 450 signups and 0 refunds.

Using public proxies to scrape isn't about what you have.
It's about how many others have it as well, which defines
how well it works. Simple.
 
Don't use public proxies like mentioned above. Also slow down your scraping. I'm noticing I get blocked quite a bit more now due to too many requests.
 

Then explain, Jericho, why I can support 440 users of
BHW (mostly donors, VIPs, execs, 4 mods, 1 admin) and never
get a single complaint. The simple answer is that my sources are
better than what you're getting. I get approx. 2-3k Google
proxies a day, and I know you don't want to know why.
It's because there are only so many public proxies you can scrape,
but there are thousands of public proxies you can PORT SCAN
that scraping won't find. That's why I can support 450+
users: I guarantee I'm getting blocks of public proxies
via port scanning that you can't scrape. FACT.
The 2k+ Google-passed proxies I post for my subs have speeds of 1-3000 ms.
 
I'm working on this problem this morning. My code/bots now scrape at high speed: 10 proxies, 200 tries, then a 15-minute delay, and then the cycle repeats. Google seems to be banning after 20 tries on the same proxy.
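
For anyone who wants to see that cycle written out, here is a rough sketch of one round. It assumes the 200 tries are spread evenly over the 10 proxies (20 each, which lines up with the ban threshold mentioned above). The function name `plan_round` and the proxy labels are made up for illustration, and the code only plans the order of actions; it does not send any requests.

```python
from typing import Iterator, List, Tuple, Union

# An action is either ("request", proxy_name) or ("sleep", seconds).
Action = Tuple[str, Union[str, int]]


def plan_round(
    proxies: List[str],
    tries_per_proxy: int = 20,        # assumed: 200 tries / 10 proxies
    cooldown_seconds: int = 15 * 60,  # the 15-minute delay from the post
) -> Iterator[Action]:
    """Yield one round: round-robin requests over all proxies, then a cooldown."""
    for _ in range(tries_per_proxy):
        # Rotate through every proxy before reusing any of them,
        # so no single proxy sees back-to-back requests.
        for proxy in proxies:
            yield ("request", proxy)
    # After every proxy has used its budget, pause before the next round.
    yield ("sleep", cooldown_seconds)


if __name__ == "__main__":
    proxies = [f"proxy{i}" for i in range(10)]  # hypothetical labels
    actions = list(plan_round(proxies))
    requests = [a for a in actions if a[0] == "request"]
    print(len(requests), "requests, then", actions[-1])
```

Interleaving the proxies rather than burning through one at a time is a design choice here, not something the post specifies.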
 
Google will ban repeated scrape searches on the same proxy
after so many tries. This is why using private proxies won't
work, as most people only have 10. Public proxies will work,
provided the sources you use aren't pooled by thousands of
other people, which makes them worthless.
 
I've experienced the same thing with my private proxies. It seems as of late Google has gotten a bit more aggressive blocking some of the more advanced queries.

As others have said, if you make too many requests in too short an amount of time Google will block you. You'll either need to slow down the number of requests, add more private proxies or find quality sources of public proxies.

In my case, I just beefed up the number of private proxies but I use a lot of them (500) which I doubt is a normal number for most people.
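
The "slow down the number of requests" option can be sketched as a per-proxy throttle that enforces a minimum gap between uses of the same proxy. This is only an illustration: the class name `ProxyThrottle` is made up, and the 5-second interval is a placeholder, since nobody in this thread has given a safe interval.

```python
import time
from typing import Callable, Dict


class ProxyThrottle:
    """Enforce a minimum gap between requests sent through the same proxy."""

    def __init__(
        self,
        min_interval_s: float,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self.min_interval_s = min_interval_s
        self._clock = clock
        self._sleep = sleep
        self._last_used: Dict[str, float] = {}

    def wait_time(self, proxy: str) -> float:
        """Seconds still to wait before this proxy may be used again."""
        last = self._last_used.get(proxy)
        if last is None:
            return 0.0  # never used, no wait needed
        return max(0.0, self.min_interval_s - (self._clock() - last))

    def acquire(self, proxy: str) -> float:
        """Block until the proxy is usable, mark it used, return the delay."""
        delay = self.wait_time(proxy)
        if delay > 0:
            self._sleep(delay)
        self._last_used[proxy] = self._clock()
        return delay


if __name__ == "__main__":
    throttle = ProxyThrottle(min_interval_s=5.0)  # placeholder interval
    throttle.acquire("proxy0")   # first use: returns immediately
    throttle.acquire("proxy1")   # a different proxy is not delayed
    throttle.acquire("proxy0")   # waits until 5 s have passed since first use
```

Because each proxy is tracked separately, adding more proxies raises total throughput without shortening the gap any single proxy sees, which is the same trade-off described above.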
 
I am also having issues with scraping Google. I'm running 500 pretty good proxies, and have connections set to 25. Google has 300 results, while Bing and Yahoo are up to 250k results each, also with 25 connections. It's still going, though... hasn't stopped harvesting yet.
 
The big G is getting better at detecting people/proxies that are scraping, and will usually temporarily ban the IP address(es).

I use private proxies to scrape, and once those are temp banned, I use my own IP. Eventually all of those IPs are banned for a few hours, but I couldn't care less because I usually get what I need the first time around.
 
How many queries are you able to perform from one IP/proxy before you get banned? In my tests I always got banned after about 260 queries.
 