Accessing Google search programmatically

Discussion in 'General Programming Chat' started by mikkom, Dec 30, 2015.

  1. mikkom

    mikkom, Newbie (joined Nov 23, 2012; 22 messages; 7 likes received)

    Hello,

    Is anyone doing this? I know how easy it is, but has anyone been doing it for a longer time? What kind of restrictions (time, IP, requests per second, permanent IP ban) have you found?

    The technical side of accessing Google programmatically is pretty (extremely) easy, and I know pretty well how to mimic browsers. The question is how long you can keep doing this. I'm not talking about thousands of requests per minute, but perhaps thousands per day (?). Is this something you can do without Google blocking you? Or do I have to use proxies?
     
  2. wblteam

    wblteam, Regular Member (joined Jul 21, 2014; 259 messages; 47 likes received)

    I tried once and found it is possible only if you search through a proxy. Changing the IP programmatically is the way to go, I guess.
     
  3. mikkom

    Uhm, it should definitely be possible from any IP or local computer. What I'm most interested in is what kind of restrictions there are, i.e. how many queries are allowed per day/hour etc., and whether exceeding them leads to a permaban or just a daily ban. I'm quite sure 1/2 queries per minute and ~10-30 per hour are OK, but I'm interested in where the line is drawn, when Google notices you, and, if they do, how long the ban lasts. I don't have a fixed IP on my personal computer, by the way, so a permaban would mean banning an IP that belongs to my local internet operator, and I'm quite sure Google doesn't want to do that.

    (Technically it's extremely easy: just use any HTTP library and fake some headers. I'm not sure even that is needed. I have crawled Google programmatically before.)
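
    To make the question concrete, here is a minimal sketch of client-side pacing for the kind of rates guessed at above. It enforces a minimum gap between queries plus an hourly cap, and makes no network calls. The class name QueryBudget and every default value are hypothetical placeholders, not known Google limits; nothing in this thread establishes where the real thresholds are.

```python
import random
from collections import deque


class QueryBudget:
    """Client-side pacing: a minimum gap between queries plus an hourly cap.

    All defaults are illustrative assumptions, not documented limits.
    """

    def __init__(self, min_gap_s=120.0, max_per_hour=30, jitter_s=0.0):
        if max_per_hour < 1:
            raise ValueError("max_per_hour must be at least 1")
        self.min_gap_s = min_gap_s
        self.max_per_hour = max_per_hour
        self.jitter_s = jitter_s
        self._sent = deque()  # timestamps of queries sent in the last hour

    def wait_time(self, now):
        """Return how many seconds to wait before the next query is allowed."""
        # Forget queries that have left the one-hour window.
        while self._sent and now - self._sent[0] >= 3600.0:
            self._sent.popleft()
        wait = 0.0
        if self._sent:
            # Enforce the minimum gap since the most recent query.
            wait = max(wait, self._sent[-1] + self.min_gap_s - now)
        if len(self._sent) >= self.max_per_hour:
            # Hourly cap reached: wait until the oldest query ages out.
            wait = max(wait, self._sent[0] + 3600.0 - now)
        if wait > 0 and self.jitter_s > 0:
            # Optional random extra delay so the spacing is not perfectly regular.
            wait += random.uniform(0.0, self.jitter_s)
        return wait

    def record(self, now):
        """Note that a query was sent at time `now` (in seconds)."""
        self._sent.append(now)
```

    In a real loop you would call time.sleep(budget.wait_time(time.monotonic())) before each request and budget.record(time.monotonic()) right after it. Pacing like this only keeps you under a budget you picked yourself; it says nothing about when a block or captcha actually kicks in.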
     
  4. Bane Bentley

    Bane Bentley, Junior Member (joined Jun 13, 2013; 163 messages; 30 likes received; location: BuildMyTraffic.Net)

    Scrapebox does this for its Google SERP scraping.

    Google has a lot of limits, and no matter how many random delays you add, your IP will eventually get banned or shown a captcha once you pass 25,000-30,000 results.
     
  5. mikkom

    I'm sure that is the case (I know I can't outsmart Google). I'm just wondering whether the ban is permanent or temporary and, if temporary, how long it lasts. Someone here should have some idea of the length, although the correct way is probably to use some cheap proxies.