
G**gle SERPs scraping - what upper limits (rate, daily reqs)?

Discussion in 'Black Hat SEO' started by UNDƎRCAT, Mar 26, 2011.

  1. UNDƎRCAT

    UNDƎRCAT Registered Member

    Joined:
    Mar 10, 2011
    Messages:
    59
    Likes Received:
    1
    Occupation:
    algorithms
    I don't need to do much scraping, but I really want to make sure I stay well under the limits at which G**gle considers it an offense.

    Yet at the same time I want my programs to do their work fast.

    So what is your experience: what is the upper limit per IP address (I will only be using one), in queries per day and queries per minute (or similar gauges)?

    I want to avoid any bans and captchas.


    (a funny bit of history: I wrote my first g**g scraper as a part of an IRC bot, some 11 years ago, to display search results right in the channel! good old days!)
     
    Last edited: Mar 26, 2011
  2. UNDƎRCAT

    no response yet, everybody is using scrapebox lol

    so maybe the scrapebox author could comment on this? :)

    i've done some searching around and found that g**gle uses various heuristic methods to prevent robots from scraping it, so the limit doesn't have to be purely rate-based.

    the world's biggest scraping engine is preventing others from scraping it, how ironic!
     
    Last edited: Mar 26, 2011
  3. DatingChap

    DatingChap Registered Member

    Joined:
    Jan 25, 2011
    Messages:
    80
    Likes Received:
    24
    Occupation:
    dating websites
    Location:
    UK
    Don't know how to get around the captchas. Yesterday I was manually looking for some sites, just on Google search, using
    Code:
    inurl:".edu" "blah blah blah"
    and as soon as I pressed Enter I got a captcha, then another on page 3 and on page 5, and "unusual activity" messages aplenty. Nothing automated, just me looking for a website.
     
  4. UNDƎRCAT

    wow.. valuable info. Maybe your IP was already on their suspicious activity list..
    But maybe it was the actual search operators (inurl etc.).

    maybe they want to prevent mass scraping for SEO purposes by tools like Scrapebox etc..
     
  5. UNDƎRCAT

    anyone?
     
  6. google_scraper

    google_scraper Newbie

    Joined:
    Apr 20, 2011
    Messages:
    2
    Likes Received:
    0
    Would also be really interested...
     
  7. cyberzilla

    cyberzilla Elite Member Premium Member

    Joined:
    Nov 15, 2009
    Messages:
    2,204
    Likes Received:
    3,363
    Location:
    zeta reticuli
    No one knows the exact limit! Avoid using advanced G0ogle search operators in your footprints as much as you can. Also, don't always use G00gle.c0m for scraping; switch between other country servers constantly.