[HELP] Programming a simple Google Scraper

Discussion in 'Black Hat SEO Tools' started by Andi49, Mar 29, 2010.

  1. Andi49

    Andi49 Newbie

    Apr 26, 2009
    Hi all,

    I am trying to learn a bit of programming, so I started writing a simple Google scraper that just downloads the first search result page for a given keyword.

    The problem I now have is that Google blocks me after 10 or so downloads.

    Even when I use proxies, I get blocked very quickly.

    Does anyone have an idea what I need to do so that Google does not block my scraper?

    I even fake the User-Agent, but I still get blocked :(

    Does anybody have an idea why that is?

    Thank you guys.
  2. nipester

    nipester Regular Member

    Feb 1, 2009
    ScrapeBox and Hrefer don't seem to get blocked. Figure out how they get around it.
  3. Grizzy

    Grizzy Senior Member

    Nov 11, 2008
    Simply put, your web bot is dumb. Its behavior is not typical of the user agent it is pretending to be; the big G recognizes this and treats it as suspicious, bot-like behavior.

    Start with something easier than Google if you are just learning to scrape. Learn about the DOM, JavaScript interpretation, and how your bot needs to interact with websites in order to simulate the actions of a real human being.
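
As a starting point for the "learn about the DOM" advice, here is a minimal sketch in Python that pulls the links out of a page's markup using only the standard library. The names LinkCollector, extract_links, and SAMPLE_PAGE are made up for illustration, and the sample page is a hypothetical stand-in, not real search output. Note that html.parser only reports tags as it reads them; it does not build a full DOM tree and it does not run JavaScript, so this covers just the first of the topics mentioned above.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects the href of every <a> element in the order it appears."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; tag names arrive lowercased.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def extract_links(html):
    """Return all href values found in <a> tags of the given HTML string."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return parser.links


# Hypothetical sample page used only for this illustration.
SAMPLE_PAGE = """
<html><body>
  <h1>Results</h1>
  <a href="https://example.com/one">First</a>
  <p>No link here</p>
  <a name="anchor-only">Skipped, has no href</a>
  <a href="https://example.com/two">Second</a>
</body></html>
"""

if __name__ == "__main__":
    for link in extract_links(SAMPLE_PAGE):
        print(link)
```

Practicing on a static string like this, or on a small site you control, lets you get the parsing right before worrying about how any particular server reacts to automated requests.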