Scraping of Google

Discussion in 'Black Hat SEO Tools' started by YSL, Nov 6, 2009.

  1. YSL

    YSL Regular Member

    Has anyone had issues harvesting URLs from Google recently?
     
  2. madblacker

    madblacker Regular Member

    How could you have an issue with this? They can't do anything to prevent it once the page has loaded. You should never use a bot to click on search results, if that's what you're doing, as they catch on pretty fast. For plain harvesting, though, it should be OK as long as you set appropriate delays between pressing "next". You do need to let it wait a while between pages. I'm not sure of the exact amount of time, but think of how fast a person would go through a page of results; the delay has to be reasonable.
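    As a rough illustration of the pacing idea, here is a minimal Python sketch that pauses for a random interval between items. The function name `paced` and the 5 to 15 second bounds are placeholders invented for this example; the post itself gives no figure. Nothing here talks to the network; it only spaces out whatever work the caller does per item.

```python
import random
import time


def paced(items, min_delay=5.0, max_delay=15.0, sleep=time.sleep):
    """Yield items one at a time, pausing a random interval between them.

    min_delay and max_delay are assumed placeholder values in seconds.
    The sleep function is injectable so the pacing can be tested
    without actually waiting.
    """
    for index, item in enumerate(items):
        if index > 0:
            # No pause before the first item, only between items.
            sleep(random.uniform(min_delay, max_delay))
        yield item
```

    A caller would loop with `for page in paced(range(1, 6)): ...` and do its per-page work inside the loop body.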
     
  3. oldenstylehats

    oldenstylehats Elite Member Premium Member

    Unless you're doing massive scraping runs, you ought to go through their search API. Much more convenient than conventional scraping, IMHO.
     
  4. justone

    justone Elite Member

    I've scraped a few million hits from Google.
    I am writing an article about it at the moment, and free source code (PHP) will be released soon.
     
  5. SHOwnsYou

    SHOwnsYou Registered Member

    I would suggest using the search API as well.

    When I first started scraping Google links, I remember having to make a slight workaround. The pattern of the tags that set the actual page URLs apart from everything else changed each time I opened the page with my program.

    My first program matched the related searches, which returned URLs like
    Code:
    google.com/search?lang=en&type=z&recommendedurls=ACTUALRESULTURL
    
    So I just used preg_replace to swap the beginning part for http:// and got the final URL.
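    For anyone who wants to see the prefix swap spelled out, here is a minimal sketch in Python rather than PHP, with `re.subn` standing in for `preg_replace`. The `recommendedurls` parameter comes from the example URL above; the function name, the exact regex, and the percent-decoding step are assumptions made for illustration, and real result markup may look different.

```python
import re
from urllib.parse import unquote

# Assumed prefix: everything up to and including "recommendedurls=",
# mirroring the example URL in the post. An optional scheme and "www."
# are tolerated in front of "google.com".
PREFIX = re.compile(
    r"^(?:https?://)?(?:www\.)?google\.com/search\?.*?recommendedurls="
)


def extract_result_url(wrapped):
    """Strip the wrapper prefix and return the target URL with a scheme."""
    stripped, count = PREFIX.subn("", wrapped, count=1)
    if count == 0:
        raise ValueError("input does not look like a wrapped result URL")
    # Assumption: the target may be percent-encoded inside the query string.
    target = unquote(stripped)
    # Only prepend http:// when the target has no scheme of its own.
    if not re.match(r"^https?://", target):
        target = "http://" + target
    return target
```

    This sketch does not handle extra query parameters that might follow the target URL; those would need to be split off first.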
     
  6. yack09

    yack09 Newbie

    That would be cool.