
Scrape Google Images Script

Discussion in 'PHP & Perl' started by makemecash, Oct 5, 2012.

  1. makemecash

    makemecash Regular Member

    Here's a nice little script that scrapes Google Images results:

    PHP:


    <?php
    header("Content-type: text/html; charset=utf-8");
    ?>
    <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "DTD/xhtml1-strict.dtd">
    <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
    <head>
        <title>Get Images</title>
    </head>
    <body>
    <?php
    $results = getGoogleImages('horses');
    foreach ($results as $result) {
        echo '<p><a href="' . htmlentities($result['url']) . '">' .
             '<img src="' . htmlentities($result['thumbnail']) . '" alt="" ' .
             'oncontextmenu="' . "this.src='" . htmlentities($result['image']) . "';" . 'return false;" ' .
             'style="border: 0px solid black" /></a><br />' .
             '<em>' . htmlentities($result['description']) . '</em>' .
             '</p>';
    }
    ?>
    </body>
    </html>
    <?php
    function getGoogleImages($q, $doSafeSearch = false)
    {
        $results = array();

        $safe = ($doSafeSearch) ? 'on' : 'off';
        $url = 'http://images.google.com/images?safe=' . $safe .
               '&q=' . urlencode($q);

        // file_get_contents is disabled - must use curl.
        // CURLOPT_RETURNTRANSFER returns the page as a string, so the
        // temporary page1.txt file is no longer needed.
        $ch = curl_init($url);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
        curl_setopt($ch, CURLOPT_HEADER, 0);
        $result = curl_exec($ch);
        curl_close($ch);
        if ($result === false) {
            return $results; // request failed
        }

        // The parsing relies on the dyn.Img("...") calls in the results page.
        // If those markers are missing, return an empty array.
        $from = 'dyn.Img("';
        $startPos = strpos($result, $from);
        $endPos = strpos($result, ');dyn.updateStatus');
        if ($startPos === false || $endPos === false) {
            return $results;
        }
        $startPos += strlen($from);
        // substr() takes a length as its third argument, not an end offset.
        $functions = substr($result, $startPos, $endPos - $startPos);
        $functions = explode('");dyn.Img("', $functions);

        foreach ($functions as $f) {
            $parts = explode('","', $f);
            if (count($parts) < 7) {
                continue; // not a complete image record
            }
            // Field order: url, v1, hash, image, width, height, description, ...
            list($url, $v1, $hash, $image, $width, $height, $description) = $parts;
            // Drop the extra parameters that follow '&h' in the URL.
            list($url) = explode('&h', $url);

            $prefix = 'http://tbn0.google.com/images?q=tbn:';
            $results[] = array(
                'url'         => $url,
                'image'       => $image,
                'width'       => $width,
                'height'      => $height,
                'thumbnail'   => $prefix . $hash . ':' . $image,
                'description' => strip_tags($description),
            );
        }

        return $results;
    }
    ?>

    Example: http://blogoscoped.com/temp/google-image-scraper-2007.html
     
  2. SonicSam

    SonicSam Registered Member

    Why not use Google's Custom Search API? It supports images.
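
    For anyone who wants to try that route, here is a rough sketch of an image search through the Custom Search JSON API, returning the same array keys as the scraper above. It assumes you have an API key and a search engine ID (the `cx` parameter); `YOUR_API_KEY` and `YOUR_ENGINE_ID` are placeholders, and the function names are made up for this example. Treat it as a starting point and check the field names against the API reference.

```php
<?php
// Sketch only: helper names are hypothetical, not part of any library.

/** Build the request URL for an image search. */
function buildImageSearchUrl($apiKey, $engineId, $query, $safeSearch = false)
{
    $params = array(
        'key'        => $apiKey,
        'cx'         => $engineId,
        'q'          => $query,
        'searchType' => 'image',
        'safe'       => $safeSearch ? 'active' : 'off',
    );
    return 'https://www.googleapis.com/customsearch/v1?' .
           http_build_query($params, '', '&');
}

/** Map the decoded JSON response onto the keys the scraper returns. */
function extractImageResults(array $response)
{
    $results = array();
    foreach ($response['items'] ?? array() as $item) {
        $image = $item['image'] ?? array();
        $results[] = array(
            'url'         => $image['contextLink'] ?? '',   // page the image appears on
            'image'       => $item['link'] ?? '',           // full-size image
            'thumbnail'   => $image['thumbnailLink'] ?? '',
            'width'       => $image['width'] ?? null,
            'height'      => $image['height'] ?? null,
            'description' => $item['title'] ?? '',
        );
    }
    return $results;
}

/** Fetch and parse one page of results; returns an empty array on failure. */
function searchGoogleImages($apiKey, $engineId, $query, $safeSearch = false)
{
    $ch = curl_init(buildImageSearchUrl($apiKey, $engineId, $query, $safeSearch));
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    $json = curl_exec($ch);
    curl_close($ch);
    if ($json === false) {
        return array();
    }
    $data = json_decode($json, true);
    return is_array($data) ? extractImageResults($data) : array();
}
```

    Usage would be something like `$results = searchGoogleImages('YOUR_API_KEY', 'YOUR_ENGINE_ID', 'horses');`, after which the display loop from the first post should work unchanged.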