
[Google Search API] Extending Google search results

Discussion in 'C, C++, C#' started by haylander, Jul 6, 2010.

  1. haylander

    haylander Registered Member

    Joined:
    May 24, 2009
    Messages:
    54
    Likes Received:
    20
    Hello,
    I'm trying to implement the Google Search API in my bot (C#) so I can harvest links.
    The problem is that when doing a search through the Google API, it returns only the first 64 results.
    Is there a trick to extend the results?
    If not, is there any alternative framework on the net?
     
  2. ShiftySituation

    ShiftySituation Power Member

    Joined:
    Apr 15, 2010
    Messages:
    621
    Likes Received:
    315
    Occupation:
    Having fun
    Location:
    Jacksonville, FL
    I wasn't aware that they had an API for Google Search. You can write your own using HttpWebRequest / HttpWebResponse, scraping the links, then visiting the next page, scraping again, rinse and repeat for each search term. I made one a few weeks back for my forum profile generator, and Google doesn't like it when you search for {inurl:register.php "Powered by vBulletin"}, so it made the challenge a little more fun.

    Here is a little piece to check the page and tell if there are more pages to search. If so, it gets the link of the next page...
    Code:
    if (googlePage.IndexOf(">Next</span></a>") > -1)
    {
        // Skip past the first "<td class=b" cell to reach the one holding the Next link.
        googleQ = googlePage.Substring(googlePage.IndexOf("<td class=b") + 5);
        googleQ = googleQ.Substring(googleQ.IndexOf("<td class=b"));
        // Cut out the href, which starts with "/search?" and ends at the closing quote.
        googleQ = googleQ.Substring(googleQ.IndexOf("/search?"));
        googleQ = googleQ.Substring(0, googleQ.IndexOf("\""));
        // Decode the HTML entities and re-escape quotes/colons for the URL.
        googleQ = "http://" + searchURL + googleQ.Replace("&amp;", "&").Replace("&quot;", "%22").Replace(":", "%3A");
    }
    
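    To get you started on the fetching side, here is a minimal sketch of pulling a results page with HttpWebRequest / HttpWebResponse. The query and user agent are just illustrative placeholders, and Google's markup changes over time, so treat the details as assumptions:
    Code:
    using System;
    using System.IO;
    using System.Net;

    class GoogleFetch
    {
        static void Main()
        {
            // Hypothetical example query; escape it for use in a URL.
            string query = Uri.EscapeDataString("inurl:register.php \"Powered by vBulletin\"");
            string url = "http://www.google.com/search?q=" + query;

            var request = (HttpWebRequest)WebRequest.Create(url);
            // Send a browser-like user agent; Google serves different markup otherwise.
            request.UserAgent = "Mozilla/5.0 (Windows NT 6.1; rv:10.0) Gecko/20100101 Firefox/10.0";

            using (var response = (HttpWebResponse)request.GetResponse())
            using (var reader = new StreamReader(response.GetResponseStream()))
            {
                string googlePage = reader.ReadToEnd();
                // Hand googlePage to your link scraper and the Next-page check above.
                Console.WriteLine(googlePage.Length);
            }
        }
    }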

    The rest is going to cost you...
     
  3. zencontent

    zencontent Newbie

    Joined:
    Feb 13, 2013
    Messages:
    42
    Likes Received:
    5
    Gender:
    Male
    There is an API - it's certainly the way Google seems to be happier with. If you go to Google Code, there are plenty of libraries (PHP, JS, etc.) for working with search. You will need to sign up for an account; if you're doing low levels of search (less than 100 queries a day), it's free, from memory.
     
  4. theMagicNumber

    theMagicNumber Regular Member

    Joined:
    May 13, 2010
    Messages:
    347
    Likes Received:
    195
    DON'T use it. I tested it a year ago; maybe it is better now. It DOES return different results than the search engine. You get 100 free queries per day, and if you need more you have to pay.
    As for pagination, you have to use the start parameter. Check the documentation:
    https://developers.google.com/custom-search/v1/using_rest
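    For example, here is a minimal sketch of paging through the 100 available results with the start parameter. It assumes you already have an API key and a custom search engine ID; YOUR_API_KEY and YOUR_CX are placeholders:
    Code:
    using System;
    using System.Net;

    class CustomSearchPager
    {
        static void Main()
        {
            const string apiKey = "YOUR_API_KEY"; // placeholder for your API key
            const string cx = "YOUR_CX";          // placeholder for your search engine ID
            string query = Uri.EscapeDataString("example query");

            using (var client = new WebClient())
            {
                // The API caps results at 100, 10 per page, so start = 1, 11, ... 91.
                for (int start = 1; start <= 91; start += 10)
                {
                    string url = string.Format(
                        "https://www.googleapis.com/customsearch/v1?key={0}&cx={1}&q={2}&start={3}",
                        apiKey, cx, query, start);
                    string json = client.DownloadString(url);
                    Console.WriteLine("Page starting at {0}: {1} bytes", start, json.Length);
                    // Parse the JSON (e.g. with Json.NET) and pull the "items" array here.
                }
            }
        }
    }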
     
  5. zencontent

    zencontent Newbie

    Joined:
    Feb 13, 2013
    Messages:
    42
    Likes Received:
    5
    Gender:
    Male
    theMagicNumber, I bow to your superior knowledge - I have only used it recently for research purposes and found that the results were the same as search PROVIDED you dealt with the arguments intelligently. Although it does cost $5 per 1,000 searches per day, I found this a lot easier to handle than endless captchas. I think it may come down to what you want to do with it. If you want to query Google search on autopilot, then a system that throws captchas is not really going to work; maybe there is another way round the issue? I would love to hear your thoughts.
     
  6. theMagicNumber

    theMagicNumber Regular Member

    Joined:
    May 13, 2010
    Messages:
    347
    Likes Received:
    195
    The API allows you to get the first 100 results, 10 results per page, so you have to make 10 requests per keyword to get all of its results. That works out to only 10 keywords per day on the free quota, which is not enough, unless you want to pay - but $5 per 1000 requests is A LOT.

    I am scraping Google regularly, doing 2M - 4M requests per day. Instead of proxies I am using a DSL connection: Google blocks the IP after ~1000-1500 queries -> disconnect/connect (programmatically) -> new IP -> rinse and repeat. If I need to scrape results with an IP from a different country, for example the UK, I use translate.google.co.uk as a proxy - the original text is still in the HTML.
    The HTML is indeed harder to parse, but the DSL connection can save you a lot of money. A sketch of the reconnect step is below.
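    On Windows, a PPPoE/dial-up DSL connection can be bounced with the built-in rasdial tool. A minimal sketch - the entry name "DSL" and the credentials are placeholders for whatever your RAS entry is called:
    Code:
    using System.Diagnostics;

    class Reconnect
    {
        // Drop and re-establish the DSL connection to get a fresh IP.
        // "DSL", "user" and "pass" are placeholders for your own RAS entry.
        static void Bounce()
        {
            RunRasdial("\"DSL\" /disconnect");
            RunRasdial("\"DSL\" user pass");
        }

        static void RunRasdial(string args)
        {
            var psi = new ProcessStartInfo("rasdial", args)
            {
                UseShellExecute = false,
                CreateNoWindow = true
            };
            using (var p = Process.Start(psi))
            {
                p.WaitForExit(); // block until the connection state has changed
            }
        }
    }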
     
    • Thanks Thanks x 1
    Last edited: Mar 6, 2013
  7. funkseo

    funkseo Newbie

    Joined:
    Sep 27, 2012
    Messages:
    20
    Likes Received:
    1
    I agree with theMagicNumber: the Google API results are not the same as browser search results, and it limits you to the first 64 or 100 results (I don't remember which).
    And by the way, if you use SSL proxies there won't be a problem. I'm currently using this approach, and after every 1,000 searches I change the proxy. So my bot can do 1 million searches a day, and 10,000,000 PR checks a day. A sketch of the rotation is below.
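    A minimal sketch of that rotation, swapping in the next proxy after every 1,000 requests. The proxy addresses are made-up placeholders; load your own list:
    Code:
    using System.IO;
    using System.Net;

    class ProxyRotator
    {
        // Placeholder proxies; replace with your own list.
        static readonly string[] Proxies = { "1.2.3.4:8080", "5.6.7.8:8080" };
        static int requestCount = 0;
        static int proxyIndex = 0;

        static string Fetch(string url)
        {
            // Move to the next proxy after every 1,000 requests.
            if (requestCount > 0 && requestCount % 1000 == 0)
                proxyIndex = (proxyIndex + 1) % Proxies.Length;
            requestCount++;

            var request = (HttpWebRequest)WebRequest.Create(url);
            request.Proxy = new WebProxy("http://" + Proxies[proxyIndex]);
            using (var response = (HttpWebResponse)request.GetResponse())
            using (var reader = new StreamReader(response.GetResponseStream()))
            {
                return reader.ReadToEnd();
            }
        }
    }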
     
  8. funkseo

    funkseo Newbie

    Joined:
    Sep 27, 2012
    Messages:
    20
    Likes Received:
    1
    And another thing: if you use the Google API and get the first 25 rows, and let's say you find a website at line 13, then you search again and get the first 50 rows, you will see that same website on a different line. The Google Search API is not a trustworthy source for positions. I had used it on my site's Google Position Finder tool.