
Anyone know how to save Google AutoSuggest to a text file?

Discussion in 'BlackHat Lounge' started by tony-raymondo, Jan 27, 2011.

  1. tony-raymondo

    tony-raymondo Junior Member

    Joined:
    Jun 19, 2009
    Messages:
    181
    Likes Received:
    459
    i'd like to export these words to a text file.

    or somehow write a browser component that will grab the incoming data - presumably some sort of AJAX response...?

    any ideas?


    [IMG]
     
  2. cyberzilla

    cyberzilla Elite Member Premium Member

    Joined:
    Nov 15, 2009
    Messages:
    2,204
    Likes Received:
    3,364
    Location:
    zeta reticuli
  3. tony-raymondo

    tony-raymondo Junior Member

    Joined:
    Jun 19, 2009
    Messages:
    181
    Likes Received:
    459
    thanks

    that's the idea

    but why the heck would anyone use that scraper?

    i tried it...

    like if i sell dogs, then it just scrapes

    d
    do
    dog

    and why would "d" and "do" return anything useful...???

    hmm confusing...

    i think most of the people in that thread should actually just be using Jon's tool:

    http://keywordsnatcher.com/

    But anyway, ya that's about the type of data i need. I need to know the mechanism (computer code) or a way to go about scraping that returned data - in the same fashion as the above tools do it.

    i'm a programmer but i don't know where to start...?

    Thanks!
     
  4. tony-raymondo

    tony-raymondo Junior Member

    Joined:
    Jun 19, 2009
    Messages:
    181
    Likes Received:
    459
    bump - any programmers out there wanna lend a brotha a heads up
     
  5. dummydecoy

    dummydecoy Junior Member

    Joined:
    Jul 4, 2010
    Messages:
    154
    Likes Received:
    39
    turn on your packet capture software (e.g. Wireshark)
    analyze the packets, then build your app from there
     
  6. tony-raymondo

    tony-raymondo Junior Member

    Joined:
    Jun 19, 2009
    Messages:
    181
    Likes Received:
    459
    thanks man - i installed Wireshark and took a screenshot of me pressing the letter b

    so i guess in one of these lines - there is some way i can create a filter, and hopefully export everything as text???

    [IMG]
     
  7. Monrox

    Monrox Power Member

    Joined:
    Apr 9, 2010
    Messages:
    615
    Likes Received:
    579
    Now identify a TCP packet, right-click on it -> Follow TCP Stream.
     
  8. tony-raymondo

    tony-raymondo Junior Member

    Joined:
    Jun 19, 2009
    Messages:
    181
    Likes Received:
    459
    alright

    tried that on all 4 rows

    but there doesn't seem to be any useful data - e.g. i click on ASCII and i don't see the words....

    :confused:
     
  9. Monrox

    Monrox Power Member

    Joined:
    Apr 9, 2010
    Messages:
    615
    Likes Received:
    579
    Well, Wireshark is showing what your browser is sending and receiving. It takes some trial and error, but eventually you would end up with something like this:
    hQQp://clients1.google.com/complete/search?hl=en&gl=us&q=hello

    The hl parameter is the desired language, the gl parameter is the desired location (country). The q parameter is the keyword.
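
    For anyone scripting this, here is a minimal Python sketch of assembling that URL from the three parameters. The endpoint is the one quoted above with hQQp written as http; the helper name build_suggest_url and its defaults are made up for illustration, and there is no guarantee the endpoint still answers the same way.

```python
from urllib.parse import urlencode

# Endpoint copied from the post above (hQQp written as http).
# Assumption: it still accepts the hl, gl and q parameters described there.
SUGGEST_ENDPOINT = "http://clients1.google.com/complete/search"


def build_suggest_url(keyword, language="en", country="us"):
    """Return the suggest URL for one keyword (hypothetical helper)."""
    params = [
        ("hl", language),  # desired language
        ("gl", country),   # desired location (country)
        ("q", keyword),    # the keyword typed so far
    ]
    # urlencode escapes the values; spaces become "+"
    return SUGGEST_ENDPOINT + "?" + urlencode(params)


print(build_suggest_url("hello"))
# http://clients1.google.com/complete/search?hl=en&gl=us&q=hello
```

    Going through urlencode rather than gluing strings together matters once the keyword contains spaces or punctuation.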

    Pasting that URL into your browser gets you a file which you can open with Notepad. It looks like this:
    Code:
    window.google.ac.h(["hello",[["hello kitty","","0"],["hello","","1"],["hello world lyrics","","2"],["hellogoodbye","","3"],["hello kitty games","","4"],["hello dolly","","5"],["hello cupcake","","6"],["hello world","","7"],["hello good morning","","8"],["helloween","","9"]],"","","","","",{}])
    Looks terrible but when we insert some newlines it becomes a bit better:
    Code:
    window.google.ac.h(["hello",[
    ["hello kitty","","0"],
    ["hello","","1"],
    ["hello world lyrics","","2"],
    ["hellogoodbye","","3"],
    ["hello kitty games","","4"],
    ["hello dolly","","5"],
    ["hello cupcake","","6"],
    ["hello world","","7"],
    ["hello good morning","","8"],
    ["helloween","","9"]
    ],"","","","","",{}])
    So now you have all 10 suggestions, each in its own [], each in quotes and each before the first comma of its bracket. The JavaScript on the search page extracts them, and you can do so too, either manually or by finding some freelancer to scribble an app for $5 or something.

    Things would be a lot easier for people if servers just sent the plain data, but they rarely do, because a structured format like this is easier for programs to parse.
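
    If you would rather scribble that app yourself, here is a rough Python sketch of the extraction step. It assumes the reply always has the shape shown above (a window.google.ac.h(...) wrapper around a JSON array); the function names are invented for this example, and the sample string is the reply quoted in this post, so nothing here touches the network.

```python
import json


def parse_suggestions(raw):
    """Pull the suggestion strings out of a window.google.ac.h(...) reply."""
    start = raw.index("(") + 1   # first "(" opens the callback arguments
    end = raw.rindex(")")        # last ")" closes them
    payload = json.loads(raw[start:end])
    # payload[0] is the query itself; payload[1] holds [suggestion, "", rank] entries
    return [entry[0] for entry in payload[1]]


def save_suggestions(suggestions, path):
    """Write one suggestion per line to a plain text file."""
    with open(path, "w", encoding="utf-8") as handle:
        for suggestion in suggestions:
            handle.write(suggestion + "\n")


# The reply quoted in the post, used as offline test data
SAMPLE = (
    'window.google.ac.h(["hello",[["hello kitty","","0"],["hello","","1"],'
    '["hello world lyrics","","2"],["hellogoodbye","","3"],'
    '["hello kitty games","","4"],["hello dolly","","5"],'
    '["hello cupcake","","6"],["hello world","","7"],'
    '["hello good morning","","8"],["helloween","","9"]],"","","","","",{}])'
)

suggestions = parse_suggestions(SAMPLE)
print(suggestions)
```

    Calling save_suggestions(suggestions, "suggestions.txt") afterwards would give the text file asked for at the top of the thread, one suggestion per line. If the reply format changes, json.loads will raise an error instead of silently returning junk.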
     
  10. tony-raymondo

    tony-raymondo Junior Member

    Joined:
    Jun 19, 2009
    Messages:
    181
    Likes Received:
    459
    :banana: :whee:

    Thanks man - that's pretty pimp.

    Unfortunately, here's the part where things get hard...

    There is an "interactive element" to the auto-suggest process.

    Try this:

    1. go to google
    2. type the phrase "is x hotter than " (be sure to type it exactly, with the spaces)
    3. now select the X with your mouse.
    4. now start typing in the alphabet ONE letter at a time.
    5. have a look at the results
    6. press backspace
    7. type in the next letter

    In other words you'll be querying the following:

    "is a hotter than "
    "is b hotter than "
    "is c hotter than "
    etc...
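
    For reference, that list of queries can be generated with a few lines of Python. This is only a sketch: the function name pivot_queries and the "x" placeholder convention are made up here, and it just builds the query strings. It does not reproduce whatever the browser sends when the letter is typed in the middle of the phrase.

```python
import string


def pivot_queries(template, placeholder="x"):
    """Yield the template with the standalone placeholder word replaced by a..z."""
    words = template.split(" ")      # splitting on " " keeps the trailing space
    slot = words.index(placeholder)  # position of the pivot word
    for letter in string.ascii_lowercase:
        candidate = words.copy()
        candidate[slot] = letter
        yield " ".join(candidate)


queries = list(pivot_queries("is x hotter than "))
print(queries[:3])
```

    Splitting on a single space rather than calling split() with no argument is deliberate, since the trailing space is part of the query.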

    Results:

    You'll find that google returns results based on (what i call) the "PIVOT LETTER" -- i.e. the letter in the query that is ACTUALLY changing prompts google to return results that hover around that pivot.

    For example, the query "is j hotter than" returns nothing if you use your above-cited traditional method.

    BUT, if you play with that "J" slot, you'll eventually get: "is jupiter hotter than earth".

    So the question would be, is there an additional parameter you can find -- that corresponds to the "PIVOT LETTER" or what we might call the "active letter" -- the one that is actually changing.

    Cuz these expressions can be quite valuable when looking for comparative keywords.

    Thanks!
     
    Last edited: Jan 28, 2011
  11. tony-raymondo

    tony-raymondo Junior Member

    Joined:
    Jun 19, 2009
    Messages:
    181
    Likes Received:
    459
    bumpage