[WTH] I Need a VERY Competent Programmer for a Google Adwords Keyword Tool Scraper

Discussion in 'Hire a Freelancer' started by crazyflx, Mar 26, 2011.

  1. crazyflx

    crazyflx Elite Member

    Joined:
    Nov 9, 2009
    Messages:
    1,674
    Likes Received:
    4,825
    Location:
    http://CRAZYFLX.COM
    Home Page:
    I've got what you're about to read posted on 3 job sites (scriptlance, freelancer & odesk), so if you're a member of any of those and would prefer to do business somewhere where there is a real "escrow" involved, let me know and I'll post the links to the listings here.

    Anyway, here goes:

    My operating system: Windows 7 64-bit (but I would like the program to work on Windows XP, Vista & 7, in both 32-bit & 64-bit versions. This isn't a necessity, but I would like it).

    I'm going to briefly describe what I need here, and then below, I will describe IN DETAIL what I need.

    Brief Overview:
    I need a multi-threaded, proxy supporting, decaptcher.com supporting Google Adwords data scraper and a Google SERP scraper (not the search results, just the number of pages that appear for a given query). It needs to be able to do this with HUGE input files (1,000,000 line input files and smaller).

    It is going to have to scrape the following from Google Adwords Keyword Tool - Traffic Estimator with United States set as the default country & English as the default language:
    Local Monthly Searches and Estimated Cost Per Click (CPC).

    It is going to have to scrape the following from the google serps page itself:
    The number of results for any given query (it appears directly beneath the search bar after searching a query as: "About [insert number of results here] results" )


    In Detail -

    User Interface:
    In the user interface, I need to be able to select 2 .txt files.

    The first file should be my input queries, which will be a line separated list of keywords/phrases. The program needs to be able to handle input files of 1,000,000 lines or smaller.

    The second file should be a list of line separated proxies, in this format:
    ip address:port
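
    To make the proxy file format concrete, here is a minimal Python sketch of reading it. Python is only an illustration here, not a required language, and the function names (parse_proxy_line, load_proxies) are made up for this example. Skipping malformed lines instead of stopping is an assumption on my part.

```python
def parse_proxy_line(line):
    """Parse one 'ip address:port' line; return (host, port) or None if malformed."""
    line = line.strip()
    if not line:
        return None
    # Split on the last colon so the port is always the final field.
    host, separator, port = line.rpartition(":")
    if not separator or not host:
        return None
    if not (port.isascii() and port.isdigit()):
        return None
    port_number = int(port)
    if not 1 <= port_number <= 65535:
        return None
    return host, port_number


def load_proxies(lines):
    """Keep only the well-formed proxies from an iterable of lines (e.g. an open file)."""
    proxies = []
    for line in lines:
        parsed = parse_proxy_line(line)
        if parsed is not None:
            proxies.append(parsed)
    return proxies
```

    Because load_proxies takes any iterable of lines, it can be handed an open file object directly.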

    Also from the user interface, I need to be able to input my username & password that will be used to log into the Google Adwords Keyword Tool area.

    I should also be able to select the number of threads to work with from the user interface.

    Lastly, from the user interface, I need to be able to input my username & password for my account at decaptcher.com (decaptcher.com is an automated captcha solving service).



    Program Functions:
    The program has to visit (in the background, I don't want/need to see any of this actually happening) the Google Adwords Keyword Tool (this URL: https://adwords.google.com/select/KeywordToolExternal ).

    It then needs to log in using the details that were provided in the UI.

    The program then needs to navigate to the "Traffic Estimator" part of the Google Adwords Tool. You do this by clicking on "Traffic Estimator" which is under the "Tools" heading on the left hand side of the screen directly beneath "Keyword Tool".

    Once at the "Traffic Estimator" screen, it has to select United States as the default country location, and English as the default language (it must ALWAYS make sure those two things are selected).

    Then it has to take the input phrase it is currently working with from the input file and input it into the search box (labeled: Word or phrase (one per line) ) in three variations:

    1. The phrase as it was in the input file (e.g. phrase here )
    2. That same phrase in brackets (e.g. [phrase here] )
    3. That same phrase inside quotation marks (e.g. "phrase here" ).

    So it will look like this in the search box:

    input phrase
    [input phrase]
    "input phrase"

    Then it should click "estimate"
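
    The three-line search box text described above could be built like this. This is only a Python sketch; the function names are made up for illustration, and stripping whitespace from the input line is my assumption.

```python
def search_box_lines(phrase):
    """Return the three variations of a phrase, in the order listed above."""
    phrase = phrase.strip()  # drop the trailing newline from the input file
    return [
        phrase,               # as it was in the input file
        "[%s]" % phrase,      # in brackets
        '"%s"' % phrase,      # inside quotation marks
    ]


def search_box_text(phrase):
    """Join the variations one per line, as the search box expects."""
    return "\n".join(search_box_lines(phrase))
```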

    Once the data is returned, it needs to scrape the following for each of the three phrases:

    Local Monthly Searches & Estimated Cost Per Click (labeled: Estimated Avg. CPC )

    The scraped data for the phrase without quotes or brackets needs to be stored/remembered as "Searches (Broad)" & "Adwords CPC (Broad)"

    The scraped data for the phrase in brackets [], needs to be stored/remembered as "Searches (Exact)" & "Adwords CPC (Exact)"

    The scraped data for the phrase in quotations "", needs to be stored/remembered as "Searches (Phrase)" & "Adwords CPC (Phrase)"
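
    As a sketch of that labeling (Python for illustration only): the code below assumes each scraped result row still carries the keyword text exactly as it was submitted, with its brackets or quotation marks, so the match type can be read off the keyword itself. That is an assumption about what the tool returns, and the function names are hypothetical.

```python
def match_type_of(keyword_text):
    """Classify a submitted keyword as Broad, Exact ([...]) or Phrase ("...")."""
    text = keyword_text.strip()
    if len(text) >= 2 and text.startswith("[") and text.endswith("]"):
        return "Exact"
    if len(text) >= 2 and text.startswith('"') and text.endswith('"'):
        return "Phrase"
    return "Broad"


def label_estimates(rows):
    """Turn (keyword_text, searches, cpc) rows into the six named fields."""
    record = {}
    for keyword_text, searches, cpc in rows:
        match_type = match_type_of(keyword_text)
        record["Searches (%s)" % match_type] = searches
        record["Adwords CPC (%s)" % match_type] = cpc
    return record
```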

    The program then needs to visit Google.com itself, and search that same input phrase as it was taken from the input file two different ways.

    The first is that phrase in quotation marks (e.g. "phrase here" ). After it searches the input phrase at google.com inside quotation marks, it has to scrape the number of returned results for that phrase. This number appears directly beneath the search bar after searching for something. This data should be stored/remembered as "SEO Comp".

    The second way it has to search the input phrase, is like this:

    intitle:"input phrase"

    It needs to again scrape the number of returned results for that query. This data should be stored/remembered as "SEO Title Comp".
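
    For illustration, here is a small Python sketch of building the two queries and pulling the number out of the results line. It assumes the line reads like "About 1,230,000 results" with commas as thousands separators, as described above; other layouts or languages would need a different pattern. All names here are made up for the example.

```python
import re

# Matches the digits (with optional commas) right before "result" or "results".
_RESULT_COUNT = re.compile(r"([\d,]+)\s+results?", re.IGNORECASE)


def seo_comp_query(phrase):
    """Query for "SEO Comp": the phrase inside quotation marks."""
    return '"%s"' % phrase.strip()


def seo_title_comp_query(phrase):
    """Query for "SEO Title Comp": intitle: followed by the quoted phrase."""
    return 'intitle:"%s"' % phrase.strip()


def parse_result_count(stats_text):
    """Return the result count as an int, or None if no count is found."""
    match = _RESULT_COUNT.search(stats_text)
    if match is None:
        return None
    digits = match.group(1).replace(",", "")
    return int(digits) if digits else None
```

    Returning None instead of 0 when nothing matches keeps "no number found" separate from a real count of zero.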

    After this has been done for the phrase it is working with, it needs to export that data to a CSV file in real time and save the file. This way, it can remove that data from the program's working memory so the program doesn't continuously use more and more memory trying to "remember" all of the data it has scraped previously.

    When it does it for the next phrase, it needs to simply append that newly scraped data to the previously saved file.

    It needs to export the data into the CSV file in this format:

    keyword phrase as taken from input file,searches (broad),searches (phrase),searches (exact),adwords cpc (broad),adwords cpc (phrase),adwords cpc (exact)
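
    A rough Python sketch of the "append one row and save" behavior follows. It uses exactly the column order of the format line above, so the two SEO counts are left out because that line does not list them. The lock is my own addition so that several threads can share one output file; append_result and COLUMNS are hypothetical names.

```python
import csv
import threading

# Column order taken from the format line above (after the keyword itself).
COLUMNS = [
    "Searches (Broad)", "Searches (Phrase)", "Searches (Exact)",
    "Adwords CPC (Broad)", "Adwords CPC (Phrase)", "Adwords CPC (Exact)",
]

_write_lock = threading.Lock()  # assumption: one lock shared by all worker threads


def append_result(csv_path, keyword, record):
    """Append one keyword's data to the CSV file and close it again."""
    row = [keyword] + [record.get(name, "") for name in COLUMNS]
    with _write_lock:
        # Mode "a" appends; the file is flushed and closed on leaving the block,
        # so nothing about this keyword has to stay in memory afterwards.
        with open(csv_path, "a", newline="", encoding="utf-8") as handle:
            csv.writer(handle).writerow(row)
```

    The csv module quotes any keyword that itself contains a comma, so the columns stay aligned.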

    CAPTCHA Solving:
    Occasionally when querying lots of things in the adwords keyword tool, it will ask for you to solve a captcha. The program should "freeze" and solve the captcha using the decaptcher.com credentials given in the user interface in conjunction with the decaptcher.com API (can be downloaded for free at decaptcher.com after signing up which is completely free).
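
    One possible reading of "freeze", sketched in Python: while a captcha is being solved, every other thread waits before sending its next query. The solver is passed in as a plain function because the real decaptcher.com API calls are not shown here; CaptchaGate and its method names are hypothetical.

```python
import threading


class CaptchaGate:
    """Pauses all workers while a captcha is being solved."""

    def __init__(self, solver):
        self._solver = solver            # caller-supplied: image bytes -> answer text
        self._clear = threading.Event()
        self._clear.set()                # start in the "not paused" state
        self._lock = threading.Lock()    # only one captcha is solved at a time

    def is_paused(self):
        return not self._clear.is_set()

    def wait_until_clear(self):
        """Workers call this before each query; it blocks while paused."""
        self._clear.wait()

    def solve(self, image_bytes):
        """Pause everyone, ask the solver for the answer, then resume."""
        with self._lock:
            self._clear.clear()
            try:
                return self._solver(image_bytes)
            finally:
                self._clear.set()
```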

    Proxy Support:
    The program should change proxies after EVERY query, this goes for both the queries at the adwords tool AND the queries at google.com directly when getting the number of returned results.

    It should check to make sure the proxy worked, and if it didn't, it should try that same query with another proxy, and do this until it works.
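
    The rotate-and-retry rule could look roughly like this in Python. The actual request is passed in as a function (fetch) that raises ProxyError when the proxy did not work; both names are made up for this sketch. By default it keeps trying until it works, as described above; the optional max_attempts cap is my own addition.

```python
import itertools


class ProxyError(Exception):
    """Raised by the caller-supplied fetch function when a proxy fails."""


def fetch_with_rotation(query, proxy_cycle, fetch, max_attempts=None):
    """Run one query, taking a fresh proxy for every attempt."""
    attempts = 0
    while max_attempts is None or attempts < max_attempts:
        proxy = next(proxy_cycle)  # a new proxy for every query and every retry
        attempts += 1
        try:
            return fetch(query, proxy)
        except ProxyError:
            continue  # this proxy failed, so retry the same query with the next one
    raise ProxyError("no working proxy after %d attempts" % attempts)


def make_proxy_cycle(proxies):
    """Loop over the proxy list endlessly, starting again after the last one."""
    return itertools.cycle(proxies)
```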

    Multi-Threading Support:
    Everything that has been described above, has to happen SIMULTANEOUSLY using the number of selected threads in the UI.

    So if I select 10 threads, it should simultaneously be working with 10 different input phrases from the input file AT THE SAME TIME. It should always have 10 live threads. Meaning I don't want it to complete the current 10 threads, and then start a new 10 threads. It needs to always be working with 10 threads, so if it finishes one thread and is down to nine threads, it should start working with another query to make it 10 again.
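
    To illustrate the "always N live threads" behavior, here is a Python sketch in which each worker thread takes the next phrase the moment it finishes its current one, instead of waiting for a whole batch. handle_phrase stands in for all the scraping steps above and is supplied by the caller; run_workers is a hypothetical name. Collecting failures instead of letting a thread die is my own choice.

```python
import threading

_DONE = object()  # marker meaning "the input is exhausted"


def run_workers(phrases, handle_phrase, thread_count):
    """Process phrases with thread_count threads that stay busy until input runs out."""
    iterator = iter(phrases)   # can be an open file, so a huge input is never fully loaded
    lock = threading.Lock()
    failures = []

    def worker():
        while True:
            with lock:  # only one thread advances the shared iterator at a time
                phrase = next(iterator, _DONE)
            if phrase is _DONE:
                return
            try:
                handle_phrase(phrase)
            except Exception as error:  # keep the thread alive for the next phrase
                with lock:
                    failures.append((phrase, error))

    threads = [threading.Thread(target=worker) for _ in range(thread_count)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return failures
```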

    IF YOU HAVE ANY QUESTIONS OR DON'T UNDERSTAND SOMETHING, PLEASE ASK BEFORE MAKING A BID. DON'T JUST PLACE A BID EXPECTING TO FIGURE THINGS OUT LATER.
     
  2. Buddym

    Buddym Registered Member

    Joined:
    Feb 8, 2008
    Messages:
    80
    Likes Received:
    10
    Hi, I'm looking for the same tool. Do you have it already?
     
  3. fundoobuddy

    fundoobuddy Newbie

    Joined:
    May 25, 2009
    Messages:
    13
    Likes Received:
    3
    OP: sent you a PM
     
  4. MattDunbar

    MattDunbar Registered Member

    Joined:
    Oct 2, 2008
    Messages:
    72
    Likes Received:
    31
    Occupation:
    Social Media Consultant & Developer
    Location:
    Toronto, ON, Canada
    Not sure how much you're looking to spend here. I generally bill $25-45/hr (it would be $45 for C++, or possibly a bit less for Java; I can explain the differences to you), but I can figure out a flat rate for you. I expect this to land somewhere between 8k and 15k. I'd have to break things down to be more precise, which I don't want to do unless you're willing to spend that much.

    I know what I'm doing in terms of programming. I generally don't do desktop apps, but I know how, and I'm somewhat interested in this.

    Anyways, if you're willing to spend that much (and in turn, I promise to deliver an equally valuable product) send me a PM.

    EDIT:
    Also, regarding the escrow: since you're a reputable member here, I can provide references and work towards milestones where you pay for work that is already done (although, on this model, I would take payments for every 500-1000 worth of work completed).
     
    Last edited: Mar 29, 2011
  5. crazyflx

    crazyflx Elite Member

    Joined:
    Nov 9, 2009
    Messages:
    1,674
    Likes Received:
    4,825
    Location:
    http://CRAZYFLX.COM
    Home Page:
    Thanks to everybody who replied, I appreciate it.

    I've actually found a programmer to complete this project, and it is underway.

    Thanks again!
     
  6. BombaRuLz

    BombaRuLz Regular Member

    Joined:
    May 26, 2011
    Messages:
    202
    Likes Received:
    222
    Occupation:
    Web & Graphics Designer
    Location:
    Macedonia
    How's the process going? :) Nice thread :D
     
  7. ChanceLonestar

    ChanceLonestar Newbie

    Joined:
    Feb 23, 2009
    Messages:
    32
    Likes Received:
    2
    Are you gonna sell the app once it's done?
     
  8. Buddym

    Buddym Registered Member

    Joined:
    Feb 8, 2008
    Messages:
    80
    Likes Received:
    10
    I'd also like to know if you are gonna sell it to us?