Google captcha solving with DeathByCaptcha

Discussion in 'Black Hat SEO' started by maxrosecollins, Sep 11, 2013.

  1. maxrosecollins

    maxrosecollins Newbie

    Joined:
    Sep 11, 2013
    Messages:
    7
    Likes Received:
    1
    Hi,

    I am building a program to check for keyword positions.

    I am writing this in Ruby and using Mechanize to visit pages, submit forms, find the content I need, etc.

    Everything is working correctly except when I make too many requests too quickly, then Google presents my app with a captcha to solve.

    I have run into a problem with Google's captchas. I can send them to DeathByCaptcha and get the response okay.
    The issue is that Google gives you a URL for the captcha, but if you reload this URL the captcha changes to a different image.

    So my solution is to save the image and send that to DeathByCaptcha, as sending the URL obviously won't work.

    The bit I am stuck with is this
    • I make a request to Google and get the page which displays the Captcha and form to solve it.
    • I then have to make another request to download the image.

    Making a second request to download the image changes the captcha, so the response I get back from DeathByCaptcha, whilst correct for the image I sent, doesn't work when I enter it on the Google form and submit it.

    Does anyone know how to do this in one request, or how I can overcome this issue with a different method? I've been trying to figure it out for a week now with no success, so I thought I would ask here.

    I would be happy to provide the full code if someone can help :)

    Below is an example of what I am doing


    Code:
    begin
      a.get('URL to google page that displays Captcha')
    rescue Mechanize::ResponseCodeError => e
      # find captcha url on page
      imageurl = e.page.search('img')[0]['src']
      # save captcha image (this is the second request, which returns a new image)
      a.get(imageurl).save "path/to/folder/image_name.jpg"
      # send image to deathbycaptcha
      # get response
      # submit google form with response
    end
     
  2. prab1996

    prab1996 Elite Member

    Joined:
    Jan 8, 2013
    Messages:
    3,496
    Likes Received:
    2,027
    Occupation:
    your gf's <3 ♥♥♥♥
    Location:
    Prab1996.com
    PM tompots or fatboy.
     
  3. maxrosecollins

    maxrosecollins Newbie

    Joined:
    Sep 11, 2013
    Messages:
    7
    Likes Received:
    1
    Okay, will do! Thank you.

    Damn, I need 15 posts first :/
     
    Last edited: Sep 11, 2013
  4. GoogleCrack

    GoogleCrack Jr. VIP Jr. VIP

    Joined:
    Feb 23, 2012
    Messages:
    268
    Likes Received:
    100
    Gender:
    Male
    Occupation:
    RankTracker.com
    Location:
    London
    Have you tried Stack Overflow?
     
  5. maxrosecollins

    maxrosecollins Newbie

    Joined:
    Sep 11, 2013
    Messages:
    7
    Likes Received:
    1
    I have asked, but they don't seem too keen on helping me break Google's captcha :p
     
  6. apoorv

    apoorv Regular Member

    Joined:
    Aug 31, 2011
    Messages:
    301
    Likes Received:
    62
    You can go a different route. Get some proxies (shared proxies should work just fine, dedicated are even better) and add them to the Mechanize instance. Loop through different headers and you will start seeing those CAPTCHA images a *lot* less. The key is to get a ton of proxies (and they don't really cost a whole lot). :)

    I'm not completely sure how this would work in Ruby, but I have a working version in Python. Something like this?

    a.proxy_host = ...
    a.proxy_port = ...

    --

    I have a faint idea of how you could make it work in the situation you want, too, but I'm not completely sure, again. :)

    Can you give me an example of the kind of captcha they throw at you? Maybe the HTML output? Recaptcha URLs are fairly easy to get. Though, again, if you go that route, you have to rely on the Captchas being solved correctly and quickly, etc. because from the same IP, you *will* get those captchas all the time. It will also build up your catch rather quickly.
     
  7. JimmyConway

    JimmyConway BANNED BANNED

    Joined:
    Feb 12, 2010
    Messages:
    482
    Likes Received:
    227
    If you cannot wait until you get 15 posts to message Tompots, just create a thread speaking directly with him and when he responds, give him your skype or something. Though, it's not that hard to get 15 posts. Good luck breaking their captcha.
     
  8. maxrosecollins

    maxrosecollins Newbie

    Joined:
    Sep 11, 2013
    Messages:
    7
    Likes Received:
    1
    Hi apoorv,

    Yeah, I have been using proxies. How many would you recommend?
    I've got 10 dedicated ones at the moment. Google seems to allow around 50-100 requests on each one before it starts to throw up the captchas.
    When the captcha comes up it won't go away on that proxy IP for a good 6 hours though :/
    I've even set it to use each proxy a few times and then wait a while before reusing the same one, and to put the blocked proxies into a pool for a cooling-off period.
    It also waits a while before moving on to the next keyword.
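
    The cooling-off pool described above could look roughly like the sketch below. This is only an illustration and not the code from my app: the class name `ProxyPool`, its methods, and the example addresses are made up, and the 6-hour default is just the block duration I mentioned. It assumes the caller decides when a proxy counts as blocked.

```ruby
# Hypothetical helper: rotates through proxies and benches blocked ones
# until their cooling-off period has passed.
class ProxyPool
  Proxy = Struct.new(:host, :port, :blocked_until)

  # cooldown is in seconds; 6 hours is an assumed default, not a known limit.
  def initialize(proxies, cooldown: 6 * 60 * 60)
    @proxies = proxies.map { |host, port| Proxy.new(host, port, nil) }
    @cooldown = cooldown
    @cursor = 0
  end

  # Returns the next proxy that is not cooling off, or nil if all are blocked.
  def next_proxy(now = Time.now)
    @proxies.size.times do
      proxy = @proxies[@cursor]
      @cursor = (@cursor + 1) % @proxies.size
      return proxy if proxy.blocked_until.nil? || proxy.blocked_until <= now
    end
    nil
  end

  # Call this when a request through the proxy gets the captcha page.
  def mark_blocked(proxy, now = Time.now)
    proxy.blocked_until = now + @cooldown
  end
end

# Example addresses are placeholders.
pool = ProxyPool.new([["10.0.0.1", 8080], ["10.0.0.2", 8080]])
proxy = pool.next_proxy
# With Mechanize, the chosen proxy would then be applied with
# a.set_proxy(proxy.host, proxy.port) before making the request.
```

    Returning nil when every proxy is cooling off leaves it to the caller to sleep or stop, rather than hiding the wait inside the pool.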

    What is your faint idea? Any clues or new paths I can try would be great.
    I have it working perfectly except for this captcha issue, so really starting to piss me off :p


    This is an example url of a captcha image - google co uk /sorry/image?id=12489401962981879599&hl=en
    I can't post links yet :(

    And this is the page that displays it
    google co uk /sorry/



    I'm considering rewriting the whole thing using a headless browser like PhantomJS, but it feels like I'm so close at the moment.
    I'm saving a captcha image okay, but not the correct one! I just need to save the captcha on the first request to Google, not the second, as that gives a new image :/
     
    Last edited: Sep 11, 2013
  9. maxrosecollins

    maxrosecollins Newbie

    Joined:
    Sep 11, 2013
    Messages:
    7
    Likes Received:
    1
    I have spoken to him on Skype but he couldn't help.
    Thanks though.
     
  10. apoorv

    apoorv Regular Member

    Joined:
    Aug 31, 2011
    Messages:
    301
    Likes Received:
    62
    The version I had had about 130 IPs. You can get them here: http://proxybonanza.com/signup/shared_proxy

    6 hours is a lot! I thought it would be more like a few minutes. :confused:

    What you could do here is take a screenshot of that specific area, if that's possible. I know it's possible to take the screenshot of the entire page using capybara-screenshot, but since you will be taking it on a page with content, I think DBC will have problems with that. That's worth exploring though.

    Another, and I think better, idea is this:

    <img src="/sorry/image?id=7949660497880160830&hl=en" border="1" alt="Please enable images"><br><br><form action="CaptchaRedirect" method="get"><input type="hidden" name="continue" value="http://google.co.uk/sorry"><input type="hidden" name="id" value="7949660497880160830">

    The value of the hidden field and the image source are exactly the same. So, one way of doing it could be to download the image, note the "new" URL (since it changes), and then fake an input field with the same attributes and submit the form with the saved captcha and the form fields. This sounds doable to me, but I don't know how easy/difficult it is.
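
    A minimal sketch of checking that the two ids match, using only the markup quoted above. The helper names `image_id` and `hidden_id` are made up for illustration, and the regexes assume the attributes appear in the order shown here; a real page may be laid out differently (or HTML-escape the `&`), in which case a proper HTML parser would be the safer choice.

```ruby
require "uri"

# Markup copied from the snippet above; a real page would come from the response body.
html = '<img src="/sorry/image?id=7949660497880160830&hl=en" border="1" alt="Please enable images"><br><br>' \
       '<form action="CaptchaRedirect" method="get">' \
       '<input type="hidden" name="continue" value="http://google.co.uk/sorry">' \
       '<input type="hidden" name="id" value="7949660497880160830">'

# Hypothetical helper: read the id parameter from the first image's src.
def image_id(html)
  src = html[/<img[^>]+src="([^"]+)"/, 1]
  return nil unless src
  query = URI.parse(src).query
  URI.decode_www_form(query.to_s).to_h["id"]
end

# Hypothetical helper: read the value of the hidden input named "id".
def hidden_id(html)
  html[/<input[^>]+name="id"[^>]+value="([^"]+)"/, 1]
end

puts image_id(html) == hidden_id(html)
```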
     
  11. maxrosecollins

    maxrosecollins Newbie

    Joined:
    Sep 11, 2013
    Messages:
    7
    Likes Received:
    1
    I'm a little confused.
    The URL of the image only changes if you refresh the page.
    If you open the image in a window by itself and then refresh, the image will change but the URL will stay the same.

    For example, visit this link and refresh: the image keeps changing but the URL stays the same - google co uk /sorry/image?id=7587195847898573304&hl=en


    Do you have Skype? Might be easier to talk on there.
     
    Last edited: Sep 11, 2013
  12. apoorv

    apoorv Regular Member

    Joined:
    Aug 31, 2011
    Messages:
    301
    Likes Received:
    62
    Yep. Let's talk on Skype. You can add rajeev_sh (my partner's account), and I'll be online in a little while—I'm out right now.

    Just FTR, the URL in the page source is different each time even though the browser URL doesn't change. The URL in the source is what I was talking about. I also meant the /sorry page specifically, the one with some text on it.

    But, yes, Skype should be easier and faster.
     
    Last edited: Sep 11, 2013
  13. maxrosecollins

    maxrosecollins Newbie

    Joined:
    Sep 11, 2013
    Messages:
    7
    Likes Received:
    1
    Okay, done!
     
  14. GoogleCrack

    GoogleCrack Jr. VIP Jr. VIP

    Joined:
    Feb 23, 2012
    Messages:
    268
    Likes Received:
    100
    Gender:
    Male
    Occupation:
    RankTracker.com
    Location:
    London
    I ordered from proxybonanza once and their proxies are awful! They share them with way too many people. I ordered the 200 proxies and none of them ever worked. I tried them every 6 hours for 3 days, then requested a refund. Maybe I just got unlucky!
     
  15. thepjar

    thepjar Newbie

    Joined:
    Feb 10, 2014
    Messages:
    1
    Likes Received:
    0
    Unfortunately it's not possible with Mechanize: you have to make 2 requests. I'm working on a similar solution using Capybara with Poltergeist, a driver for the headless PhantomJS browser, which saves content upon visiting a page. It's a bit slower than Mechanize, but at least it can save the content, from which it's possible to fetch an image. Will post my progress.