
cURL and Twitter captcha

Discussion in 'PHP & Perl' started by Saulyx, Dec 14, 2010.

  1. Saulyx

    Saulyx Junior Member

    Joined:
    Jan 10, 2010
    Messages:
    107
    Likes Received:
    5
    Evening,

    I've run into a dead end making a Twitter account creator. I'm writing it in PHP because I know PHP best, but I can't seem to get the captcha image. The image is only pulled in when a JS button is clicked, and I can't seem to trigger that with cURL. Does anyone know a solution?


    Thanks a bunch!
     
  2. MaDeuce

    MaDeuce Newbie

    Joined:
    Oct 24, 2008
    Messages:
    45
    Likes Received:
    16
    Location:
    Austin, TX
    Ha, well I don't speak PHP, but I do speak Curl...

    A number of account creation implementations are like that now. The workaround is simple:

    1) Look at the source for the page and find the javascript that grabs the captcha image.

    2) Craft a regular expression that will grab the URL of the image (and nothing else).

    3) Within PHP, once you've gotten the source for the registration page, use the regex to grab the image's URL. Grab a copy of the image and send it to whatever captcha service you are using.

    4) Wait for the answer and stuff the answer into the correct field and submit.

    Really not much to it.

    --Ma
     
  3. Saulyx

    Saulyx Junior Member

    Not entirely sure about the 2nd point of your message. Do you mean running the code they use to get the captcha image? Looking at the file right now, they sure know how to hide their stuff.


    Thanks
     
  4. MaDeuce

    MaDeuce Newbie

    What I meant is that you have to write some sort of code that can locate the URL of the captcha image and that will programmatically give you that value so that you can have the captcha image decoded.

    The first step in doing that is manually looking at the source code for the web page and finding out where it is. In your case, the URL is embedded in some javascript somewhere. You have to find that javascript by reading the code.

    Once you've found the URL, you then have to figure out how to programmatically extract it from the web page, i.e., how to go from identifying this particular URL to identifying the captcha URL in general. Personally, I would do it using a regular expression that precisely matches the URL.

    Once you have a method for doing this, you just need to integrate it with the code you already have. For each page that you process (i.e., for each account you create), you'd use the regular expression to get the captcha's URL, have it decoded, and use the resulting value.

    There are other ways to locate the captcha's URL instead of simply reading the entire page. One would be to use HTTPheaders for Firefox. Manually do a registration and watch for the header with the captcha URL. Once you know what it looks like, you can then search/grep the web page's source to quickly find it.

    There are also automation tools, like Selenium, HtmlUnit, etc., that can be used to identify the javascript in question. In fact, they can be used for the automation as well, replacing cURL. I'm sure you can access them some way from PHP, but it's probably ugly.

    --Ma
     
  5. MaDeuce

    MaDeuce Newbie

    FYI, I just glanced at Facebook registration. If you just use, for example, Firefox, to save the source for the page, you will not see all the javascript, including the part you are interested in. You'll need to use PHP, curl, or something similar, to get the entire page source, including all javascript.

    This is what you are looking for:
    Code:
    var RecaptchaState = {
        site : '6LfbTAAAAAAAAE0hk8Vnfd1THHnn9lJuow6fgulO',
        challenge : '03AHJ_VutcCKryfDgIiNn-1QdKfsfaItr4emPs0ErA8OPSdFAH4FpA1WF4YLhjPIJMfdb3hVg0SatCt2jYM30yid7-PeGKsU9l17rXZpUuoQ3nBqCXRPHP4uwuIE5ae_UkhBI-_YAaWVOY0qs72bya3-SQ5xxCcV9kew',
        is_incorrect : false,
        programming_error : '',
        error_message : '',
        server : 'https://www.google.com/recaptcha/api/',
        timeout : 18000
    };
    
    You need to extract the relevant data from that code, and use it to create a URL for the captcha image. In this case, that URL will be:

    Code:
    https://www.google.com/recaptcha/api/image?c=03AHJ_VuvVRQ5pwQKUJG3i_NUfXg-zk_PwhmzylQ7sdpPAgUkLYUwDnMQNzHoL_3QFIcaQizkP3ixJ3P1G2r3VpdSUd-7Pu4BuzLH9bdTZUQ4IeZdpDIR-g9Vh3vAN8ZU7Cu0dWbsI-kIWGbpZKZ6Ml208JF1YgnEH8Q
    Obviously, the value following 'c=' is different for each captcha.
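
    The extraction step described above can be sketched as follows. This is a minimal illustration in Python, not code from the thread: the function names, the regular expressions, and the shortened sample values are all assumptions, and the same patterns should port to PHP's preg_match. It only parses a string and makes no network requests. The endpoint is the one quoted in this 2010 post and should not be assumed to still exist.

```python
import re
from urllib.parse import urlencode

# Sample page fragment, shortened from the RecaptchaState block quoted above.
# The site key and challenge token here are placeholders, not real values.
PAGE_SOURCE = """
var RecaptchaState = {
    site : 'SITE_KEY_PLACEHOLDER',
    challenge : '03AHJ_Vutc-EXAMPLE_token',
    is_incorrect : false,
    programming_error : '',
    server : 'https://www.google.com/recaptcha/api/',
    timeout : 18000
};
"""

# Hypothetical patterns: first isolate the object literal, then pick out
# every  name : 'value'  pair inside it (unquoted values are skipped).
BLOCK_RE = re.compile(r"var\s+RecaptchaState\s*=\s*\{(.*?)\}", re.DOTALL)
FIELD_RE = re.compile(r"(\w+)\s*:\s*'([^']*)'")


def parse_recaptcha_state(page_source):
    """Return the single-quoted string fields of RecaptchaState as a dict."""
    block = BLOCK_RE.search(page_source)
    if block is None:
        raise ValueError("RecaptchaState block not found in page source")
    return dict(FIELD_RE.findall(block.group(1)))


def captcha_image_url(state):
    """Build <server>image?c=<challenge>, the URL shape given in the post."""
    return state["server"] + "image?" + urlencode({"c": state["challenge"]})


if __name__ == "__main__":
    state = parse_recaptcha_state(PAGE_SOURCE)
    print(captcha_image_url(state))
```

    Isolating the block first keeps the field pattern from picking up unrelated name : 'value' pairs elsewhere on the page.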

    Good luck.

    --Ma
     
  6. Saulyx

    Saulyx Junior Member

    figured out that you get the image from
    Code:
    https://www.google.com/recaptcha/api/image?c=<key>
    Now I just need to pull the key out of the page so I can access the image.
     
  7. Saulyx

    Saulyx Junior Member

    Beat me to it, thanks :)
     
  8. Saulyx

    Saulyx Junior Member

    Registered the first account with cURL, great. Tried registering a second one...


    Any idea why?
     
  9. MaDeuce

    MaDeuce Newbie

    Never tried to create accounts on Facebook. However, most sites do their best to thwart automated account creation, and there are lots of ways they accomplish this. Be sure to clear cookies between each registration. You may need to vary referers, user agents, etc. I'd bet that the root cause is that they are limiting the number of new accounts created per IP address. You can figure that out through some experiments. If that's what's going on, you'll have to proxy or otherwise get access to unique IPs. But it could be something completely different.

    --Ma
     
  10. nonte

    nonte Registered Member

    Joined:
    Jun 9, 2011
    Messages:
    61
    Likes Received:
    18
    Saulyx, can you share the code, please?