Best Practices for Using CAPTCHA Solving Services

Discussion in 'Black Hat SEO' started by Gr33nHat, Nov 12, 2010.

    Hi Black Hatters,

    I would like to share some guidelines on best practices for using CAPTCHA Solving Services. I have noticed that there isn't much information about this. What is available only covers how to enter your username, password or API key; little is said about how these services work and how WHAT YOU DO may hurt your results and the results of others.

    First thing. How do they work?

    Step 1- You upload a CAPTCHA by whatever means you have.
    Step 2- Your CAPTCHA goes into a queue. This queue is usually First-in-First-Out.
    Step 3- Your CAPTCHA goes to a Human Solver (or, in Death By Captcha's case, to an OCR or a Human Solver).
    Step 4- Your CAPTCHA gets decoded and your text is returned.

    At first glance this type of system looks quite easy to maintain but:
    1. Why is there overload?
    2. Why are Solving Times low in some periods and really high in other periods of time?
    3. Why are there occasions in which all we get is an error message?

    Plenty of times it is the Service Provider's fault that these things happen, but MOST of the time it's due to the highly irregular ways in which users submit their CAPTCHAs, which means there is often almost nothing the Service Provider can do to prevent it.

    You must be thinking: - "Wait wait wait... this is my fault? How come?"

    Well, the 3 main causes of these problems are:
    1. Bursts
    2. Bursts
    3. Bursts

    Yeah, Bursts.

    • Bursts of CAPTCHAs fill up the queue until the system is overloaded and impossible to use.
    • Bursts of CAPTCHAs from other users may push your solving time up to 120 seconds from one minute to the next.
    • Bursts of CAPTCHAs take up so many connections in so little time that server resources run out and everyone else gets a CAPTCHA error.

    ...Yup, Bursts.
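    A rough sketch of the effect, assuming a toy First-in-First-Out queue with a fixed pool of solvers. The solver count, the solving time and the function name are made up for illustration; they are not figures from any real service.

```python
import heapq

def max_wait_seconds(arrival_times, solvers, solve_seconds):
    """Longest time any CAPTCHA waits in a FIFO queue before a solver picks it up.

    arrival_times must be sorted; all figures are hypothetical.
    """
    # Each heap entry is the time at which one solver becomes free.
    free_at = [0] * solvers
    heapq.heapify(free_at)
    worst = 0
    for arrived in arrival_times:
        solver_free = heapq.heappop(free_at)
        start = max(arrived, solver_free)   # wait if every solver is busy
        worst = max(worst, start - arrived)
        heapq.heappush(free_at, start + solve_seconds)
    return worst

# Same 600 CAPTCHAs, 20 solvers, 15 seconds per CAPTCHA (assumed values).
steady = list(range(600))   # one CAPTCHA per second for 10 minutes
burst = [0] * 600           # all 600 submitted at once

print(max_wait_seconds(steady, solvers=20, solve_seconds=15))  # 0
print(max_wait_seconds(burst, solvers=20, solve_seconds=15))   # 435
```

    Same volume, same solvers: spread out, nobody waits; sent as one burst, the last CAPTCHA waits over 7 minutes, and so does anything another user submits behind it.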

    In one particular case, we had a user who wanted to decode 10,000 CAPTCHAs in 5 minutes... IN PEAK HOUR! Our first impression was that we were under a DoS attack; what we found out was that this user was late with an account creation order. All the user accomplished was to delay his order even more and cause massive overload during an already fairly loaded hour. Most CAPTCHA Solving Services can handle this load when prepared; the big problem is unexpected and irregular loads. Technology still hasn't advanced to the point where we can create extra instances of "Human Solvers" instantaneously :).

    Bursts are the worst case scenario. You should never try them, even if you've gotten good results in the past. And even if you're not "bursting" as much as the user I just mentioned, you should plan to do the complete opposite.

    After more than 3 years of sending pure BULK to CAPTCHA decoding servers, this is the method I use, and it has never failed:

    1. Plan the number of CAPTCHA requiring activities you need to do in a day.
    2. Make a 1 hour run of your software along with your CAPTCHA bypass service. Use a moderate number of instances of your software; any number that has worked flawlessly before.
    3. Count the number of successful "activities" in that hour.
    4. Figure out the number of instances you need to complete your day's goals in 21 hours. Always calculate your instances based on 21 hours in case you run into overload along the way; this gives you an additional 3 hours as a safety cushion to complete your day's goals.

    - "Ok, James... How the **** do I calculate the number of instances?"

    Ins_1hour = Instances used in 1 hour.
    Suc_1hour = Successful CAPTCHA Requiring "activities" completed in 1 hour.
    Suc_24hour = Goal of Successful CAPTCHA Requiring "activities" needed in 24 hours.
    Ins_Needed = Number of Instances needed to complete your day's goals (round up to the next whole number).
    Ins_Needed = (Suc_24hour x Ins_1hour) / (21 x Suc_1hour)

    *This formula doesn't take into account other outside variables like proxy availability, bandwidth or system resources. PM me if you need help with it :)
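    The formula above can be sketched in a few lines of Python. The function and parameter names are mine, not part of any tool, and rounding up to a whole instance is an assumption on my part, since you can't run a fraction of an instance.

```python
import math

def instances_needed(daily_goal, trial_instances, trial_successes, hours=21):
    """Ins_Needed = (Suc_24hour x Ins_1hour) / (hours x Suc_1hour), rounded up.

    daily_goal      -> Suc_24hour
    trial_instances -> Ins_1hour
    trial_successes -> Suc_1hour
    hours           -> working window; 21 leaves a 3 hour cushion
    """
    if trial_successes <= 0 or hours <= 0:
        raise ValueError("trial_successes and hours must be positive")
    exact = (daily_goal * trial_instances) / (hours * trial_successes)
    return math.ceil(exact)

# 10 instances produced 600 successes in the 1 hour run; goal is 10,000 per day.
print(instances_needed(10_000, 10, 600))  # 8
```

    Like the formula itself, this ignores proxy availability, bandwidth and system resources.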

    And tah dah! Now you will be less likely to unintentionally cause overload or have your goals affected by overload.

    But what if the number of CAPTCHAs you send per day is not so great, or you have an emergency in which you have to send a day's worth of CAPTCHAs in fewer hours?

    My advice? Try to stay away from the busy hours, which are usually from 4:00 PM to 8:00 PM EST (GMT-5). Why? Most of the Human Solvers reside in India, and this is "the graveyard shift" for them (roughly 2:30 AM to 6:30 AM in India). Fewer Human Solvers are available at this time, and those who are tend to be more tired than the ones working other shifts, so you are more likely to get errors under high load in this window than at any other time. If you have no other choice, then go ahead, but keep your eyes open :).
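    If you want to check the window from a script, here is a small sketch. It assumes EST means a fixed UTC-5 offset and India is UTC+5:30, and it ignores daylight saving; the constant and function names are only for illustration.

```python
from datetime import datetime, timedelta, timezone

# Fixed offsets (assumption: no daylight saving adjustment).
EST = timezone(timedelta(hours=-5), "EST")
IST = timezone(timedelta(hours=5, minutes=30), "IST")

BUSY_START_HOUR = 16  # 4:00 PM EST
BUSY_END_HOUR = 20    # 8:00 PM EST

def is_busy_hour(moment):
    """True if a timezone-aware datetime falls inside the 4 PM to 8 PM EST window."""
    local = moment.astimezone(EST)
    return BUSY_START_HOUR <= local.hour < BUSY_END_HOUR

def india_time(moment):
    """The same instant expressed in Indian time."""
    return moment.astimezone(IST)

moment = datetime(2010, 11, 12, 17, 0, tzinfo=EST)
print(is_busy_hour(moment))                   # True
print(india_time(moment).strftime("%H:%M"))   # 03:30
```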

    In Summary:
    1. Avoid sending Bursts.
    2. Plan your daily goals.
    3. If possible, stay away from the "Indian graveyard shift" hours when sending unexpected Bulk.

    I hope this information is useful to all of you.
    And whatever CAPTCHA Solving Service you use... Happy Captcha Killing :)

    James L
    Death By Captcha
    I got 10 instances and made 600 accounts in an hour. If I want to create 10,000 accounts per day, what should I do? I didn't understand the formula.
    You would need 8 instances: (10,000 x 10) / (21 x 600) = 7.94, rounded up to 8.
    I didn't want to make a new thread just for this, but there's a very detailed .pdf I found about understanding Captcha solving, and how the market is, which service has best price / quality, etc.

    Check it out, it's very well written. Captcha Ebook