CAPTCHA Farms & Challenges of CAPTCHA Bot Detection


What is a CAPTCHA farm?

A CAPTCHA farm is an automated service that bot developers can query to have a pool of human workers, usually in developing countries, solve CAPTCHAs for bots. Some notable CAPTCHA farms are 2Captcha and DeathByCaptcha—and these services are a direct response to the prevalence of traditional CAPTCHAs across all online industries.

Easy to implement and often free, traditional CAPTCHAs are widely used as a basic bot protection measure. But they’re not immune to bots (in fact, data shows 50% of passed reCAPTCHAs are actually completed by bots). That’s because fraudsters are finding increasingly sophisticated ways to bypass traditional CAPTCHAs, such as leveraging artificial intelligence (AI) for automated image and audio recognition.

An increasingly common strategy relies on a more time-tested problem-solving device: human labor. In an interesting turn of the tables, CAPTCHA farms aren’t about bot-assisted humans; they’re about human-assisted bots.

How do CAPTCHA farms work?

Bot developers query CAPTCHA farm services via an API to automate the enlistment of humans to solve CAPTCHAs. Instead of using AI to solve CAPTCHA challenges, CAPTCHA farms distribute CAPTCHAs to a pool of human workers at a very low price.


Because labor is cheap in the countries where these workers are based, CAPTCHA-solving services can cost as little as $1-3 per 1,000 CAPTCHAs solved, depending on the type of CAPTCHA (image or text-based CAPTCHAs, reCAPTCHAs, hCaptchas, GeeTests, etc.).
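To put that price range in perspective, here is a minimal arithmetic sketch. The function names and the example volumes are illustrative assumptions; only the $1-3 per 1,000 range comes from the text above.

```python
def solving_cost_usd(captcha_count: int, price_per_thousand: float) -> float:
    """Fee for solving `captcha_count` CAPTCHAs at a given per-1,000 price."""
    return captcha_count / 1000 * price_per_thousand


def cost_range_usd(captcha_count: int, low: float = 1.0, high: float = 3.0):
    """Low and high fee estimates for the $1-3 per 1,000 range quoted above."""
    return (solving_cost_usd(captcha_count, low),
            solving_cost_usd(captcha_count, high))


if __name__ == "__main__":
    # Illustrative volumes only; they are not taken from any real campaign.
    for volume in (1_000, 100_000, 1_000_000):
        low, high = cost_range_usd(volume)
        print(f"{volume:>9,} CAPTCHAs: ${low:,.0f} to ${high:,.0f}")
```

Even at the top of the range, a hundred thousand solved CAPTCHAs costs a few hundred dollars, which is why price alone does little to deter a motivated attacker.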

Let’s follow the trajectory of a CAPTCHA-farm-assisted bot that gets challenged with a CAPTCHA. Here’s what happens:

The bot is blocked by a CAPTCHA challenge.
The bot makes an API call to the CAPTCHA farm with the website’s CAPTCHA public key and domain name as parameters.
The CAPTCHA farm asks one of its workers to solve the CAPTCHA.
After ~30-45 seconds, the CAPTCHA is solved and the bot obtains its response token.
The bot passes the CAPTCHA by submitting the response token to the website.

In short, solving a CAPTCHA is as simple as calling a function in the bot’s code. The attacker doesn’t even need to interact directly with the CAPTCHA by clicking on it. If the attackers know the structure and the URL of the CAPTCHA callback, i.e. the request where the website sends the CAPTCHA response token after a successful response has been submitted, they can prove that they’ve solved a CAPTCHA without even using a real browser.
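Why does this work? A toy, in-memory model can make the weakness concrete. Everything here (the `ToySite` class and its methods) is a hypothetical stand-in with no network calls and no real CAPTCHA provider. It only shows that a token check, by itself, says nothing about who solved the challenge or which client submits the result.

```python
import secrets


class ToySite:
    """Hypothetical site that accepts any valid, unused CAPTCHA token."""

    def __init__(self) -> None:
        self._unused_tokens = set()

    def issue_token_after_solve(self) -> str:
        # Stand-in for a token being issued once someone, anyone,
        # has solved the challenge correctly.
        token = secrets.token_hex(16)
        self._unused_tokens.add(token)
        return token

    def verify(self, token: str) -> bool:
        # The check asks "is this token valid and unused?" and nothing else.
        if token in self._unused_tokens:
            self._unused_tokens.remove(token)
            return True
        return False


if __name__ == "__main__":
    site = ToySite()
    token = site.issue_token_after_solve()  # solved by a remote worker
    print(site.verify(token))               # submitted by a different client
    print(site.verify(token))               # replaying the same token fails
```

A site that stops at `verify` has no way to tell a farm-solved token from one earned by a person actually browsing the page, which is the gap the rest of this article is about.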

CAPTCHA farms enable bot developers to significantly reduce their infrastructure costs. For an attacker conducting large-scale crawling or credential stuffing attacks, using real automated browsers or automated headless browsers is costly. It requires significant computational resources (RAM/CPU) compared to a bot that only uses a simple HTTP request library such as Curl, the Python urllib.request module, or the Axios library in Node.js.

CAPTCHA farms enable bot developers to run their bots with cheaper infrastructure, which is why the low service fees deliver an excellent return on investment.

CAPTCHA bots target every industry.

If you believe that your website isn’t impacted by CAPTCHA farms, you’re probably wrong. At DataDome, we see CAPTCHA bots in every domain.

Here are three quick snapshots from three very different industries.

Case Study #1: Public Transportation Website & App

A customer in the public transportation industry activated the DataDome bot protection solution on all its websites and applications.

Before the protection was implemented, bots could easily crawl the site. It was also targeted by frequent credential stuffing attacks. Once we activated the protection, the bot traffic volume decreased significantly. This is a common observation: When bot operators realize that their bots are being blocked, many will simply stop, choosing to look for another victim instead.

A few days after activating the protection, we started to see an uptick in solved CAPTCHAs for the customer.

The volume of solved CAPTCHAs (green curve) starts to increase around April 11.

A superficial analysis could have easily concluded that the solved CAPTCHAs were false positives, i.e. humans mistakenly identified as bots. However, our detection engine was able to determine that they originated from CAPTCHA farms.

In the span of six days, our CAPTCHA bot detection invalidated approximately 12,000 CAPTCHAs solved by CAPTCHA farm workers. (And no, our customer received zero complaints about legitimate users being blocked.)

Case Study #2: Price Comparison Website

Our second case study is a high-traffic price comparison website that constantly came under fire from CAPTCHA bots. Between November 2019 and April 2020, bots attempted to forge more than 265,000 CAPTCHAs using CAPTCHA farms.

Significant CAPTCHA farm activity (the red curve) on a price comparison website.

The graph above shows an interesting phenomenon: In some instances, the volume of CAPTCHAs submitted by CAPTCHA bots is higher than the volume of CAPTCHAs served. How can that be?

It’s because we only consider a CAPTCHA as served when the browser has executed the JavaScript responsible for rendering the CAPTCHA. This means the bots that submitted the “surplus” CAPTCHAs did not even come from real browsers.
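The served-versus-submitted comparison can be sketched as a small ledger. This is a simplified illustration, not DataDome’s implementation: the `CaptchaLedger` class, the challenge IDs, and the label strings are all assumptions made for the example.

```python
from collections import Counter


class CaptchaLedger:
    """Tracks which challenges were actually rendered by a browser."""

    def __init__(self) -> None:
        self._served = set()

    def record_served(self, challenge_id: str) -> None:
        # Called only when the rendering JavaScript has executed.
        self._served.add(challenge_id)

    def classify_submission(self, challenge_id: str) -> str:
        # A submission with no matching render cannot come from a real browser.
        return "rendered" if challenge_id in self._served else "never_rendered"


def summarize(ledger: CaptchaLedger, submissions) -> Counter:
    """Count submissions by whether their challenge was ever rendered."""
    return Counter(ledger.classify_submission(cid) for cid in submissions)


if __name__ == "__main__":
    ledger = CaptchaLedger()
    for cid in ("c1", "c2"):
        ledger.record_served(cid)
    # Four submissions arrive, but only two challenges were rendered.
    print(summarize(ledger, ["c1", "c2", "c3", "c4"]))
```

Any submission that lands in the `never_rendered` bucket is part of the “surplus” and can be treated as automated regardless of whether its token is valid.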

A website relying exclusively on CAPTCHAs for bot protection would have been tricked into accepting the visitors as humans, since the CAPTCHAs were effectively “passed.”

In contrast, DataDome’s detection engine analyzes ~250 signals for every request to determine whether a visitor is human or bot. A CAPTCHA is only accepted as passed when we are sure that it has been solved by a human actually browsing the website, not by a CAPTCHA farm worker.

Case Study #3: Retail Website

Our last example is a large retailer with both online and physical stores. During the first three weeks of February 2020, its applications and websites received more than 26,000 CAPTCHAs forged by CAPTCHA bots.

Regular CAPTCHA farm activity on a retailer website.

Again, a website with no other bot protection solution would have taken the solved CAPTCHAs at face value and let the bots through. This illustrates our main point: CAPTCHA farms make traditional CAPTCHAs an inadequate bot protection measure.

Detecting CAPTCHA Farms

Detecting bots that leverage CAPTCHA farms is challenging. In fact, many bot management providers accept solved CAPTCHAs as proof of the visitor’s humanity. (We know otherwise.)

Other bot management providers will give the user a certain “credit” of allowed requests, based on the session cookie, after passing a traditional CAPTCHA. In this case, the bot just needs to send its trusted session cookie along with malicious requests in order to avoid being challenged for a while.
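To see why the “credit” approach is fragile, consider this minimal sketch of such a scheme. The `SessionCredit` class and the budget of three requests are hypothetical; real products differ in the details.

```python
class SessionCredit:
    """Grants a fixed request budget to a session after a passed CAPTCHA."""

    def __init__(self, allowed_requests: int) -> None:
        self.allowed_requests = allowed_requests
        self._remaining = {}

    def grant(self, session_cookie: str) -> None:
        # Called once the CAPTCHA for this session has been passed.
        self._remaining[session_cookie] = self.allowed_requests

    def allow(self, session_cookie: str) -> bool:
        # Trust is tied to the cookie alone, not to whoever presents it.
        remaining = self._remaining.get(session_cookie, 0)
        if remaining <= 0:
            return False
        self._remaining[session_cookie] = remaining - 1
        return True


if __name__ == "__main__":
    credit = SessionCredit(allowed_requests=3)
    credit.grant("cookie-from-farm-solved-captcha")
    print([credit.allow("cookie-from-farm-solved-captcha") for _ in range(4)])
```

Because `allow` looks only at the cookie, one farm-solved CAPTCHA buys the bot the whole budget of unchallenged requests.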

Solved CAPTCHAs can also be used as a feedback mechanism for false positives (we do this at DataDome). If a human is blocked by mistake, our detection system corrects the error by letting the user continue to browse the website after solving the CAPTCHA. Since our bot detection system uses machine learning, the algorithm uses any mistakes to self-correct.

The DataDome dashboard shows the number of CAPTCHAs passed, which enables users to monitor detection quality.

CAPTCHA farms make the CAPTCHA feedback loop less reliable. If the system accepts solved CAPTCHAs as absolute proof of humanity, CAPTCHA farms will increase the number of false negatives (bots passing as humans).

On the other hand, if the bot protection system invalidates CAPTCHAs too diligently, it increases the risk of false positives (hard-blocking legitimate human users).

In order to efficiently stop bad bots while preserving the user experience for humans, accurate detection of CAPTCHA farms is essential.

Although we can’t give away too much detail about how we detect CAPTCHA farms, below are a couple of examples of the high-level approaches that DataDome’s real-time detection engine relies on to detect bots that submit CAPTCHAs solved by CAPTCHA farm workers.

Fingerprinting

Our detection engine always conducts a deep analysis of visitors’ browser fingerprints. If the engine is 100% sure that the user is a bot (e.g. if it detects a modified Selenium, Puppeteer, or Playwright bot), it will automatically invalidate the CAPTCHA passed.

Besides detecting well-known browser automation frameworks and headless browsers, our engine analyzes countless other fingerprinting signals in order to make its decision.
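As a rough illustration of the principle, the sketch below invalidates a passed CAPTCHA when two well-known automation signals are present. The fingerprint field names are assumptions for this example, and the checks are deliberately basic: modified frameworks typically hide exactly these signals, which is why a production engine needs many more.

```python
# Substrings that some automation tools have advertised in their default
# user agent. An illustrative, far from exhaustive, list.
AUTOMATION_UA_MARKERS = ("HeadlessChrome", "PhantomJS")


def looks_automated(fingerprint: dict) -> bool:
    """Return True if the (hypothetical) fingerprint shows automation signals."""
    # `navigator.webdriver` is true in browsers driven through WebDriver.
    if fingerprint.get("navigator_webdriver") is True:
        return True
    user_agent = fingerprint.get("user_agent", "")
    return any(marker in user_agent for marker in AUTOMATION_UA_MARKERS)


def accept_captcha(token_valid: bool, fingerprint: dict) -> bool:
    """A valid token is accepted only if the client does not look automated."""
    return token_valid and not looks_automated(fingerprint)


if __name__ == "__main__":
    bot = {"navigator_webdriver": True, "user_agent": "Mozilla/5.0"}
    human = {"navigator_webdriver": False, "user_agent": "Mozilla/5.0"}
    print(accept_captcha(True, bot), accept_captcha(True, human))
```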

Solve Speed

Alongside the browser fingerprints of visitors, we also look at how quickly a visitor solves a CAPTCHA. The graph below shows two cumulative distribution functions: one for how quickly a CAPTCHA farm worker solves a CAPTCHA, another for how quickly other users solve them.

The blue line (CAPTCHA farm workers) grows significantly faster than the orange line (other users). This means that CAPTCHA farm workers are much faster at solving CAPTCHAs than regular users. Indeed, CAPTCHA farm workers solved around 50% of CAPTCHAs in less than 5 seconds, whereas regular users solved only around 30% in that time.
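The comparison boils down to evaluating each group’s empirical cumulative distribution at a threshold. Here is a minimal sketch; the solve times are synthetic values chosen to mirror the proportions quoted above, not real measurements.

```python
def fraction_solved_within(solve_times_s, threshold_s: float) -> float:
    """Empirical CDF: share of solves that took less than `threshold_s` seconds."""
    if not solve_times_s:
        raise ValueError("need at least one solve time")
    return sum(t < threshold_s for t in solve_times_s) / len(solve_times_s)


# Synthetic solve times in seconds (illustrative only).
FARM_WORKER_TIMES = [2.1, 3.0, 3.8, 4.2, 4.9, 6.5, 8.0, 11.0, 14.0, 20.0]
REGULAR_USER_TIMES = [3.5, 4.0, 4.8, 6.0, 7.5, 9.0, 12.0, 15.0, 22.0, 30.0]

if __name__ == "__main__":
    print(fraction_solved_within(FARM_WORKER_TIMES, 5.0))   # 0.5
    print(fraction_solved_within(REGULAR_USER_TIMES, 5.0))  # 0.3
```

A fast solve is only weak evidence for any single visitor, since plenty of humans are quick. The signal becomes useful in aggregate or in combination with other signals.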

AI Outlier Detection

DataDome also leverages AI outlier detection to identify suspicious CAPTCHA-solving traffic.

For example, if our detection engine detects a sudden increase in outdated browsers (based on the user agent) coming from unusual countries for a protected website, it may flag these CAPTCHA responses as potentially coming from CAPTCHA farms. DataDome recently did an analysis of 2 million CAPTCHAs passed over 3 months to better understand which countries tend to have the most CAPTCHA farm workers.
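One simple way to express a “sudden increase” is a z-score against recent history. The sketch below is a generic statistical stand-in, not DataDome’s model; the hourly counts and the threshold of 3 are assumptions for illustration.

```python
from statistics import mean, stdev


def is_sudden_increase(history, current: float, z_threshold: float = 3.0) -> bool:
    """Flag `current` if it sits far above the mean of `history`."""
    if len(history) < 2:
        return False  # not enough data to estimate the spread
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        return current > mu
    return (current - mu) / sigma > z_threshold


if __name__ == "__main__":
    # Hypothetical hourly counts of CAPTCHA solves from outdated browsers
    # in a country that rarely visits the protected site.
    baseline = [4, 6, 5, 7, 5, 6, 4]
    print(is_sudden_increase(baseline, 40))  # large spike
    print(is_sudden_increase(baseline, 7))   # within normal variation
```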

We looked at the IP addresses of the specific workers of CAPTCHA farms and mapped those IPs to particular countries (anything more granular than a country has a much higher chance of being inaccurate). Here’s what we found:

This was further broken down into the ISPs and telecom providers of CAPTCHA farm workers.
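The country-level aggregation described above can be sketched with Python’s standard `ipaddress` module. The prefix table here is entirely made up: it uses IP ranges reserved for documentation and placeholder country labels, whereas a real analysis would rely on a geolocation database.

```python
import ipaddress
from collections import Counter

# Hypothetical prefix-to-country table built from documentation-only ranges.
PREFIX_TO_COUNTRY = {
    ipaddress.ip_network("192.0.2.0/24"): "Country A",
    ipaddress.ip_network("198.51.100.0/24"): "Country B",
    ipaddress.ip_network("203.0.113.0/24"): "Country C",
}


def country_of(ip: str) -> str:
    """Map an IP address to a country label, or 'unknown' if no prefix matches."""
    address = ipaddress.ip_address(ip)
    for network, country in PREFIX_TO_COUNTRY.items():
        if address in network:
            return country
    return "unknown"


def count_by_country(ips) -> Counter:
    """Aggregate solver IPs at country granularity only."""
    return Counter(country_of(ip) for ip in ips)


if __name__ == "__main__":
    solver_ips = ["192.0.2.10", "192.0.2.11", "203.0.113.5", "10.0.0.1"]
    print(count_by_country(solver_ips).most_common())
```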

Using the data above, along with more advanced heuristics and statistical techniques based on different signals (like IP score and device fingerprint), DataDome’s detection engine determines whether or not it should validate each CAPTCHA attempt.

If the solved CAPTCHA is invalidated, the engine generates a pattern to automatically block similar CAPTCHA attempts in the future.
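A heavily simplified sketch of that validate-then-learn loop follows. The signal names, weights, threshold, and the choice of (fingerprint, country) as the pattern are all assumptions made for illustration; the actual heuristics are not public.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Attempt:
    ip_score: float        # assumed scale: 0.0 (clean) to 1.0 (bad reputation)
    fingerprint_hash: str
    country: str
    solve_time_s: float


class CaptchaValidator:
    def __init__(self, threshold: float = 0.7) -> None:
        self.threshold = threshold
        self.blocked_patterns = set()

    @staticmethod
    def risk_score(attempt: Attempt) -> float:
        # Arbitrary illustrative weights: IP reputation plus a fast-solve bonus.
        fast_solve = 1.0 if attempt.solve_time_s < 5.0 else 0.0
        return 0.6 * attempt.ip_score + 0.4 * fast_solve

    def validate(self, attempt: Attempt) -> bool:
        pattern = (attempt.fingerprint_hash, attempt.country)
        if pattern in self.blocked_patterns:
            return False  # matches a previously invalidated attempt
        if self.risk_score(attempt) >= self.threshold:
            self.blocked_patterns.add(pattern)  # remember for future attempts
            return False
        return True


if __name__ == "__main__":
    validator = CaptchaValidator()
    print(validator.validate(Attempt(0.9, "fp1", "Country A", 2.0)))
    print(validator.validate(Attempt(0.1, "fp1", "Country A", 20.0)))
    print(validator.validate(Attempt(0.1, "fp2", "Country B", 20.0)))
```

In this toy version the second attempt is rejected purely because it matches the pattern learned from the first, which is the behavior the paragraph above describes.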

CAPTCHAs are not an anti-bot silver bullet.

While traditional CAPTCHAs still have their uses, they won’t keep your website or your mobile app safe from malicious bots. Even CAPTCHAs that claim to be smarter, such as reCAPTCHA v2 and v3, have their downsides:

The difficulty of setting proper block/allow thresholds.
The need to monitor false positives and false negatives (feedback loop).
The risk of blocking legitimate good bots (search engine bots, SEO bots, technical partners, etc.).

As we have seen, for very small fees, motivated bot developers can easily leverage CAPTCHA farms to bypass security systems that put too much trust in CAPTCHAs or use them as a first line of defense. An effective bot protection solution, on the other hand, will block bots—including bots that rely on CAPTCHA farms—while remaining invisible to human users.

The DataDome bot protection software—now with its own integrated CAPTCHA—analyzes 100% of the requests that hit our customers’ applications. We process 1 trillion signals per day, collecting and analyzing more than 250 different events for each and every request in order to accurately distinguish between humans and bots. We protect 99.99% of your users without ever presenting them with a CAPTCHA.

New threats are identified via statistical and behavioral detection, using data from server-side fingerprints, a JS rendering engine, SDK inputs, and session tracking. We make extensive use of online machine learning, and detect a new bad bot pattern every 10 milliseconds. Our false positive rate is below 0.01%, meaning that of 10,000 CAPTCHAs served, fewer than one is seen by a human.

If you’d like to see the real-time bot activity on your own website, you can set up a free DataDome trial yourself in less than an hour (no commitment, no credit card). All you have to do is create your free account and follow the installation instructions. Then, you can access your personal dashboard for a full overview of good bot, bad bot, and human traffic to your site.

Ready? Start here.

DataDome

Christine Falokun

Product Marketing Manager

Christine D. Falokun is an accomplished Product Marketer at DataDome, with experience bringing new products to market, spanning from the initial planning and development phases to execution and successful delivery. With a passion for crafting compelling narratives and a knack for translating complex technical concepts into clear and engaging messaging, she plays a vital role in driving the success of DataDome's products.
