
ReCAPTCHA v2 vs. v3: Efficient bot protection? [2024 Update]

20 Aug, 2022

The promise of Google’s reCAPTCHA v3 is to prevent bot traffic to your website without the user friction we all associate with v2. But does reCAPTCHA v3 keep its promise?

Let’s take an honest look at what reCAPTCHA v3 can and cannot do for your website security. We’ll detail the differences between reCAPTCHA v2 vs. v3, uncover the pitfalls of reCAPTCHA v3 configuration, and sum up what a truly effective bot protection and mitigation software must deliver.

What is reCAPTCHA?

First things first: A CAPTCHA (an acronym for “Completely Automated Public Turing Test to Tell Computers and Humans Apart”) is a security measure designed to differentiate bots from humans, typically with an image or audio challenge. CAPTCHA began as a Carnegie Mellon research project, first launched in 2000 as a general-purpose technique for verifying that a user is human.

Intended for very low-level validation, CAPTCHAs have not historically been coupled with security logic. But today, CAPTCHAs are widely used on the internet to prevent bots from signing up for accounts, spamming comments, and buying products.

ReCAPTCHA is what Google calls their CAPTCHA system. It was released in 2007 and is currently used by more than 13 million live websites. Despite some controversy around its privacy compliance with GDPR and similar policies, reCAPTCHA has been the most-used CAPTCHA system to date.

Is reCAPTCHA free?

While Google promotes reCAPTCHA as a free service, it’s only free up to one million API calls a month. Company websites that generate more than one million calls a month must sign up for reCAPTCHA Enterprise.

ReCAPTCHA Enterprise costs $1 for every thousand calls up to 10 million calls (custom fees apply above 10 million calls a month). So if your website generates three million calls a month, your reCAPTCHA bill will be $3,000 every month.
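The arithmetic above can be sketched in a few lines of Python. This is a rough estimate built only from the figures quoted in this article, not Google's official price list; the function name and constants are hypothetical, and the sketch assumes every call is billed once you exceed the free tier, which is what the $3,000 example implies.

```python
from typing import Optional

FREE_TIER_CALLS = 1_000_000        # free up to this many calls per month (figure from the article)
ENTERPRISE_CAP_CALLS = 10_000_000  # above this, custom fees apply
USD_PER_THOUSAND_CALLS = 1.0       # Enterprise rate quoted in the article


def monthly_recaptcha_cost(calls: int) -> Optional[float]:
    """Return the estimated monthly bill in USD, or None when custom pricing applies."""
    if calls < 0:
        raise ValueError("calls must be non-negative")
    if calls <= FREE_TIER_CALLS:
        return 0.0
    if calls > ENTERPRISE_CAP_CALLS:
        return None  # custom fees; cannot be derived from the published rate
    # Assumption: all calls are billed, matching the article's $3,000 example.
    return calls / 1000 * USD_PER_THOUSAND_CALLS


print(monthly_recaptcha_cost(3_000_000))  # 3000.0
```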

ReCAPTCHA v2: Hard on Humans, Too Easy on Bots

Many websites are still using reCAPTCHA v2, which was launched in 2014. If a user’s behavior triggers suspicion, reCAPTCHA v2 will provide a challenge that the visitor must solve to prove they’re human.

We’re all familiar with the various versions of reCAPTCHA v2. Sometimes, all you need to do is check a box that says “I’m not a robot.” Other times, reCAPTCHA will challenge you with an image or audio recognition task. Whether or not you get the full challenge will depend on how confident Google is that you really are a human.

ReCAPTCHA v2 is based on an “advanced risk analysis system” that relies quite heavily on Google cookies. If someone is browsing the web using Chrome, or has been logged into a Google account for a while, they’ll most likely just have to tick a box. On the other hand, a Firefox user who has disabled third-party cookies is much more likely to get a difficult image recognition challenge.

Not everyone uses Chrome, and not everyone is comfortable using Google’s services, as people across the world grow increasingly concerned about their online privacy (and many find it impossible to protect). Privacy-conscious users might browse with Firefox or Brave, or through a VPN, only to face tougher challenges from reCAPTCHA v2, which degrades their user experience and lowers conversion rates.

Additionally, due to the ubiquity of reCAPTCHA v2, cybercriminals have found increasingly efficient automated solutions to bypass even the most difficult reCAPTCHA v2 challenges.

AI to Solve ReCAPTCHA v2 Challenges:

Some bots leverage recent progress in artificial intelligence to solve reCAPTCHA v2 challenges. More specifically, attackers train neural networks on image and audio recognition tasks so that their models can solve reCAPTCHAs automatically.

It’s quite ironic: Google uses reCAPTCHA to train their image and audio recognition AI models, and cybercriminals use those advances in AI to beat the reCAPTCHA. The circle of digital life!

CAPTCHA Farms:

Cybercriminals can also outsource reCAPTCHA challenges by sending them to human workers at CAPTCHA farms in low-cost countries. Thanks to CAPTCHA farms, attackers can use bots that don’t need to execute JavaScript. To pass a reCAPTCHA v2, all a bot has to do is send a callback request containing the response token provided by the CAPTCHA farm.

Without the need for JavaScript, attackers can create bots that leverage simple HTTP request libraries instead of having to use fully automated browsers like Selenium or Puppeteer with headless Chrome. This decreases their infrastructure cost, so they can crawl pages faster or stuff more credentials. The fees they pay the CAPTCHA farms are minor in comparison: it costs approximately $1-3 to solve 1,000 v2 reCAPTCHAs.

If you want to learn more about CAPTCHA vs. reCAPTCHA or CAPTCHA farms and CAPTCHA farm detection, watch this webinar recording:

Webinar: Are CAPTCHA farms outsmarting your website?

ReCAPTCHA v3: Easy on Humans, Except Website Admins

After listening to some of the user complaints, Google developed reCAPTCHA v3 to provide a better user experience. Unlike v2, reCAPTCHA v3 is invisible for website visitors. There are no challenges to solve. Instead, reCAPTCHA v3 continuously monitors each visitor’s behavior to determine whether it’s a human or a bot.

Currently, reCAPTCHA v3 is in use on just over 1.2 million live websites, versus the 10 million+ sites using v2.

How does Google reCAPTCHA v3 work?

For each request the user makes, reCAPTCHA v3 returns a score between 0 and 1 that represents the likelihood that the request originated from a human. If the score is close to 0, the request likely came from a bot; if it’s close to 1, it more likely came from a human. As in v2, users who are logged into their Google accounts and/or using Chrome are more likely to have a score close to 1.
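On the server, the score is obtained by posting the token generated in the browser to Google's siteverify endpoint. The following is a minimal sketch using only the Python standard library. The endpoint and the `success`, `score`, and `action` response fields come from Google's reCAPTCHA documentation; the helper names, the timeout, and the policy of rejecting a mismatched action are assumptions made for illustration.

```python
import json
import urllib.parse
import urllib.request
from typing import Optional

VERIFY_URL = "https://www.google.com/recaptcha/api/siteverify"


def fetch_verification(secret: str, token: str) -> dict:
    """POST the client token to the siteverify endpoint and return the JSON reply."""
    data = urllib.parse.urlencode({"secret": secret, "response": token}).encode()
    with urllib.request.urlopen(VERIFY_URL, data=data, timeout=5) as resp:
        return json.load(resp)


def extract_score(reply: dict, expected_action: str) -> Optional[float]:
    """Return the score if the reply is valid for the expected action, else None."""
    if not reply.get("success"):
        return None  # invalid, expired, or already-used token
    if reply.get("action") != expected_action:
        return None  # token was issued for a different action (assumed policy: reject)
    score = reply.get("score")
    return float(score) if isinstance(score, (int, float)) else None
```

A caller would run something like `extract_score(fetch_verification(secret, token), "login")` and then decide what to do with the resulting number, which is where the configuration burden described below begins.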

To improve the accuracy of the score, website administrators can define specific actions, such as “send a friend request” or “go to homepage,” to help reCAPTCHA understand how normal user behavior varies depending on the context.

While reCAPTCHA v3 clearly improves the experience for human users by eliminating disruptive challenges, it still raises privacy concerns and creates problems for website administrators.

With reCAPTCHA v2, administrators only need to verify whether the user correctly solved the challenge or not. With reCAPTCHA v3, administrators need to decide which action to take depending on the users’ score. Getting this configuration right is a tricky task for even the most experienced webmaster.

Mapping ReCAPTCHA v3 User Scores to Actions:

For each action a user takes on your website, you have three possible responses:

  1. Give the user access to the requested resource.
  2. Ask the user to solve a reCAPTCHA to determine if they’re human.
  3. Block the user (hard block).

This means that you need to decide for each action where you want to place the threshold for a particular response. Will you block the user when their score falls below 0.25, or will you serve them a reCAPTCHA? What about 0.15? Will you fully block them then, or does 0.10 seem more appropriate? What happens if a user fails a reCAPTCHA challenge? There are no clear-cut answers, which is what makes these questions so difficult.
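To make the decision concrete, here is a minimal sketch of a score-to-response mapping. Every name and number in it is hypothetical: the thresholds are placeholders rather than recommendations, and choosing them well is exactly the hard part.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Thresholds:
    block_below: float      # scores under this value are hard blocked
    challenge_below: float  # scores under this value (but not blocked) get a challenge


# Hypothetical per-action settings; the numbers are placeholders, not recommendations.
ACTION_THRESHOLDS = {
    "login": Thresholds(block_below=0.15, challenge_below=0.5),
    "homepage": Thresholds(block_below=0.10, challenge_below=0.25),
}
DEFAULT_THRESHOLDS = Thresholds(block_below=0.10, challenge_below=0.25)


def decide(action: str, score: float) -> str:
    """Map a reCAPTCHA v3 score to 'allow', 'challenge', or 'block'."""
    t = ACTION_THRESHOLDS.get(action, DEFAULT_THRESHOLDS)
    if score < t.block_below:
        return "block"
    if score < t.challenge_below:
        return "challenge"
    return "allow"
```

With these placeholder values, a score of 0.3 triggers a challenge on the login action but passes on the homepage, which shows how the same visitor can be treated differently depending on context.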

The stricter you make your thresholds, the more likely you are to block actual users. The contrary is also true: the looser your thresholds, the more likely you are to leave bots undetected. With reCAPTCHA v3, you need to make an unpleasant compromise between not blocking too many users and not allowing too many bots.

No Feedback Loop:

The reCAPTCHA v3 dashboard displays a distribution of user scores for each configured action on your website. But that’s not enough to help you assess whether you’ve set the right thresholds, because there’s no other information to help you better understand the users you’ve blocked or let through. There is no way to look into possible false positives or negatives.

It’s important to consider that the Internet is far more diverse than we often imagine it to be. Sure, the majority of your legitimate users might browse the internet with Chrome, Edge, or Safari, but what about the ~8% of people who don’t? Their user scores will be significantly lower. Do you really want to make their lives harder with a reCAPTCHA or by blocking them outright?

Setting blocking and authorization thresholds without a proper monitoring mechanism is like playing Russian roulette with your website’s traffic. Collecting, storing, and analyzing enough data to set your thresholds accurately requires deep bot detection knowledge and significant software development costs.
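If you did have ground-truth labels, the analysis you would want looks roughly like the sketch below. The samples are synthetic and exist only to show the calculation; reCAPTCHA v3 does not tell you which visitors were really human, so real labels would have to come from your own logs and investigation. The sketch also assumes the data contains at least one human and one bot.

```python
from typing import Iterable, Tuple

# Synthetic (score, is_human) pairs for illustration only.
SAMPLES = [
    (0.9, True), (0.7, True), (0.3, True), (0.2, True),
    (0.9, False), (0.4, False), (0.1, False), (0.1, False),
]


def error_rates(samples: Iterable[Tuple[float, bool]], block_below: float) -> Tuple[float, float]:
    """Return (false_positive_rate, false_negative_rate) for a blocking threshold.

    False positive: a human whose score falls below the threshold (blocked).
    False negative: a bot whose score is at or above the threshold (let through).
    """
    samples = list(samples)
    humans = [score for score, is_human in samples if is_human]
    bots = [score for score, is_human in samples if not is_human]
    fp = sum(score < block_below for score in humans) / len(humans)
    fn = sum(score >= block_below for score in bots) / len(bots)
    return fp, fn


print(error_rates(SAMPLES, 0.25))  # (0.25, 0.5)
print(error_rates(SAMPLES, 0.5))   # (0.5, 0.25)
```

On this toy data, raising the threshold from 0.25 to 0.5 halves the share of bots let through but doubles the share of humans blocked, which is the compromise described above.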

Detection Quality:

ReCAPTCHA v3 uses behavioral detection to predict whether a given request originates from a human or not. While behavioral detection is extremely helpful for detecting advanced bots, learning how to distinguish bots from humans accurately requires huge volumes of data.

To make an accurate decision, reCAPTCHA v3 needs data: it needs a user to interact with your website for a while. A distributed crawler that rotates through many IP addresses and sends only a few requests from each never provides that history, so with reCAPTCHA alone, your site is vulnerable to large-scale distributed crawlers.

Here at DataDome, we did a quick experiment to determine whether reCAPTCHA v3 also uses basic client-side fingerprinting signals, and it does. While v3 can easily detect “naive” bots, such as those that run unpatched Selenium or don’t remove the navigator.webdriver attribute, bots that forge their fingerprint easily bypass reCAPTCHA v3 detection.

We created a Headless Chrome bot and used the puppeteer-extra framework to forge its fingerprint, then had it screenshot its reCAPTCHA v3 score. It obtained a nearly “human” user score of 0.9: a perfect intruder.
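The gap between naive and forged bots can be illustrated with a deliberately simple check. This sketch is not how reCAPTCHA v3 works internally, which is not public. The field names and sample values are hypothetical; the two signals are the navigator.webdriver flag mentioned above and the “HeadlessChrome” marker that Headless Chrome has used in its default user agent.

```python
def looks_automated(signals: dict) -> bool:
    """Naive check on client-reported signals (hypothetical field names)."""
    # navigator.webdriver is true in browsers driven by WebDriver-based automation.
    if signals.get("webdriver") is True:
        return True
    # Headless Chrome has identified itself with "HeadlessChrome" in its default user agent.
    if "HeadlessChrome" in signals.get("user_agent", ""):
        return True
    return False


# Hypothetical signals reported by two bots.
naive_bot = {"webdriver": True, "user_agent": "Mozilla/5.0 ... HeadlessChrome/120.0"}
forged_bot = {"webdriver": False, "user_agent": "Mozilla/5.0 ... Chrome/120.0"}

print(looks_automated(naive_bot))   # True
print(looks_automated(forged_bot))  # False
```

Because the client reports both signals, a bot that overrides them looks no different from a regular browser to a check like this one.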

ReCAPTCHA & Privacy Compliance

Privacy is a fundamental human right—and that includes online. Countries around the world have begun to implement privacy-focused regulations, such as the General Data Protection Regulation (GDPR) in the European Union. Online tools like reCAPTCHA should be compliant with these regulations.

However, in July 2020, France’s Commission Nationale de l’Informatique et des Libertés (CNIL) found that Google’s reCAPTCHA was not privacy compliant. Because reCAPTCHA can gather user data that is transmitted to Google for “analysis”, the GDPR requires that end-users of platforms that use reCAPTCHA be informed that their data is being collected, and they must have an option to consent or opt out of sharing their data.

Google leaves it up to the companies that use reCAPTCHA to inform and get consent from their end-users, which often does not happen, as the CNIL discovered. And since the purposes of reCAPTCHA’s data collection are not precisely defined, it is difficult to imagine how user consent—even if collected—could meet the GDPR requirements of being “free, specific, informed, and unambiguous.”

Just as businesses need to trust that real humans are interacting with their platforms, their customers and end-users need to trust that their data won’t be compromised.

Make sure you can trust any solution/provider that can impact your customer relationships (including your bot and online fraud management and your CAPTCHA provider) to prioritize your end-users’ privacy as much as you do.

Can bots bypass reCAPTCHA?

In short, yes they can. While reCAPTCHA v2 and v3 can help limit simple bot traffic, both versions come with several problems:

  1. User experience suffers, as human users hate the image/audio recognition challenges.
  2. CAPTCHA farms and advances in AI allow cybercriminals and advanced bots to bypass reCAPTCHAs easily.
  3. Defining the right thresholds for reCAPTCHA v3 user scores is a very difficult task.
  4. There’s no way to monitor false positives and negatives.

The bottom line is that neither reCAPTCHA v2 nor v3 is a replacement for a proper bot management solution.

The ReCAPTCHA Alternative That Really Stops Bots:

DataDome offers the only secure, user-friendly, and privacy compliant CAPTCHA. DataDome’s CAPTCHA integrates with our bot and online fraud protection solution, which processes 5 trillion+ signals per day to decide with 99.99% accuracy if each request comes from a human or bot. As a result, we selectively deploy CAPTCHAs solely to the traffic segment identified as automated bots, preserving the smooth experience for genuine users. DataDome’s solution protects websites and apps in all industries from e-commerce to classified ads to mobile apps and beyond. Here’s how we address each of the aforementioned problems:

User Experience

DataDome is invisible to 99.99% of human users, meaning that for the vast majority of humans, there’s no challenge to solve. DataDome uses a wide range of techniques to distinguish bots from people—behavioral analysis, device fingerprinting, IP reputation, and more.

For customers that prefer a smoother, more user-friendly online experience, we’ve released an alternative detection layer called Device Check. This new response challenge functions much like a CAPTCHA but without any visible or interactive challenge to the end user. The authentication process occurs within the user’s device, confirming device-specific signals via proof of work, all without any visible prompts for the end user. This seamless integration is compatible with web browsers and mobile applications, prioritizing user privacy, and serving as a primary mechanism for identifying automation frameworks, counterfeit environments, and programmatic attempts to access interfaces.
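For readers unfamiliar with proof of work, here is a generic hashcash-style sketch of the idea. It is not DataDome's Device Check implementation, whose details are not described here; the function names, the SHA-256 scheme, and the difficulty value are all assumptions chosen for illustration.

```python
import hashlib
from itertools import count


def _digest(challenge: str, nonce: int) -> bytes:
    return hashlib.sha256(f"{challenge}:{nonce}".encode()).digest()


def leading_zero_bits(digest: bytes) -> int:
    """Count the leading zero bits of a digest."""
    bits = 0
    for byte in digest:
        if byte == 0:
            bits += 8
            continue
        bits += 8 - byte.bit_length()
        break
    return bits


def solve(challenge: str, difficulty: int) -> int:
    """Client side: search for a nonce whose hash has `difficulty` leading zero bits."""
    for nonce in count():
        if leading_zero_bits(_digest(challenge, nonce)) >= difficulty:
            return nonce


def verify(challenge: str, nonce: int, difficulty: int) -> bool:
    """Server side: a single hash confirms that the client did the work."""
    return leading_zero_bits(_digest(challenge, nonce)) >= difficulty


nonce = solve("example-challenge", difficulty=12)
print(verify("example-challenge", nonce, difficulty=12))  # True
```

The asymmetry is the point: the client needs thousands of hash attempts on average at this difficulty, while the server checks the answer with one, so the cost is negligible for a single visitor but adds up for a bot sending requests at scale.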

With these two challenges, we are intensifying our commitment to delivering an exceptional user experience. In fact, DataDome’s customers frequently report that the user experience of their customers has improved significantly with our CAPTCHA solution. That’s because before implementing DataDome, bots represented 40% of some customers’ traffic, which took a heavy toll on their server resources and slowed down the performance of their platforms. Activating DataDome supports fast loading speeds and a smooth user experience because we prevent bots from swamping our customers’ servers.

In the rare event that a user encounters DataDome’s CAPTCHA, it is designed for accessibility, with audio challenges in 13 different languages, well beyond reCAPTCHA’s 8.

CAPTCHA Farms & AI Detection

The right approach to AI detection—with closely monitored ML models processing various signals and scaling new alerts to adjust protection across all endpoints around the globe—can render CAPTCHA farms useless. For example, DataDome uses CAPTCHAs in two ways:

  1. To allow the 0.01% of human users that may get blocked to continue their navigation.
  2. As a feedback loop that sends signals to our threat detection engine, which processes them in combination with 5 trillion other behavioral, technical, and statistical signals to determine with 99.99% accuracy whether each request comes from a bot or human.

We do not consider a solved CAPTCHA to be indisputable proof of a user’s humanity. We’ve developed different approaches to make sure that CAPTCHAs are solved by actual people, not by CAPTCHA farms or neural networks.

Every day, we invalidate thousands of forged CAPTCHA responses.

Blocking & Allowing Thresholds

If you are looking for a reCAPTCHA alternative that works on autopilot, DataDome has you covered. Once you’ve installed our server-side module and mobile SDK, and allowlisted your partners’ bots, you don’t need to add any other detection logic or thresholds. Our advanced detection engine figures out whether your visitors are human or not by itself, so there’s no complex configuration for you to go through.

Unlike reCAPTCHA v3, DataDome also recognizes good bots, such as less popular search engine bots, content aggregators, and SEO bots. This means you don’t have to worry about forgetting something or making a mistake and accidentally degrading your SEO rankings.

Of course, if you want, DataDome lets you add custom detection logic or allowlist some of your traffic based on criteria such as IP, country, user agent, and so on. Customers who take advantage of Device Check can also customize when this invisible challenge occurs and for which traffic segment. This response can be finely tuned, from measured, selective usage to a mandatory verification step that checks the authenticity of every user’s device.
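As a generic illustration of criteria-based allowlisting, the sketch below matches a request against rules built from an IP range, a country, and a user agent substring. It is not DataDome's configuration syntax or API; every name in it is hypothetical, and the IP range is a reserved documentation range.

```python
import ipaddress
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AllowRule:
    network: Optional[str] = None              # CIDR range, e.g. a partner's egress IPs
    country: Optional[str] = None              # ISO 3166-1 alpha-2 code
    user_agent_contains: Optional[str] = None  # substring expected in the user agent

    def matches(self, ip: str, country: str, user_agent: str) -> bool:
        """True only if every criterion set on the rule matches the request."""
        if self.network is not None:
            if ipaddress.ip_address(ip) not in ipaddress.ip_network(self.network):
                return False
        if self.country is not None and country != self.country:
            return False
        if self.user_agent_contains is not None and self.user_agent_contains not in user_agent:
            return False
        return True


# Hypothetical rule: a partner crawler identified by both IP range and user agent.
RULES = [AllowRule(network="203.0.113.0/24", user_agent_contains="PartnerBot")]


def is_allowlisted(ip: str, country: str, user_agent: str) -> bool:
    return any(rule.matches(ip, country, user_agent) for rule in RULES)
```

Combining criteria in a single rule matters because a user agent on its own is trivial to spoof; pairing it with an IP range makes the rule much harder to abuse. Note that a rule with no criteria set would match every request.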

Feedback Loops

Thanks to our advanced detection engine, DataDome (unlike reCAPTCHA) has an extremely low false-positive rate: 0.01%. For every 10,000 CAPTCHAs served, fewer than one is seen by a human. In the rare instance that a human sees a CAPTCHA, our real-time feedback loop propagates the information to our detection engine in less than 2 milliseconds to ensure we don’t hard block humans.

To deal with false negatives (letting bots through), DataDome’s detection engine constantly learns new bot patterns using AI and leverages bad traffic detected on one platform to protect all others. But we also keep humans in the loop: our bot SOC team of data analysts conducts frequent traffic reviews to ensure we don’t miss any bots.

DataDome also comes with an intuitive dashboard that enables you to monitor your main traffic metrics, such as the volume and nature of bad bot requests, the number of CAPTCHAs served, etc. If you want to explore your traffic in more detail, you can use a real query language to explore a wide range of dimensions, such as IP address, country, type of bots blocked, and more.

Detecting Advanced Bots

Every day, DataDome encounters new advanced bots. Our detection engine uses behavioral AI detection, advanced fingerprinting, IP reputation, and more to make sure we detect even the most cunning bots. Contrary to other bot management solutions that analyze requests in batches, DataDome analyzes each request individually in less than 2 milliseconds to determine if it originated from a bot or a person.

A bot that received a 0.9 user score with reCAPTCHA v3 would be caught immediately with DataDome. Our fingerprinting module can detect advanced bots that use residential IP proxies, forged fingerprints, real browsers, and headless browsers automated with modified Puppeteer. The same goes for preventing advanced Playwright bots or modified Selenium bots, even if they modify the ChromeDriver binary. We stop them at the first request.

In Conclusion

While reCAPTCHA v2 and v3 can help block some bot traffic, they cannot stop advanced scalper bots, scraper bots, DDoS attacks, account takeover (ATO) attacks, etc. Neither version of reCAPTCHA should therefore be considered a proper bot management solution, because reCAPTCHA v2 and v3 both:

  • Degrade the user experience.
  • Can lead to high false positives and false negatives.
  • Fail to be privacy compliant with GDPR, the foundational global privacy standard.
  • Leverage your users’ data for their organization’s advertising purposes.
  • Are easily bypassed with CAPTCHA farms and advanced bots.
  • Provide no real feedback mechanisms (pass/fail is not enough information to refine your security).

Wondering how to protect your endpoints without using reCAPTCHA? DataDome’s new user-friendly, privacy-focused CAPTCHA solution is easy on humans, hard on bots, and the only security-geared CAPTCHA system on the market today. See how it works for yourself!

Frequently Asked Questions

What is the difference between reCAPTCHA v2 and v3?

ReCAPTCHA v2 requires the user to click the “I’m not a robot” checkbox and can serve the user an image recognition challenge. ReCAPTCHA v3 runs in the background and generates a score based on a user’s behavior. The higher the score, the more likely the user is human. A webmaster has to decide (and program) whether to block, challenge, or do nothing when a user’s score drops below a certain threshold.

Is reCAPTCHA v3 better than v2?

Neither of them is good at blocking bots. While reCAPTCHA v3 is less intrusive than v2 for a user, it places a significant burden on the webmaster to determine when to let users through and when to block or challenge them. There’s no right answer to this.

Does reCAPTCHA stop bots?

ReCAPTCHA might block the simplest of bots, but it makes for a frustrating user experience and it does not serve as sufficient protection against the security threats that plague companies today. For that, you need a bot protection solution.

Can reCAPTCHA be hacked?

A reCAPTCHA isn’t so much hacked as it is gamed. CAPTCHA farms and advanced bots can easily bypass both reCAPTCHA v2 and v3, because the former use humans to solve CAPTCHAs and the latter are crafty enough to seem so human that reCAPTCHA never suspects a thing.

What can I use instead of reCAPTCHA?

An advanced bot protection solution is one of the few ways you can fully protect yourself against today’s security threats. The right solution will block the most advanced bots and protect your websites, mobile apps, and APIs against threats such as account takeovers, credential stuffing, web scraping, and more.

Is hCaptcha better than reCAPTCHA?

HCaptcha suffers from some of the exact same problems as reCAPTCHA. It still degrades the user experience and does not adequately protect you against CAPTCHA farms and advanced bots. Any CAPTCHA that operates in a silo and is used as a first line of defense will result in inadequate bot protection and a negative user experience.

Like reCAPTCHA, hCaptcha is not very accessible to people with disabilities, such as visually impaired users. Especially on the free tier, hCaptcha requires each user to manually solve a puzzle. DataDome CAPTCHA, on the other hand, includes an audio CAPTCHA in 13 languages and has been approved by associations that advocate for the visually impaired.