What is a CAPTCHA & how does it work? Types & Examples

Paige Tester, Sr. Content Marketing Manager
15 Oct, 2022

You’ve almost certainly encountered a CAPTCHA during your time online. But why do you have to deal with them so often? 

Cybercriminals program bots to roam the internet, looking for ways to manipulate your pages, access your databases, and steal your data. In fact, bots make up more than 40% of online traffic.

Any website can become a target of brute force attacks, digital ad fraud, transaction fraud, and personal data harvesting via malicious bots. CAPTCHAs were designed to shield and protect websites from malicious bots.

What is a CAPTCHA?

The acronym CAPTCHA stands for “Completely Automated Public Turing Test to Tell Computers and Humans Apart”. It’s a challenge-response test websites use to quickly differentiate real human users from bots.

Websites use CAPTCHA tests to determine whether an actual user or a bot is attempting to access a web page. The original CAPTCHA tests, which first appeared in the late 90s, were made up of distorted images containing a combination of random letters and numbers.

How do CAPTCHAs work?

When a CAPTCHA is triggered, a pop-up window may appear when users attempt to access specific pages or input information, prompting the user to complete a CAPTCHA test. Original text CAPTCHAs would twist and bend letters and numbers out of shape, changing proportions and making it hard for bots to figure out what was on the screen.

Color gradients and other background noise make things tough for computers and spambots. Because the code is rendered as an image rather than as selectable text, it can’t simply be copied, so basic bots fail the test. Later versions of CAPTCHA, like Google’s reCAPTCHA, use images and ask users to identify which pictures contain a certain object. Some versions of reCAPTCHA are also “invisible”, but they are not entirely effective at blocking bots. In fact, many bot developers have now outpaced all types of traditional CAPTCHAs, including reCAPTCHA.

In essence, if a CAPTCHA challenge is not triggered or is successfully passed, the user is assumed to be human and is allowed to access website resources as normal. If they fail, the user is assumed to be a bot. The majority of CAPTCHA systems have no way to automatically find false positives and negatives.
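The pass/fail flow described above can be sketched in a few lines of Python. This is only an illustration, not how any particular CAPTCHA product is built: the function names, the in-memory store, and the six-character code are all assumptions, and the step that would render the code as a distorted image is left out.

```python
import secrets
import string

# Hypothetical in-memory store: challenge token -> expected answer.
_pending = {}


def issue_challenge(length=6):
    """Create a random code plus a one-time token that identifies it."""
    alphabet = string.ascii_uppercase + string.digits
    answer = "".join(secrets.choice(alphabet) for _ in range(length))
    token = secrets.token_urlsafe(16)
    _pending[token] = answer
    # A real system would show the answer only as a distorted image.
    # It is returned here so the sketch stays runnable.
    return token, answer


def verify_response(token, response):
    """Return True if the response matches. Each token works only once."""
    expected = _pending.pop(token, None)
    if expected is None:
        return False  # unknown or already used token
    submitted = response.strip().upper()
    # Constant-time comparison on bytes.
    return secrets.compare_digest(expected.encode(), submitted.encode())
```

Note that the sketch returns only a yes or no. Like most traditional CAPTCHAs, it has no way of knowing whether a "pass" came from a human, a solver bot, or a CAPTCHA farm.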

What are the cons of CAPTCHAs?

Using a traditional, siloed CAPTCHA is not enough to keep bots away from your mobile app, website, or APIs. To be effective, Google’s reCAPTCHA system relies too heavily on cookies and on people using Google accounts and web browsers, and there are also questions around reCAPTCHA’s privacy compliance. Add to that the fact that bots can easily solve CAPTCHAs or use CAPTCHA farms to pass challenges, and you will find plenty of bots are able to reach your website.

Note: DataDome offers the first user-friendly, privacy-compliant, and secure CAPTCHA, but even DataDome’s CAPTCHA is not expected to block bots in a silo. Our CAPTCHA is integrated with complete, ML-powered bot and online fraud protection that analyzes 5 trillion signals per day and processes each request anew based on all accumulated signals. (Good luck getting past that, bots.)

What are CAPTCHAs used for?

CAPTCHAs aim to prevent bots posing as humans from accessing resources meant for human users. There are many reasons we don’t want bots to access certain web pages. Bad bots can:

  • Create fake accounts and waste precious resources. Malicious hackers use the fake accounts to increase traffic, overload servers, and even deny real customers your services. They can also spam other users or initiate phishing campaigns.
  • Take over sites by spamming comments and contact forms. If left unchecked, bots can fill websites with comments and messages containing dangerous links. Users who click on the links become vulnerable to potential scams.
  • Allow scalpers to purchase large quantities of in-demand tickets and other products. Products are then resold at a higher price, frustrating real customers.
  • Skew online polls by voting uncontrollably. They can also skew product ratings on various sites, like Amazon, to make items appear better or worse.

At first, traditional CAPTCHAs were quite effective at stopping bots from performing malicious tasks on the internet. Bots were simpler back then, and could not read distorted letters and numbers to solve the challenges. However, as bots have become more sophisticated, they’ve learned how to pass many different types of CAPTCHA challenges.

Types of CAPTCHA & How Different Ones Work

Traditional CAPTCHAs come in many shapes and forms:

Text CAPTCHA

This was the most common type of CAPTCHA found on websites for many years, in which the user needs to type in the displayed word (or words) to pass the test. The “word” usually consists of disjointed, blurred, elongated, or otherwise distorted text. To make things slightly more challenging, the displayed text is often obscured by a blurry/distorted background.

As a verification method, text CAPTCHA has received a lot of criticism. Sometimes the tests are too difficult to read, and they lack accessibility, especially for people with visual impairments.

Image CAPTCHA

With image CAPTCHAs, users are given multiple images and are told to pick the ones that contain a specified object. For years, this form of CAPTCHA was very effective: image recognition is easy for humans (arguably easier than text recognition), while bots and computers had a hard time with image pattern recognition, at least until the last few years.

Google, for example, combines its massive Street View image library with artificial intelligence to generate quick CAPTCHA images on the spot. (That’s why you’re always clicking on street signs, lamp posts, and fire hydrants!) These CAPTCHA challenges are used to train Google’s image recognition machine learning models.

Audio CAPTCHA

Accessibility is key, meaning as many people as possible need to be able to solve the challenge. As an alternative testing method, CAPTCHAs should allow users to click on a small speaker button for an audio CAPTCHA. With audio-assisted text CAPTCHA, the generated voice either spells out the letters/numbers or mentions words that begin with the specified letters.

If a user clicks the headphones button on a visual CAPTCHA, they’ll have to solve an audio challenge instead. The audio file includes several numbers that must be entered correctly to complete the challenge.

Alternative CAPTCHAs

Some websites prefer to replace traditional CAPTCHAs with alternatives. Other types of CAPTCHAs include:

  • Math Solution: Users have to solve a basic math problem (e.g. 3+2) to continue.
  • Word Problem: A word problem might have users rearrange letters, input the color of the text, or state the last word from a sentence.
  • Social Media Sign-in: Users can sign in simply by using their Google or Facebook accounts.
  • Time-Based: Users that exhibit bot-like behavior (completing forms within a fraction of a second) are automatically blocked.
  • No CAPTCHA reCAPTCHA: All users have to do is click the “I’m not a robot” checkbox. By tracking mouse movement, among other things, Google predicts whether the user is human.
  • reCAPTCHA v3: The newest reCAPTCHA version works behind the scenes to identify bots and trigger actions without user interaction.
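Two of the alternatives above, the math solution and the time-based check, are simple enough to sketch. The Python below is a rough illustration only: the function names and the two-second minimum fill time are assumptions chosen for the example, not values taken from any product.

```python
import random

# Assumed threshold: forms completed faster than this look automated.
MIN_FILL_SECONDS = 2.0


def make_math_challenge(rng=random):
    """Return a question such as "3+2" together with its expected answer."""
    a, b = rng.randint(1, 9), rng.randint(1, 9)
    return f"{a}+{b}", a + b


def check_math_answer(expected, submitted):
    """Accept the submission only if it parses to the expected integer."""
    try:
        return int(submitted.strip()) == expected
    except ValueError:
        return False


def submitted_too_fast(rendered_at, submitted_at, min_seconds=MIN_FILL_SECONDS):
    """Flag forms completed within a fraction of the expected time."""
    return (submitted_at - rendered_at) < min_seconds
```

Both checks are trivial for a purpose-built bot to defeat (parse the sum, wait a few seconds before submitting), which is why they only filter out the simplest automation.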

What triggers a CAPTCHA test?

Ideally, suspicious behavior triggers a CAPTCHA test. Some common triggers include:

  • IP Tracking: A user’s IP address has been flagged for bot activity.
  • Resource Loading: A user doesn’t load styles, banners, or images.
  • Sign in: The user isn’t signed in to Google/Gmail when accessing the site.
  • Bot-Like Behavior: Weird clicking patterns, little mouse movement, and perfectly-centered checkbox clicking can all trigger a CAPTCHA test.
  • No Browsing History: Real humans do more than try to log in to the same page over and over again.
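As a rough sketch, the triggers listed above could be combined into a simple rule check like the one below. Everything here is hypothetical: the field names, the cut-offs (fewer than 3 mouse events, a single page visited), and the rule that two or more triggers are needed before a challenge is shown are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class RequestSignals:
    """Hypothetical per-visitor signals, one per trigger in the list above."""
    ip_flagged: bool            # IP tracking
    loaded_static_assets: bool  # resource loading (styles, banners, images)
    signed_in: bool             # sign in
    mouse_events: int           # bot-like behavior
    pages_visited: int          # browsing history


def triggered_rules(s):
    """Return the names of every trigger that fires for these signals."""
    rules = []
    if s.ip_flagged:
        rules.append("ip_tracking")
    if not s.loaded_static_assets:
        rules.append("resource_loading")
    if not s.signed_in:
        rules.append("sign_in")
    if s.mouse_events < 3:       # assumed cut-off for "little mouse movement"
        rules.append("bot_like_behavior")
    if s.pages_visited <= 1:     # assumed cut-off for "no browsing history"
        rules.append("no_browsing_history")
    return rules


def should_challenge(s, min_rules=2):
    """Show a CAPTCHA once enough independent triggers have fired."""
    return len(triggered_rules(s)) >= min_rules
```

Requiring more than one trigger is a design choice for the example: it avoids challenging every visitor who simply isn’t signed in, at the cost of letting through bots that trip only a single rule.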

How do CAPTCHAs prevent bots?

CAPTCHAs do not entirely prevent bots. Spammers and cybercriminals create computer programs that use artificial intelligence to solve traditional CAPTCHA challenges, even the ones that have evolved to become more complex over time.

CAPTCHAs are intended to be one detection signal processed among many others to block malicious bots. They’re not foolproof, especially not on their own, and they should never be your first line of defense.

DataDome’s Reimagined CAPTCHA Integrates With Complete Bot Protection

Not only are traditional CAPTCHAs (like reCAPTCHA) unable to block advanced bots, but they are known to kill conversions and drive users away from your website due to a poor user experience.

A better alternative is an effective bot and online fraud protection solution with unparalleled accuracy, zero compromise, and its own, integrated CAPTCHA. Using machine learning, DataDome’s real-time solution can identify even the newest and most sophisticated bots in milliseconds. 

Your user experience will also be protected, since only 1 in 10,000 CAPTCHAs might be seen by a customer. (In other words, we have an industry-leading false positive rate of 0.01%.) And in the rare case a real user does see our CAPTCHA, it is user friendly, accessible, and privacy compliant.

See for yourself with a demo.

FAQs

1. Do CAPTCHAs actually work?

Yes and no. While CAPTCHAs alone can help stop very simple bots, they no longer achieve their original objective: stopping all bots without creating a negative user experience for humans. Traditional CAPTCHAs are siloed, so they operate without considering any signal other than whether the challenge was passed or failed, yet other signals are required for rooting out today’s sophisticated bots. CAPTCHAs cannot stop bots on their own, and work best when paired with powerful bot detection.

DataDome built the first privacy compliant and user friendly CAPTCHA that exists in perfect synchronicity with unbeatably accurate, real-time bot and online fraud detection. DataDome’s CAPTCHA is only used when the detection solution is uncertain if a user is a human or a bot (based on trillions of aggregate detection signals) and needs further verification.

2. How does reCAPTCHA work?

ReCAPTCHA, acquired by Google in 2009, is a particular brand of CAPTCHA test. The first version of reCAPTCHA had distorted text and challenged users to decipher and type the text in a field. 

Version 2 of reCAPTCHA is still in use and has a few different sub-versions: No CAPTCHA reCAPTCHA (the user clicks the “I’m not a robot” checkbox) and invisible reCAPTCHA (no checkbox is shown; the check is bound to an existing button on the website). Version 3 has no checkbox, instead monitoring on-page user behavior to give each user a score between 0.0 and 1.0: the closer to 0, the more likely the user is a bot.
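Because a score-based system like version 3 returns a number rather than a pass or fail, the site owner has to decide what to do with it. A minimal sketch of that decision might look like this. The function name, the two thresholds, and the three actions are assumptions for illustration and are not part of reCAPTCHA itself.

```python
def action_for_score(score, block_below=0.3, challenge_below=0.7):
    """Map a bot-likelihood score (0.0 = likely bot, 1.0 = likely human)
    to an action. Both thresholds are hypothetical and would need tuning."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be between 0.0 and 1.0")
    if score < block_below:
        return "block"
    if score < challenge_below:
        return "challenge"  # e.g. fall back to a visible test
    return "allow"
```

A bot that manages to earn a high, “human” score is simply allowed through by logic like this, with no challenge at all.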

3. Can CAPTCHAs be bypassed?

Yes, traditional CAPTCHAs can be and often are easily bypassed by bots. Bots have become increasingly able to fake human-like behavior and fingerprints. With reCAPTCHA, they can even achieve a “human” score for version 3, and will not be stopped or challenged. 

Today, many bots that face a CAPTCHA challenge can simply have a human solve it for them using CAPTCHA farms. Additionally, progress in machine learning has enabled some bots to solve CAPTCHAs themselves through ML image or audio recognition.

4. How does a CAPTCHA prevent spam?

CAPTCHAs aim to prevent spam much the same way other stopgaps like honeypots, rate limiting, and WAFs do. Simple bots are generally caught by the filters and cannot solve the challenges, sometimes slowing the bots down enough that the spammer moves to a different target. However, most bots in use today are much too sophisticated to be stopped by any siloed CAPTCHA.
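For comparison, here is a rough Python sketch of two of the other stopgaps mentioned above: a honeypot field and a sliding-window rate limiter. The class, the function, the hidden field name "website", and the limits are all hypothetical choices for the example.

```python
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most max_requests per client within window_seconds."""

    def __init__(self, max_requests, window_seconds):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._hits = defaultdict(deque)  # client id -> request timestamps

    def allow(self, client_id, now):
        hits = self._hits[client_id]
        # Drop timestamps that have aged out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True


def honeypot_tripped(form, field="website"):
    """A hidden field that humans never see should arrive empty.
    Simple bots that fill in every field give themselves away."""
    return bool(form.get(field, "").strip())
```

Like a siloed CAPTCHA, each of these checks looks at a single signal in isolation, so a bot that rotates IP addresses or skips hidden fields passes straight through.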