How CAPTCHA Works: The Technology Behind Bot Protection

Hardly any web user has never encountered a CAPTCHA; most people run into them regularly. Why are they so common?

Industry studies have estimated that about 40% of web traffic comes from bots, many of them launched by cybercriminals. These bots try to gain unauthorized access to websites in order to steal money or databases.

No website is immune to such attacks. Malicious bots guess passwords, redirect transactions, and steal personal and other valuable information. CAPTCHA is a widely used means of protection against these bots.

What is a CAPTCHA?

CAPTCHA is an acronym for Completely Automated Public Turing test to tell Computers and Humans Apart. Technically, it is a quick challenge-response test that allows websites to distinguish humans from bots.

The underlying assumption is that a bot cannot pass a CAPTCHA, which lets a site separate live users from automated software. The first CAPTCHA tests appeared in the late 1990s. They were distorted images of random numbers and letters.

How do CAPTCHAs work?

Typically, when a CAPTCHA is enabled, a pop-up appears asking the visitor to pass a test. This happens when an unauthorized user tries to access site data or view information. Early text-based CAPTCHAs showed deliberately distorted numbers and letters, which are sometimes difficult even for a human to identify.

Images are further complicated with color combinations and other distorting elements. This poses a problem for bots, especially because the text is part of an image and cannot simply be copied. Later, an updated version of the test appeared: reCAPTCHA from Google, which asks users to find specified objects in a set of images. There are also “invisible” versions of reCAPTCHA, but they have proved not very effective. That said, many bot developers have by now managed to defeat all traditional versions of CAPTCHA and reCAPTCHA.

What are the cons of CAPTCHAs?

Today, traditional CAPTCHAs do not provide full protection against bots, which are equally likely to attack a website, a mobile app, or an API. The effectiveness of reCAPTCHA depends largely on privacy settings, cookies, and whether the user is signed in to a Google account. Moreover, modern bots can solve CAPTCHAs correctly themselves or use CAPTCHA farms to pass the test seamlessly. As a result, CAPTCHA is no longer as effective as it once was, and many bots are able to infiltrate a site.

What are CAPTCHAs used for?

The purpose of a CAPTCHA is to let live users into a site while stopping infiltration attempts by bots. There are many reasons to keep bots out:

Creating fake accounts and wasting resources. This opens up many opportunities for abuse, from overloading traffic and servers to sending spam, launching phishing campaigns, or denying service to real users.
Posting spam and links to external resources in comments, which can lead to site takeover and partial redirection of visitors. Users read comments from bots, click on the links in them, and often become victims of scammers.
Buying large batches of tickets and reselling them at an inflated price. This applies to tickets for concerts, lectures, and other events, as well as airline tickets and the like. As a result, real customers cannot buy tickets at your price and are forced to overpay to fraudsters.
Distorting online surveys and ratings through uncontrolled voting. This makes it possible, for example, to inflate the rating of a product on Amazon in order to boost its sales.

When CAPTCHA tests first appeared, bots could not cope with the task, and the protection was quite effective. However, modern bots are much more advanced and can handle more complex tasks. Today, they can pass many types of tests and infiltrate websites under the guise of live users.

Types of CAPTCHA & How Different Ones Work

There are several traditional varieties of CAPTCHAs:

Text CAPTCHA

This used to be one of the most common types of CAPTCHA and could be seen on almost any website. The test asked the user to retype a word (sometimes a meaningless string of letters) displayed with heavy distortion. To make the task even more difficult, the text was partially masked by a blurred background.

Such tests had serious flaws and were often criticized. The main problem was that the distorted text was hard to read, so users sometimes could not reproduce the word at all. In addition, the test is inaccessible to people with poor eyesight.

Image CAPTCHA

In this variant, the user is asked to look at several pictures and point out certain objects in them. This approach proved more effective because bots of the time could not reliably recognize and analyze images. At the same time, the task is simpler for humans, because the pictures are not distorted and the specified objects are clearly visible.

Implementations vary. Google draws on its library of images from Street View cameras, with artificial intelligence used to supply images of the right type. That is why the prompts tend to be of the same kind: stairs, crosswalks, fire hydrants, and so on. The answers users give to these tasks are in turn used to train the artificial intelligence.

Audio CAPTCHA

Audio CAPTCHA is one of the most accessible variants of the test, allowing most users to cope with the task. To start it, the user presses a button (shown as a speaker or headphones icon) and listens to a recording. As a rule, the recording contains a sequence of several digits, which must be typed into the answer box in the order they were heard. In some variants, a synthesized voice instead reads out words that begin with the required letters.

Alternative CAPTCHAs

Many sites today are dropping traditional CAPTCHA types and moving to newer, alternative variants of the test. There are already quite a few of them:

A simple math problem. The user enters its solution as the answer (for example, for 2 + 3 = ? the answer is 5).
A text task in which the user has to type a particular word from a sentence (usually the last one), rearrange the letters in a word, or name the background color.
No CAPTCHA reCAPTCHA. The user is only required to check the “I'm not a robot” box.
reCAPTCHA v3. This is a newer version of the test that runs in the background. It detects bots without user interaction.
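
The simple math problem from the list above can be sketched as a small challenge-response exchange. This is a minimal illustration, not a production design: the function names, the in-memory dictionary, and the two-minute lifetime are all assumptions made for the example.

```python
import hmac
import secrets
import time

# Pending challenges: challenge_id -> (expected_answer, expiry_time).
# An in-memory dict is an assumption for this sketch; a real site would
# need a store shared between its servers.
_pending = {}

CHALLENGE_TTL_SECONDS = 120  # hypothetical lifetime of one challenge


def issue_challenge(now=None):
    """Create an addition problem and remember its answer on the server."""
    now = time.time() if now is None else now
    a = secrets.randbelow(9) + 1  # operand in the range 1..9
    b = secrets.randbelow(9) + 1
    challenge_id = secrets.token_urlsafe(16)
    _pending[challenge_id] = (str(a + b), now + CHALLENGE_TTL_SECONDS)
    return challenge_id, f"{a} + {b} = ?"


def verify_answer(challenge_id, answer, now=None):
    """Check an answer once; the challenge is consumed either way."""
    now = time.time() if now is None else now
    entry = _pending.pop(challenge_id, None)
    if entry is None:
        return False  # unknown or already used challenge
    expected, expires_at = entry
    if now > expires_at:
        return False  # answered too late
    # Constant-time comparison of the submitted and expected answers.
    return hmac.compare_digest(answer.strip().encode(), expected.encode())
```

Each challenge is removed on the first attempt, so a bot cannot keep guessing against the same problem. A test this simple is trivial for any script that parses the question, which is why it only screens out the least sophisticated bots.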

What triggers a CAPTCHA test?

As its developers intended, a CAPTCHA test should be triggered when an unauthorized user shows suspicious behavior while trying to enter the site or perform other undesirable actions. The following can be reasons to run the test:

IP address check. The address itself can be flagged as belonging to a bot.
No active login. This occurs when a user is not signed in to their Google or Gmail account.
Unusual visit history. A client with no history of logins to the site, or one that requests the same page far more often than a live user would, raises suspicion and gives a reason to run the test.
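
The triggers listed above can be combined into a single decision, sketched below. The field names and the threshold of ten requests per minute are hypothetical values chosen for illustration; real services weigh many more signals and do not publish their exact rules.

```python
from dataclasses import dataclass


@dataclass
class RequestSignals:
    ip_flagged: bool            # address is on a (hypothetical) bot blocklist
    logged_in: bool             # client has an authenticated session
    prior_visits: int           # earlier visits recorded for this client
    page_hits_last_minute: int  # requests for the same page in the last minute


MAX_PAGE_HITS_PER_MINUTE = 10  # assumed threshold, not taken from any product


def should_challenge(signals: RequestSignals) -> bool:
    """Return True when any of the listed triggers applies."""
    if signals.ip_flagged:
        return True
    if signals.page_hits_last_minute > MAX_PAGE_HITS_PER_MINUTE:
        return True
    # An anonymous client with no history gives the site nothing to trust.
    if not signals.logged_in and signals.prior_visits == 0:
        return True
    return False
```

Challenging every anonymous first-time visitor is a deliberately strict choice in this sketch; a site that cares about conversion would likely soften that rule.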

How do CAPTCHAs prevent bots?

CAPTCHA is not a perfect protection against bots. Attackers are building ever more capable bots with the help of artificial intelligence, and the more complex the CAPTCHA test becomes, the more attackers turn to self-learning programs to solve it. Traditional versions of the test are no longer effective today.

Today, CAPTCHA is considered the first line of defense against bots. It is not a panacea, and it can no longer eliminate every risk on its own. It is only one of several ways to detect and cut off bots, and it needs additional defenses alongside it. Rely on CAPTCHA only with its limitations in mind, because it is far from ideal.

BotBye’s Reimagined CAPTCHA Integrates With Complete Bot Protection

Traditional CAPTCHAs and reCAPTCHAs are designed to detect bots and cut them off from sites, and they do so to the best of their ability. However, they also have negative side effects. For example, they reduce a site's conversion rate (the ratio of users who perform a desired action on the site to the total number of visitors). Fewer visitors complete those actions, which is unacceptable.

FAQs

Do CAPTCHAs actually work?

There is no simple answer to this question: CAPTCHA does work as a test, but it has many disadvantages. It stops simple bots without difficulty but cannot cope with more sophisticated ones. Besides, traditional versions of CAPTCHA are inconvenient for users and leave a negative impression. Their underlying logic is a pass/fail test, which is too simple for modern bots. CAPTCHA cannot stop them on its own and needs additional means of detecting and blocking bots.

How does reCAPTCHA work?

reCAPTCHA is a newer variation of CAPTCHA that Google acquired in 2009. At first, it didn't differ much from the traditional version. The user had to read a distorted word correctly and enter it into a box.

Later came version 2 of reCAPTCHA, which required the user only to check the “I'm not a robot” box. Then “invisible” reCAPTCHA appeared, in which the check was attached to another element of the interface, such as an existing button, instead of a separate checkbox.

Version 3 of reCAPTCHA uses a different algorithm: it analyzes the user's actions on the site and checks how closely their behavior matches the typical behavior of bots.
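
On the server side, a site using reCAPTCHA v3 receives the result of this analysis as a score between 0.0 and 1.0 and decides for itself what to do with it. The sketch below shows one way to handle that. The endpoint and the `success`, `score`, and `action` fields follow Google's documented siteverify response, while the function names and the choice to apply a single 0.5 threshold are assumptions for the example.

```python
import json
import urllib.parse
import urllib.request

SITEVERIFY_URL = "https://www.google.com/recaptcha/api/siteverify"


def fetch_verification(secret_key: str, token: str) -> dict:
    """POST the client's token to siteverify and return the parsed JSON reply."""
    data = urllib.parse.urlencode(
        {"secret": secret_key, "response": token}
    ).encode()
    with urllib.request.urlopen(SITEVERIFY_URL, data=data, timeout=5) as resp:
        return json.load(resp)


def is_probably_human(result: dict, expected_action: str,
                      min_score: float = 0.5) -> bool:
    """Accept the request only if the token is valid, was issued for the
    expected action, and scored at or above the threshold."""
    if not result.get("success"):
        return False
    if result.get("action") != expected_action:
        return False
    return float(result.get("score", 0.0)) >= min_score
```

For example, `is_probably_human({"success": True, "action": "login", "score": 0.9}, "login")` accepts the request, while the same reply with a score of 0.3 is rejected. In practice a low score might lead to a stricter check rather than an outright block.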

Can CAPTCHAs be bypassed?

For modern bots armed with artificial intelligence, bypassing traditional versions of CAPTCHA is not a problem. They can cope with reCAPTCHA quite easily, not only imitating human behavior but even generating browser fingerprints.

In addition, modern bots can hand test tasks off to live people through CAPTCHA farms. The most advanced bots can already analyze images and listen to audio on their own, which greatly expands their capabilities.

How does a CAPTCHA prevent spam?

CAPTCHA's ability to detect and block spam is about the same as that of other tools of this type (honeypots, WAFs, or rate limiting). Against simple bots these are quite effective solutions: the bots get caught by the filters, or the spammer moves on to other targets because the bots have become too slow. With modern bots the situation is much more complicated: CAPTCHA cannot cope with them on its own and needs the help of additional tools.
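
Two of the companion tools mentioned above, a honeypot and a rate limit, can be sketched in a few lines. The trap field name `website`, the token-bucket design, and its parameters are assumptions made for this illustration and are not taken from any particular product.

```python
import time
from typing import Optional


def honeypot_triggered(form: dict, trap_field: str = "website") -> bool:
    """The trap field is hidden from humans, so any value in it suggests
    that an automated script filled in the form."""
    return bool(str(form.get(trap_field) or "").strip())


class TokenBucket:
    """Rate limiter: each request spends one token, and tokens refill
    steadily over time up to a fixed capacity."""

    def __init__(self, capacity: int, refill_per_second: float,
                 now: Optional[float] = None):
        self.capacity = capacity
        self.refill_per_second = refill_per_second
        self.tokens = float(capacity)
        self.updated = time.monotonic() if now is None else now

    def allow(self, now: Optional[float] = None) -> bool:
        now = time.monotonic() if now is None else now
        elapsed = max(0.0, now - self.updated)
        # Add the tokens earned since the last call, capped at capacity.
        self.tokens = min(self.capacity,
                          self.tokens + elapsed * self.refill_per_second)
        self.updated = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

A site would typically keep one bucket per client, for example per IP address. A simple bot that fills every form field or posts in rapid bursts is caught by one check or the other, while a bot that mimics human pacing passes both, which is why these tools are combined with others.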
