Bot Management in the Time of COVID-19: The Challenges of Bot Detection in Extreme Situations

Bot detection is an important but complex undertaking at the best of times. Keeping up with fast-changing bot technologies and attack strategies requires deep knowledge and continuous threat research, and has become a task for true specialists.

So when black swan events like the COVID-19 pandemic throw everything off track, the challenge becomes monumental. Almost overnight, consumers and cybercriminals alike are changing their online behavior in radical ways. This has profound implications for the accurate detection of automated traffic, and thereby for the security of online businesses everywhere.

Let’s take a closer look at how this extraordinary situation impacts bot detection technologies of various kinds, and what it takes to continue to deliver reliable bot protection in the time of COVID-19.

The limits of rules-based approaches are multiplied.

We’ve written before about the limits of rules-based security solutions for bot detection purposes. WAFs, for example, apply a set of predefined rules to filter out suspicious traffic with familiar attack signatures, but many bots don’t carry attack signatures. On the contrary, they are often designed to imitate human user behavior as closely as possible.

Furthermore, bot operators can easily distribute their bots over hundreds of thousands of different IPs, including residential IPs with excellent reputations, rendering IP-centric detection systems obsolete.
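To make the weakness of IP-centric detection concrete, here is a minimal sketch of a per-IP request limit. The limit value, the function name and the IP addresses are assumptions made up for illustration, not taken from any real product.

```python
from collections import Counter

# Hypothetical rule: block any IP that sends more than this many requests per window.
MAX_REQUESTS_PER_IP = 100

def blocked_ips(request_ips, limit=MAX_REQUESTS_PER_IP):
    """Return the set of IPs whose request count exceeds the limit."""
    counts = Counter(request_ips)
    return {ip for ip, n in counts.items() if n > limit}

# 1,000 requests from a single IP are caught by the rule...
single_ip_attack = ["203.0.113.7"] * 1000

# ...but the same 1,000 requests spread over 500 IPs (2 each) all pass.
distributed_attack = [f"10.0.{i // 250}.{i % 250}" for i in range(500)] * 2

caught = blocked_ips(single_ip_attack)
missed = blocked_ips(distributed_attack)
```

In this toy example, `caught` contains the single noisy IP while `missed` is empty, even though both attacks generate the same volume of requests.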

In a crisis, these shortcomings are multiplied. When everything is turned upside down, static rules are even less effective than during normal times.

If you’re currently observing unusual traffic patterns on a website you’re responsible for, you might be tempted to try and at least partly fix the problem by tweaking the rules. But please don’t: you’re probably going to create false positives and upset real customers, who might need you more now than ever. Because as we shall see, even the most sophisticated bot detection solutions are challenged by COVID-19.

Crises challenge machine learning models, too.

Today’s most advanced bot management solutions make extensive use of machine learning to accurately distinguish between human visitors and bots. The idea is to train models that learn what normal human behavior and activity look like, and how bots behave differently, even when they are trying to masquerade as humans.

These machine learning models leverage a set of attributes and generic behavioral metrics, such as the number of requests, the average time between consecutive requests, or the way the mouse moves. They usually also include more business-oriented metrics, such as the number of items the visitor has seen or added to the shopping cart.
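As an illustration, the sketch below computes a few such metrics for a single session. The `Request` type, the feature names and the `/cart/add` path convention are hypothetical; a real system would use many more signals.

```python
from dataclasses import dataclass
from statistics import mean

@dataclass
class Request:
    timestamp: float  # seconds since the start of the session
    path: str         # requested URL path

def session_features(requests):
    """Compute simple behavioral and business metrics for one session."""
    times = sorted(r.timestamp for r in requests)
    gaps = [later - earlier for earlier, later in zip(times, times[1:])]
    return {
        "request_count": len(requests),
        # Average time between consecutive requests (None for a single request).
        "mean_gap_seconds": mean(gaps) if gaps else None,
        # Business-oriented metric: assumes cart additions hit a /cart/add path.
        "cart_additions": sum(1 for r in requests if r.path.startswith("/cart/add")),
    }

session = [
    Request(0.0, "/"),
    Request(4.0, "/product/42"),
    Request(10.0, "/cart/add/42"),
]
features = session_features(session)
```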


When a new visitor comes to the site, the models will analyze the user’s metrics, and if these metrics deviate too much from those of other human users, the visitor will be classified as a bot.
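A deliberately simplified version of this idea is a z-score check against a baseline learned from known human sessions. Real models are far richer; the feature names, the sample values and the threshold of 3 below are assumptions chosen for illustration only.

```python
from statistics import mean, stdev

def fit_baseline(human_sessions):
    """Per-feature (mean, standard deviation) learned from known-human sessions."""
    baseline = {}
    for key in human_sessions[0]:
        values = [s[key] for s in human_sessions]
        baseline[key] = (mean(values), stdev(values))
    return baseline

def max_z_score(session, baseline):
    """Largest deviation from the human baseline, in standard deviations.

    Assumes every feature has a non-zero standard deviation.
    """
    return max(abs(session[k] - mu) / sigma for k, (mu, sigma) in baseline.items())

def looks_like_bot(session, baseline, threshold=3.0):
    return max_z_score(session, baseline) > threshold

# Hypothetical "normal times" human sessions.
humans = [
    {"requests_per_min": 4, "mean_gap_seconds": 14},
    {"requests_per_min": 6, "mean_gap_seconds": 10},
    {"requests_per_min": 5, "mean_gap_seconds": 12},
    {"requests_per_min": 5, "mean_gap_seconds": 12},
]
baseline = fit_baseline(humans)

scraper = {"requests_per_min": 60, "mean_gap_seconds": 1}
typical_human = {"requests_per_min": 5, "mean_gap_seconds": 13}
hurried_human = {"requests_per_min": 9, "mean_gap_seconds": 7}
```

With this baseline, the scraper is flagged and the typical human is not. The hurried human, who browses roughly twice as fast as usual, is flagged as well, which is the kind of mistake that becomes common when behavior shifts suddenly.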

However, these bot detection models are trained on relatively stable sets of data. They learn what’s human and what isn’t during “normal” times, not during crises.

Sure, events such as Black Friday may temporarily change both traffic patterns and individual user behaviors. But these disruptions are mostly predictable, and they don’t last very long before the world returns to normal.

COVID-19 represents an altogether different challenge. It’s a worldwide crisis that will likely last for months, and when it’s over, we may not return to quite the same “normality” as before.

Here are a few examples of possible COVID-related changes to traffic and user behavior patterns:

Travel agency: Immediately after a lockdown announcement, a huge surge of traffic from users cancelling booked travels, then a very sharp decrease in traffic.
Flight comparison website: A drastic decrease in human traffic. Bots now represent the majority of the traffic.
Food retailer: A surge of new online customers trying to book deliveries. Customers also order more goods than usual, which means that they browse more pages. This trend lasts throughout the crisis, with spikes before and after important events such as announcements of new confinement rules.

COVID-related traffic changes on a food retailer website

Figure 1: A French food retailer sees the first sudden change on March 12, when people learnt that the President was going to make an announcement. Since then, the normal activity has been multiplied by 4. The highest spike on the graph is the morning of March 16, when the lockdown started. The red line represents what used to be peak traffic levels before the crisis.

Financial trading platform: A surge of traffic after important public policy announcements that may influence the market.
News website: People spend more time at home and consume more media content.

COVID-related traffic changes on a news site

Figure 2: A news site sees a huge traffic spike the day before a lockdown announcement, probably the result of a push notification about the upcoming statement. Once the lockdown has taken effect, the site also sees a significant increase in daily traffic.

All these changes represent major challenges for bot protection solutions that rely on machine learning models to detect sophisticated bots.

Bot or human? The increased risk of false positives.

In the world of bot management, the main risk related to such sudden, drastic changes is that the machine learning models will incorrectly block human visitors, because these visitors changed their behavior and now look more like bots. Such errors are called false positives.

False positives can occur at the user level, i.e. the machine learning model wrongly classifies an individual human user’s behavior as being that of a bot. However, other kinds of false positives may occur at a higher level, and this is where things can quickly turn ugly.

In addition to analyzing and classifying each request individually, some bot management solutions also have heuristics or machine learning models that detect specific forms of attacks, such as layer 7 DDoS attacks. A DDoS attack is a distributed attack that leverages large numbers of devices to make a very high volume of requests simultaneously, for the purpose of overloading the target website and rendering it unavailable.

Typically, approaches that aim to detect such attacks at the application level will look for a sharp variation in different metrics or combinations of metrics, such as the number of requests per second, and respond to anomalies in pre-defined ways. If the system detects a DDoS attack, for example, it may ask each visitor to solve a Captcha to verify that they are human.
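The following sketch shows one simple way such a volumetric heuristic could work: compare each new requests-per-second sample with the mean of the most recent samples. The class name, window size and multiplication factor are assumptions for illustration and do not describe any particular product.

```python
from collections import deque
from statistics import mean

class SpikeDetector:
    """Flags a sample that exceeds `factor` times the mean of the last `window` samples."""

    def __init__(self, window=5, factor=3.0):
        self.history = deque(maxlen=window)
        self.factor = factor

    def observe(self, requests_per_second):
        # Only judge once the baseline window is full.
        full = len(self.history) == self.history.maxlen
        is_spike = full and requests_per_second > self.factor * mean(self.history)
        self.history.append(requests_per_second)
        return is_spike

detector = SpikeDetector()
traffic = [100, 110, 90, 105, 95, 400, 420]
flags = [detector.observe(rps) for rps in traffic]
```

Only the jump to 400 requests per second is flagged; the next sample is not, because the spike has already raised the baseline. Note that nothing in this heuristic can tell whether the jump comes from an attack or from a crowd of real visitors.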

In the case of a false positive, i.e. if an unexpected traffic peak isn’t a DDoS attack after all but just a bunch of freaked-out humans hitting your site all at the same time, this will make for a very bad user experience for said humans.

Note that this phenomenon doesn’t only occur during crises. It can also happen during other kinds of huge events, such as a prime time TV ad campaign that drives a lot of visitors to your website.

How well your bot protection solution will deal with such events depends on different factors, such as:

The time window on which the machine learning models are trained. If a model is trained on a longer time window, it will take longer to react to a sudden change.
The learning frequency of the ML model. If a model is trained every hour, it will be able to capture new human behavior more quickly than a model trained every day.
The availability of a smart feedback loop mechanism.
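The effect of the time window can be illustrated with a rolling-mean baseline. This is only a stand-in for a trained model, and the traffic levels, window sizes and tolerance are made-up values. The sketch counts how many post-change samples are needed before the baseline reflects the new traffic level.

```python
from collections import deque

def samples_until_adapted(series, shift_index, window, tolerance=0.1):
    """Count post-shift samples needed before the rolling mean is within
    `tolerance` (relative) of the new level, taken as the last value of the series."""
    new_level = series[-1]
    recent = deque(maxlen=window)
    for i, value in enumerate(series):
        recent.append(value)
        if i >= shift_index:
            baseline = sum(recent) / len(recent)
            if abs(baseline - new_level) <= tolerance * new_level:
                return i - shift_index + 1
    return None  # never adapted within the series

# Traffic sits at 100 for 24 samples, then jumps to 400 and stays there.
series = [100] * 24 + [400] * 24

short_window = samples_until_adapted(series, shift_index=24, window=3)
long_window = samples_until_adapted(series, shift_index=24, window=12)
```

Here the 3-sample window catches up after 3 samples, while the 12-sample window needs 11. The same trade-off applies to real models: a longer window is more stable in normal times but slower to accept a new normal.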

Most bot management solutions will either show a Captcha to users with suspicious behavior to verify that they’re human, or hard block them for a period of time.

If users are hard blocked, there is no feedback mechanism (other than possibly angry customers on Twitter or Facebook) telling you that the bot detection system has made a mistake. The machine learning model is not … well, learning.

If the user can solve a Captcha to prove her human-ness, on the other hand, the mistake can be taken into account and the algorithm improved. However, not all bot detection systems are able to update their models with the same speed.
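A very small sketch of such a feedback loop is shown below. It treats every solved Captcha as evidence of a false positive and loosens a score threshold when too many challenged visitors turn out to be human. The function name, target rate and step size are hypothetical, and the sketch ignores the possibility that some bots solve Captchas too.

```python
def adjust_threshold(threshold, captcha_results, target_fp_rate=0.05, step=0.5):
    """Raise the bot-score threshold when too many challenged visitors prove human.

    captcha_results: list of booleans, True if the challenged visitor solved
    the Captcha (i.e. the challenge was most likely a false positive).
    """
    if not captcha_results:
        # No feedback at all (for example, visitors were hard blocked): nothing to learn from.
        return threshold
    solved_rate = sum(captcha_results) / len(captcha_results)
    if solved_rate > target_fp_rate:
        return threshold + step
    return threshold

# 3 of 10 challenged visitors solved the Captcha: the threshold is loosened.
after_feedback = adjust_threshold(3.0, [True] * 3 + [False] * 7)

# Nobody solved it: the threshold stays where it is.
no_mistakes = adjust_threshold(3.0, [False] * 10)

# Hard blocking produces no feedback, so nothing changes.
no_feedback = adjust_threshold(3.0, [])
```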

The dangers of false negatives.

During extreme situations, there is also an increased risk of letting too many bots pass through to your website. Such misses are called false negatives.

This can be dangerous, in particular if these bots are conducting credential stuffing, a form of attack that consists of trying to access user accounts with stolen credentials.

One of the reasons why false negatives may increase in times of crisis is that if your bot management solution isn’t designed to adjust quickly to sudden changes in user behavior, it might make bad decisions in order to avoid false positives.

Moreover, in the absence of efficient machine learning models, data analysts might rush to create new allow-list rules and patterns, change different thresholds, or retrain their machine learning models without the necessary information. This, too, can lead to the system authorizing bots that should have been blocked.
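Before changing a threshold, it helps to measure both error types on a labeled sample. The sketch below uses made-up scores and labels to show how loosening a threshold trades false positives for false negatives; the function names and thresholds are illustrative assumptions.

```python
def error_rates(labels, predictions):
    """Return (false_positive_rate, false_negative_rate).

    labels and predictions are lists of booleans where True means "bot".
    Assumes the sample contains at least one human and one bot.
    """
    pairs = list(zip(labels, predictions))
    false_positives = sum(1 for is_bot, flagged in pairs if not is_bot and flagged)
    false_negatives = sum(1 for is_bot, flagged in pairs if is_bot and not flagged)
    humans = sum(1 for is_bot in labels if not is_bot)
    bots = sum(1 for is_bot in labels if is_bot)
    return false_positives / humans, false_negatives / bots

def predict(scores, threshold):
    """Flag a visitor as a bot when its score exceeds the threshold."""
    return [score > threshold for score in scores]

# Made-up sample: four humans followed by four bots, with their bot scores.
labels = [False, False, False, False, True, True, True, True]
scores = [0.1, 0.2, 0.4, 0.6, 0.5, 0.7, 0.8, 0.9]

strict = error_rates(labels, predict(scores, 0.3))
lenient = error_rates(labels, predict(scores, 0.65))
```

On this sample, the strict threshold blocks half of the humans and misses no bots, while the lenient one blocks no humans but lets a quarter of the bots through.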

Mitigating the effects of sudden, dramatic changes.

While our focus here has been how COVID-19 challenges the machine learning models that are used for bot protection, the same risks apply to bot detection solutions that use heuristics with hardcoded detection thresholds, or thresholds based on statistical analysis.

So how can these challenges be overcome to ensure accurate bot detection, efficient bot protection, and positive human user experiences even in these exceptional circumstances?

Here are some of the ways in which we here at DataDome ensure that our real-time bot protection solution is able to continuously adapt to tumultuous changes:

Our bot detection engine handles feedback natively through a real-time feedback loop.
Our machine learning models are trained iteratively (online machine learning), so we don’t need to retrain them from scratch in batches, and they adapt dynamically to new patterns in the data.
The models exploit a wide range of signals (browser, version, OS, user, country, behavior …), which enables us to automatically infer the right features to distinguish between humans and bots, even when human behavior changes.
We perform outlier detection to identify anomalies and adjust.
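To give a feel for what online learning means in practice, here is a toy logistic regression that is updated one labeled example at a time. It is a generic textbook technique, not a description of DataDome's actual models; the class name, the two features and all values are assumptions for illustration.

```python
import math

class OnlineLogisticRegression:
    """Logistic regression updated one labeled example at a time (SGD)."""

    def __init__(self, n_features, learning_rate=0.5):
        self.weights = [0.0] * n_features
        self.bias = 0.0
        self.learning_rate = learning_rate

    def bot_probability(self, x):
        z = self.bias + sum(w * xi for w, xi in zip(self.weights, x))
        return 1.0 / (1.0 + math.exp(-z))

    def update(self, x, is_bot):
        # Gradient step on the log loss for a single example.
        error = self.bot_probability(x) - (1.0 if is_bot else 0.0)
        self.weights = [w - self.learning_rate * error * xi
                        for w, xi in zip(self.weights, x)]
        self.bias -= self.learning_rate * error

# Hypothetical features: [scaled request rate, mouse activity seen (0 or 1)].
normal_human = [0.1, 1.0]
bot = [0.9, 0.0]
crisis_human = [0.9, 1.0]  # a human browsing as fast as a bot

model = OnlineLogisticRegression(n_features=2)
for _ in range(20):  # "normal times" training
    model.update(normal_human, is_bot=False)
    model.update(bot, is_bot=True)

before = model.bot_probability(crisis_human)
for _ in range(20):  # feedback during the crisis, e.g. solved Captchas
    model.update(crisis_human, is_bot=False)
after = model.bot_probability(crisis_human)
```

Each feedback update lowers the bot probability assigned to the fast-browsing human without retraining from scratch. A production system would keep mixing in bot examples so that the model does not drift toward letting everything through.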

So far, we’re confident that we’re keeping up with our moving targets. As the pandemic unfolds, we will continue to closely monitor our feedback loops, analyze trends, and detect outliers in our global data sets, to ensure that our customers can continue to function as normally as possible in these extraordinary times.