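AI bots can be blocked by adding a rule group for their user-agent token to the robots.txt file and pairing it with a Disallow directive. Compliance is voluntary, so this only stops bots that honor the file.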
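As a minimal sketch of what such a rule looks like and how a compliant crawler would read it, the snippet below feeds a small robots.txt to Python's standard-library parser. The token `ExampleAIBot` and the URLs are placeholders, not real bot names.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: "ExampleAIBot" is a placeholder user-agent token.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Allow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# A crawler that respects robots.txt asks this question before fetching a URL.
ai_bot_allowed = parser.can_fetch("ExampleAIBot", "https://example.com/pricing")
other_allowed = parser.can_fetch("SomeOtherCrawler", "https://example.com/pricing")

print("ExampleAIBot allowed:", ai_bot_allowed)
print("Other crawler allowed:", other_allowed)
```

The named bot is refused everywhere while every other crawler falls through to the wildcard group.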

What Happens If I Don’t Have a Robots.txt?

Without a robots.txt file, search engine web crawlers are free to crawl, and potentially index, every page they can reach on your site. This can result in irrelevant content being indexed, which can negatively impact your page rankings.

What is the Difference Between Robots.txt and Meta Tags?

Robots.txt controls access to your site at a directory level. Meta tags manage crawling and indexing behavior for individual pages.

Signed, Sealed, and Delivered: The Case for Authenticating AI Agents


Context

Long before the rise of large language models (LLMs), a multitude of bot services were employed to crawl websites for a variety of beneficial purposes, including data collection, business intelligence, and monitoring service availability.

There are times when you’ll want to authorize this kind of automated traffic. The main challenge is authenticating the traffic. If you don’t, malicious actors could easily mimic the bot service and gain unauthorized access. With this in mind, we can define a ‘verified bot’ as follows:

The bot’s requests are automated but non-malicious. It should not engage in harmful activities like Distributed Denial of Service (DDoS) attacks, scalping, account takeovers (ATOs), intensive scraping, or the exploitation of vulnerabilities.
It has a clear, helpful purpose. The bot’s primary function is to assist humans with a specific, well-defined task.
It uses a verifiable authentication method. Its requests are authenticated through a system that is used exclusively for the bot’s service.
Its operations are transparent. The bot’s purpose, behavior, and origin are publicly documented and regularly updated.

This post explores how the rise of agentic AI is changing the landscape of bot authentication. We’ll start by reviewing common legacy authentication methods before turning to the modern standard that should be adopted today.

What makes agentic AI a game-changer for bot authentication?

As the number of AI agents continues to grow, verifying a bot service has become more crucial than ever. The core purpose of these AI agents is to act on behalf of humans, assisting us with complex and tedious tasks. For example, they can be used to find and book a hotel, compare various websites to find the perfect item, and more.

AI agents present a unique challenge because their operational model (running their own browser or acting as a user-level extension) makes them appear as highly sophisticated threats. Without a reliable authentication mechanism, they are virtually indistinguishable from the smartest malicious bots.

This highlights a fundamental difference in motivation: legitimate AI agents have a verifiable and helpful purpose, which should incentivize them to authenticate their requests. Conversely, malicious bots will always evade identification to maintain their illicit operations.

Traditional (and flawed) authentication

The vast majority of bot services can be identified by their user agent. However, since a user agent is just a simple HTTP header, it can be easily spoofed. This method is sufficient for blocking unwanted bots that identify themselves honestly, but it is entirely inadequate for authorizing traffic.

In short, a user agent can identify a bot service, but it cannot authenticate it. To use an analogy, it’s like someone calling you on the phone and claiming to be the CEO, but having no way to prove their identity.

Traditional (and better) authentication

More restrictive and reliable methods have been widely adopted, including:

IP Allowlisting: Limiting traffic to a fixed list of authorized source IP addresses.
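Reverse DNS Check: Verifying that the reverse DNS (PTR) record of the originating IP address resolves to a hostname under the bot service’s documented domain, ideally confirmed by a forward lookup of that hostname back to the same IP.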

For this approach to work, the bot service must:

Publicly document its IP addresses and/or reverse DNS records.
Maintain the accuracy of this documentation.
Ensure these IPs are used exclusively for the bot service, to prevent their use by malicious actors.
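The two checks above can be sketched with the Python standard library. This is an illustrative outline rather than a production implementation: the CIDR range is a documentation-only block, the domain suffix is a placeholder, and the resolver functions are passed in as parameters so the logic can be exercised without live DNS. The forward lookup used by default handles IPv4 only.

```python
import ipaddress
import socket

# Hypothetical values: a documentation-only CIDR range and a placeholder domain.
ALLOWED_NETWORKS = [ipaddress.ip_network("192.0.2.0/24")]
ALLOWED_DOMAIN_SUFFIX = ".bot.example.com"


def ip_is_allowlisted(ip: str) -> bool:
    """IP allowlisting: is the source IP inside a published range?"""
    address = ipaddress.ip_address(ip)
    return any(address in network for network in ALLOWED_NETWORKS)


def reverse_dns_matches(
    ip: str,
    reverse_lookup=socket.gethostbyaddr,
    forward_lookup=socket.gethostbyname_ex,
) -> bool:
    """Reverse DNS check with forward confirmation.

    Step 1: the PTR record must point to a hostname under the documented domain.
    Step 2: that hostname must resolve back to the same IP address.
    """
    try:
        hostname = reverse_lookup(ip)[0]
        if not hostname.endswith(ALLOWED_DOMAIN_SUFFIX):
            return False
        return ip in forward_lookup(hostname)[2]
    except OSError:  # covers socket.herror and socket.gaierror
        return False


print(ip_is_allowlisted("192.0.2.10"))    # inside the published range
print(ip_is_allowlisted("198.51.100.7"))  # outside the published range
```

Both functions depend entirely on the bot service keeping its published ranges and DNS records accurate, which is the maintenance burden described above.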

Here’s where these traditional methods fall short. They are incredibly limiting for the bot service, and sometimes, they just don’t work at all. A bot service might be using a cloud provider, which means it doesn’t have control over its IP addresses. These IPs can change at any time or be shared with other services. What’s more, an AI agent running inside a user’s browser will have an unpredictable originating IP address, making this authentication method completely unworkable.

To sum up, while IP and reverse DNS allowlisting are effective, they are not scalable. This is analogous to authenticating someone by their proof of address, but at a national scale. Imagine a country trying to produce a real-time, accurate list of every citizen’s proof of address: it’s simply not a feasible model.

The state-of-the-art: Following OpenAI’s agent path

We’ve all been using modern cryptography for decades, often without even realizing it. The most common example is HTTPS.

The Transport Layer Security (TLS) protocol is what allows a client to authenticate the server it’s connecting to. In simplified terms, the process works like this:

The client sends a cryptographic challenge to the server.
The server responds with a signed response.
The client verifies the response and its signature. If valid, an encrypted channel is established for all subsequent requests.

A bot service can use this same cryptographic principle to authenticate itself. It can sign every request it makes, and the website can verify that signature. This approach requires the following:

For the bot service:

Generate a key pair, consisting of one private key and one public key.
Sign each request with its private key.
Provide a public path for others to discover its public key.

For the website:

Obtain the bot’s public key.
Collect the signature that accompanies each request.
Use the public key to verify the request’s signature. If the signature is valid, the request is processed, otherwise it is blocked.

Source: https://datatracker.ietf.org/doc/html/draft-meunier-web-bot-auth-architecture

This method eliminates all of the drawbacks of the previous approaches. It is a scalable, state-of-the-art authentication model that works regardless of a bot’s infrastructure or IP address.
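To make the bot-side and website-side steps above concrete, here is a deliberately simplified sketch using only the Python standard library. It uses textbook RSA with toy key material, which is an assumption chosen so the example runs without extra packages; it is not secure and is not the scheme any particular bot service uses. A real deployment would rely on a vetted cryptography library.

```python
import hashlib

# TOY key material for illustration only: textbook RSA built from two
# well-known Mersenne primes. Never use this outside a demonstration.
P = 2**127 - 1
Q = 2**521 - 1
N = P * Q                            # public modulus
E = 65537                            # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))    # private exponent (needs Python 3.8+)


def digest(message: bytes) -> int:
    """Hash the message and read the SHA-256 digest as an integer below N."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big")


def sign(message: bytes, private_exponent: int = D) -> int:
    """Bot side: sign the message hash with the private key."""
    return pow(digest(message), private_exponent, N)


def verify(message: bytes, signature: int) -> bool:
    """Website side: check the signature using only the public key (N, E)."""
    return pow(signature, E, N) == digest(message)


request = b"GET /rooms?city=paris HTTP/1.1\nHost: hotel.example"
signature = sign(request)

genuine_ok = verify(request, signature)
tampered_ok = verify(request + b"&admin=1", signature)
forged_ok = verify(request, sign(request, private_exponent=12345))

print("genuine:", genuine_ok)
print("tampered:", tampered_ok)
print("forged:", forged_ok)
```

Only the request signed with the real private exponent verifies. A modified request, or one signed with a guessed key, fails the check even though the verifier never sees the private key.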

How one bot service is doing it: The OpenAI agent

OpenAI is one of the first AI agent services to embrace this new standard. On July 25th, they announced that their requests would be authenticated using cryptographic signatures. This immediately satisfies the first two bot-side requirements listed above: generating a key pair and signing each request.

The final requirement for a bot service is a method for publishing its public key. This is a practice that OpenAI is helping to define in the field. To address this, an Internet-Draft submitted to the IETF proposes a standardized method for public key discovery, and OpenAI’s Agent service is adopting it.

This initiative can be abstracted as follows: the bot service places its public key in /.well-known/http-message-signatures-directory under a domain it owns.

You might recognize this /.well-known/ path; it’s the same convention Let’s Encrypt uses (under /.well-known/acme-challenge/) to validate control of a domain before issuing a TLS certificate. This similarity is no coincidence, as that process is also based on cryptographic signatures. Once again, these well-established mechanisms are proving to be the most reliable way to authenticate an entity on the internet.

On the website side

The IETF draft offers a standard way for a website to get the bot service’s public key, either through the web server itself or via an edge service like an anti-fraud platform (Origin on the figure below).

The second piece is collecting the signature of the request, which the draft specifies should be sent via HTTP headers. To simplify, these headers have three roles:

Signature-Agent: This header announces which agent the signature is for.
Signature-Input: Signing the entire request body would be inefficient. This header defines the specific parts of the request to be signed and includes anti-replay safeguards like a nonce, a validity window, and a unique request identifier.
Signature: This is the actual cryptographic signature, encoded in Base64.

Source: https://datatracker.ietf.org/doc/html/draft-meunier-web-bot-auth-architecture
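As a rough illustration of how the three headers fit together, the sketch below assembles them for one outgoing request. The layout is loosely modeled on HTTP Message Signatures (RFC 9421); the function name, the label `sig1`, the covered components, and the parameter set are assumptions made for the example and should be checked against the specification before being relied on. The `sign` argument is any callable that turns bytes into signature bytes, assumed to wrap the bot’s private key.

```python
import base64
import secrets
import time


def build_signature_headers(authority, agent_url, key_id, sign, now=None, ttl=300):
    """Return (headers, signature_base) for one outgoing request (sketch)."""
    created = int(time.time()) if now is None else now
    nonce = secrets.token_urlsafe(16)  # fresh random value per request

    # Which parts of the request are covered, plus anti-replay parameters:
    # a validity window (created/expires) and a nonce.
    params = (
        '("@authority" "signature-agent")'
        f';created={created};expires={created + ttl}'
        f';keyid="{key_id}";nonce="{nonce}";tag="web-bot-auth"'
    )
    agent_value = f'"{agent_url}"'

    # The signature base: one line per covered component, then the parameters.
    signature_base = "\n".join([
        f'"@authority": {authority}',
        f'"signature-agent": {agent_value}',
        f'"@signature-params": {params}',
    ])
    signature = sign(signature_base.encode("ascii"))

    headers = {
        "Signature-Agent": agent_value,
        "Signature-Input": f"sig1={params}",
        "Signature": f"sig1=:{base64.b64encode(signature).decode('ascii')}:",
    }
    return headers, signature_base


# Placeholder signer: stands in for a real private-key signature.
placeholder_sign = lambda data: b"\x00" * 64
demo_headers, demo_base = build_signature_headers(
    "hotel.example", "https://agent.example", "demo-key", placeholder_sign
)
for name, value in demo_headers.items():
    print(f"{name}: {value}")
```

Only the short signature base is signed, not the request body, which is why Signature-Input has to spell out exactly what was covered.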

A critical security point regarding the Signature-Agent header: for the protocol to remain secure, you should never blindly trust the value in this header to fetch the public key. The public key must be retrieved from a separate, trusted channel to prevent a malicious actor from impersonating a legitimate bot.

Detecting imposters

To put it simply, signed requests provide a strong, well-established, and scalable form of authentication. It’s like a nation issuing a unique, signed certificate to every citizen. When a citizen needs to prove their identity to an authority, they simply present their certificate. The authority can quickly scan and verify its authenticity.

The diagram below depicts the request signature mechanism with a fraudster who mimics the OpenAI agent.

Lacking access to the genuine ChatGPT private key, a fraudster attempts to forge a request. They construct a request that claims to be the ChatGPT Agent in the Signature-Agent header and prepare the other headers as if they were a legitimate bot. However, they cannot produce a valid signature.

When this request reaches a service like DataDome, we first note its claim of origin from the OpenAI Agent. Following the principle of “trust, but verify”, we immediately begin the authentication process. We then:

Retrieve the official ChatGPT Agent public key.
Verify that all the information in the Signature-Input is valid.
Attempt to validate the signature using the genuine ChatGPT public key.

Because the request lacks a valid signature, the cryptographic verification fails immediately. The signature is proven to be invalid, and the fraudulent request is successfully blocked.
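The three verification steps can be outlined in a few lines of Python. This is a hedged sketch, not DataDome’s implementation: the `TRUSTED_KEYS` table, the `authenticate` function, and the in-memory nonce set are hypothetical, and the actual signature check is delegated to a `verify` callable supplied by the caller.

```python
import time

# Hypothetical table: agent identifier -> public key obtained from a trusted
# channel ahead of time, never taken on faith from the incoming request.
TRUSTED_KEYS = {"https://agent.example": "public-key-placeholder"}
seen_nonces = set()  # in-memory replay cache, for illustration only


def authenticate(claimed_agent, created, expires, nonce,
                 signature_base, signature, verify, now=None):
    """Return (accepted, reason) for a request claiming to come from an agent."""
    now = time.time() if now is None else now

    # 1. Retrieve the official public key for the claimed agent.
    public_key = TRUSTED_KEYS.get(claimed_agent)
    if public_key is None:
        return False, "unknown agent"

    # 2. Check the Signature-Input metadata: validity window and replay.
    if not (created <= now <= expires):
        return False, "outside validity window"
    if nonce in seen_nonces:
        return False, "replayed nonce"

    # 3. Validate the signature using the genuine public key.
    if not verify(public_key, signature_base, signature):
        return False, "invalid signature"

    seen_nonces.add(nonce)
    return True, "verified"


# Stub verifier standing in for a real cryptographic check.
stub_verify = lambda key, base, sig: sig == b"genuine-signature"

forged = authenticate("https://agent.example", 100, 200, "nonce-a",
                      b"base", b"made-up-signature", stub_verify, now=150)
print(forged)
```

A fraudster can copy the agent name and every header, but without the private key the third step rejects the request.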

DataDome now supports Web Bot Auth

It’s 2025, and the era of outdated, easily-spoofed authentication methods must come to an end. It’s time to fully embrace the scalable, sustainable, and bulletproof solutions that modern cryptography has long provided.

DataDome is leading this shift. Our Bot Protect solution now supports Web Bot Auth verification for all customers—with zero setup required. We automatically validate cryptographic signatures from AI agents following the IETF standard, combining unforgeable identity verification with intent-based detection to ensure authenticated agents behave legitimately throughout the customer journey.

Ultimately, while cryptographic authentication provides a robust way to verify a bot’s identity, traditional intent detection mechanisms are still a crucial layer for identifying and preventing abuse of an AI service.