What is DeepSeek?

DeepSeekBot is an automated web crawler operated by DeepSeek AI, a company developing large language models (LLMs) and AI-powered applications. Its primary use case is collecting publicly available web content to train and fine-tune AI models, improve natural language processing capabilities, and expand its data corpus. Legitimate uses include indexing public knowledge for AI language model training. However, illegal or fraudulent uses could involve unauthorized scraping of proprietary, copyrighted, or sensitive data, violating website terms of service, intellectual property laws, or causing reputational harm to businesses whose data is harvested without consent.

Why is DeepSeek crawling my site?

DeepSeekBot crawls sites to collect large-scale web content for training AI models. Potential negative impacts include unauthorized harvesting of proprietary or sensitive content, terms of service violations, copyright infringement, excessive server resource usage, data leakage, and exposure of business logic or confidential information that could be exploited maliciously.

Threat research insights on DeepSeek

All data in this section are produced by DataDome's Galileo Threat Research team from our proprietary detection network and reviewed by human analysts.

Verified Bot: Not verified (a verified bot has high identification strength)
Robots.txt Compliance: Not respected (whether this bot respects robots.txt directives)
Identification Strength: Low (how confidently DataDome can identify this bot)

Traffic origins

Top 15 countries by bot traffic

US 60.88%
NL 10.99%
DE 10.97%
CN 3.0%
GB 2.03%
IT 1.88%
ES 1.82%
FR 1.07%
HK 0.92%
RU 0.63%
AU 0.59%
BE 0.55%
NZ 0.52%
CA 0.5%
SG 0.22%

Most used autonomous system (AS)

Top 5 by traffic share

HostRoyale Technologies Pvt Ltd: 19.26%
The Constant Company, LLC: 12.18%
WS Telecom Inc: 11.9%
NextGenWebs, S.L.: 11.38%
HostPapa: 8.22%

Traffic Occupancy: <0.1%

On average, this bot accounts for less than 0.1% of the traffic from bots in the directory.

Authorization Rate: 0%

Businesses choose to authorize this bot 0% of the time.

How to block DeepSeek?

1. Web server rules or web application firewall (WAF):
• Block requests where the User-Agent header contains DeepSeekBot.
Example (Nginx):

# Place inside a server or location block; ~* is a case-insensitive regex match.
if ($http_user_agent ~* "DeepSeekBot") {
    return 403;
}
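
Where the check has to live in application code rather than in the web server, the same rule can be sketched as WSGI middleware. This is a minimal illustration, not a DataDome or DeepSeek API: the class name `BlockDeepSeekBot` and the helper `is_blocked_user_agent` are hypothetical, and the only value taken from the text above is the `DeepSeekBot` token. Because a client can send any User-Agent string it likes, this filter only stops crawlers that identify themselves.

```python
import re

# Case-insensitive match, mirroring the Nginx `~*` operator.
BLOCKED_UA = re.compile(r"DeepSeekBot", re.IGNORECASE)


def is_blocked_user_agent(user_agent):
    """Return True when the User-Agent header contains the blocked token."""
    return bool(user_agent) and BLOCKED_UA.search(user_agent) is not None


class BlockDeepSeekBot:
    """Hypothetical WSGI middleware that answers 403 to matching requests."""

    def __init__(self, app):
        self.app = app

    def __call__(self, environ, start_response):
        # WSGI exposes the User-Agent header as HTTP_USER_AGENT.
        if is_blocked_user_agent(environ.get("HTTP_USER_AGENT", "")):
            start_response("403 Forbidden", [("Content-Type", "text/plain")])
            return [b"Forbidden\n"]
        # Everything else is passed through to the wrapped application.
        return self.app(environ, start_response)
```

Wrapping an existing WSGI application is then a one-liner, for example `application = BlockDeepSeekBot(application)`.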

2. IP-level blocking:
• Identify IP ranges used by DeepSeek and block them via firewall or CDN edge rules.
• Requires continuous IP monitoring.
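
As a rough sketch of the IP-level check, the standard-library `ipaddress` module can test a client address against a list of CIDR ranges. The ranges below are placeholders from the reserved documentation blocks, not real DeepSeek ranges, and the function name `is_blocked_ip` is an assumption for illustration. In practice the list would be fed by whatever IP monitoring you run.

```python
import ipaddress

# Placeholder CIDR ranges (reserved documentation blocks), NOT real DeepSeek IPs.
BLOCKED_NETWORKS = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]


def is_blocked_ip(remote_addr, networks=BLOCKED_NETWORKS):
    """Return True when remote_addr falls inside any blocked network."""
    try:
        address = ipaddress.ip_address(remote_addr)
    except ValueError:
        # Unparseable addresses are not matched here; handle them separately.
        return False
    return any(address in network for network in networks)
```

Since the blocklist is only as current as the monitoring behind it, reloading `BLOCKED_NETWORKS` from a regularly refreshed source is likely a better fit than hard-coding it.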

3. Behavior-based detection:
• Deploy bot management or security solutions that monitor request patterns (high request rates, non-browser behavior) to dynamically block unauthorized crawlers.
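
One of the simplest request-pattern signals mentioned above is request rate. The following is a minimal sliding-window limiter, assuming a per-client key such as an IP address or session ID. The class name and the thresholds are illustrative assumptions, and a commercial bot management product would combine many more signals than this.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most max_requests per client within window_seconds."""

    def __init__(self, max_requests=60, window_seconds=60.0):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._hits = defaultdict(deque)  # client_id -> recent request timestamps

    def allow(self, client_id, now=None):
        """Record a request and return False once the client exceeds the limit."""
        now = time.monotonic() if now is None else now
        hits = self._hits[client_id]
        # Drop timestamps that have fallen out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True
```

A request handler would call `limiter.allow(client_ip)` and respond with 429 or 403 when it returns `False`. The state here is in-process only, so a multi-server deployment would need a shared store.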

4. Application layer controls:
• Implement CAPTCHA or JavaScript challenges to disrupt non-human automated access.
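
To illustrate how a JavaScript challenge can work, here is a sketch of the server side of a small proof-of-work scheme: the server issues a signed challenge, the browser's script must find an answer whose SHA-256 hash meets a difficulty target, and the server verifies it before serving content. Everything here (function names, the difficulty value, the token format) is an assumption for illustration. The `solve` function stands in for the browser-side JavaScript, and the sketch omits expiry and replay protection.

```python
import hashlib
import hmac
import secrets

SECRET_KEY = secrets.token_bytes(32)  # hypothetical per-deployment secret
DIFFICULTY = 3  # required number of leading zero hex digits (assumed value)


def _sign(nonce):
    return hmac.new(SECRET_KEY, nonce.encode(), hashlib.sha256).hexdigest()


def issue_challenge():
    """Create a random nonce and sign it so it cannot be forged by the client."""
    nonce = secrets.token_hex(8)
    return f"{nonce}.{_sign(nonce)}"


def verify_solution(challenge, answer, difficulty=DIFFICULTY):
    """Check the challenge signature, then check the proof-of-work answer."""
    nonce, _, signature = challenge.partition(".")
    if not hmac.compare_digest(signature, _sign(nonce)):
        return False
    digest = hashlib.sha256(f"{challenge}{answer}".encode()).hexdigest()
    return digest.startswith("0" * difficulty)


def solve(challenge, difficulty=DIFFICULTY):
    """Reference solver standing in for the client-side JavaScript."""
    counter = 0
    while True:
        digest = hashlib.sha256(f"{challenge}{counter}".encode()).hexdigest()
        if digest.startswith("0" * difficulty):
            return str(counter)
        counter += 1
```

Clients that do not execute JavaScript never produce a valid answer, which is the property the challenge relies on. Crawlers that drive a full browser engine can still pass it, so this is better treated as one layer among several.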
