DeepSeek
What is DeepSeek?
DeepSeekBot is an automated web crawler operated by DeepSeek AI, a company developing large language models (LLMs) and AI-powered applications. Its primary use case is collecting publicly available web content to train and fine-tune AI models, improve natural language processing capabilities, and expand its data corpus. Legitimate uses include indexing public knowledge for AI language model training. However, illegal or fraudulent uses could involve unauthorized scraping of proprietary, copyrighted, or sensitive data, which may violate website terms of service or intellectual property laws, or cause reputational harm to businesses whose data is harvested without consent.
Why is DeepSeek crawling my site?
It is crawling to collect large-scale web content for training AI models. Negative impacts include unauthorized harvesting of proprietary or sensitive content, violation of terms of service, potential copyright infringement, excessive server resource usage, data leakage, and exposure of business logic or confidential information that could be exploited in malicious contexts.
Threat research insights on DeepSeek
All data in this section are produced by DataDome's Galileo Threat Research team from our proprietary detection network and reviewed by human analysts.
Traffic origins
[Charts not reproduced: top 15 countries by bot traffic; most used autonomous systems (AS), top 5 by traffic share]
On average, this bot accounts for less than 0.1% of the traffic from bots in the directory
Businesses decide to authorize this bot 0% of the time
How to block DeepSeek?
1. Web application firewall (WAF) or web server rule:
• Block requests where the User-Agent header contains DeepSeekBot.
Example (Nginx, inside a server or location block):
    # Case-insensitive match on the User-Agent header
    if ($http_user_agent ~* "DeepSeekBot") {
        return 403;
    }
2. IP-level blocking:
• Identify IP ranges used by DeepSeek and block them via firewall or CDN edge rules.
• Requires continuous IP monitoring.
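As a rough illustration of the IP-level check, the Python sketch below tests a client address against a blocklist of CIDR ranges. The ranges are reserved documentation networks used purely as placeholders, not addresses known to belong to DeepSeek, and the names BLOCKED_RANGES and is_blocked are hypothetical.

```python
import ipaddress

# Placeholder CIDR ranges (reserved documentation networks).
# These are NOT real DeepSeek ranges; replace them with ranges you have verified.
BLOCKED_RANGES = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
    ipaddress.ip_network("2001:db8::/32"),
]


def is_blocked(client_ip: str) -> bool:
    """Return True if client_ip falls inside any blocked range."""
    try:
        addr = ipaddress.ip_address(client_ip)
    except ValueError:
        # Malformed address: do not block here, leave it to other layers.
        return False
    # Membership is False when the address and network versions differ.
    return any(addr in net for net in BLOCKED_RANGES)
```

Because crawler IP ranges change over time, a list like this would need to be refreshed regularly, which is the continuous monitoring cost noted above.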
3. Behavior-based detection:
• Deploy bot management or security solutions that monitor request patterns (high request rates, non-browser behavior) to dynamically block unauthorized crawlers.
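One of the signals mentioned here, a high request rate, can be sketched with a sliding-window counter per client. This is a simplified illustration only: the class name SlidingWindowDetector and the thresholds are assumptions, and a real bot management product would combine many more signals than request volume.

```python
import time
from collections import defaultdict, deque


class SlidingWindowDetector:
    """Flags clients that exceed max_requests within window_seconds."""

    def __init__(self, max_requests=100, window_seconds=60.0):
        # Thresholds are arbitrary illustrative defaults.
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._hits = defaultdict(deque)  # client id -> request timestamps

    def is_suspicious(self, client_id, now=None):
        """Record one request and report whether the client is over the limit."""
        now = time.monotonic() if now is None else now
        hits = self._hits[client_id]
        hits.append(now)
        # Drop timestamps that have fallen out of the window.
        while hits and hits[0] <= now - self.window_seconds:
            hits.popleft()
        return len(hits) > self.max_requests
```

A caller would invoke is_suspicious once per incoming request, keyed by IP address or session identifier, and block or challenge the client when it returns True.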
4. Application layer controls:
• Implement CAPTCHA or JavaScript challenges to disrupt non-human automated access.
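One possible form of JavaScript challenge is a proof-of-work puzzle: the server issues a random nonce, the browser's script searches for a counter whose hash meets a difficulty target, and the server verifies the result cheaply. The sketch below shows only the server side under that assumption; the function names and the DIFFICULTY value are hypothetical, and solve stands in for the client-side JavaScript.

```python
import hashlib
import secrets

DIFFICULTY = 4  # required leading zero hex digits; arbitrary illustrative value


def issue_challenge() -> str:
    """Create a random nonce for the client to work on."""
    # A real deployment would also store the nonce with an expiry
    # and accept each nonce only once.
    return secrets.token_hex(16)


def verify_solution(nonce: str, counter: int, difficulty: int = DIFFICULTY) -> bool:
    """Check that sha256(nonce:counter) starts with the required zeros."""
    digest = hashlib.sha256(f"{nonce}:{counter}".encode()).hexdigest()
    return digest.startswith("0" * difficulty)


def solve(nonce: str, difficulty: int = DIFFICULTY) -> int:
    """Reference solver standing in for the client-side JavaScript."""
    counter = 0
    while not verify_solution(nonce, counter, difficulty):
        counter += 1
    return counter
```

Verification costs the server a single hash, while a client that does not execute the script never produces a valid counter. Crawlers that run a full browser engine can still pass, so this kind of challenge raises the cost of automated access rather than eliminating it.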