Parse.ly

What is Parse.ly?

Parse.ly’s crawler bot (user agent commonly “ParselyBot”) is a benign web crawler operated by Parse.ly, a content analytics platform owned by Automattic. It fetches published pages to discover URLs, read metadata (Open Graph, Twitter Cards, Schema.org), resolve canonical links, and build an index that powers real-time analytics, content classification, and recommendation APIs. Typical use cases include validating article metadata at publish time, keeping analytics inventories synchronized, supporting topic and author taxonomies, A/B testing content modules, and powering related-content widgets. For security and operations teams, it is usually reasonable to treat it as an allowlisted bot: it crawls at modest rates and is not a source of end-user traffic. It is generally described as honoring robots.txt, although the Robots.txt Compliance field below lists it as not respected, so confirm its behavior against your own logs. To prevent spoofing, validate reverse DNS against Parse.ly-owned domains, corroborate with IP allowlists, and apply bot-management policies accordingly.
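The reverse DNS validation mentioned above is usually done as a forward-confirmed lookup: resolve the IP to a hostname, check the hostname's suffix, then resolve the hostname back to the same IP. The following is a minimal sketch, not Parse.ly's documented procedure. The domain suffixes in `ASSUMED_SUFFIXES` are assumptions for illustration and should be replaced with whatever domains the operator actually publishes; the resolver functions are injectable so the logic can be exercised without network access.

```python
import socket
from typing import Callable, Iterable, List

# Hypothetical suffixes: replace with the domains the operator documents.
ASSUMED_SUFFIXES = (".parsely.com", ".parse.ly")


def _reverse_lookup(ip: str) -> str:
    """Return the PTR hostname for an IP address."""
    return socket.gethostbyaddr(ip)[0]


def _forward_lookup(hostname: str) -> List[str]:
    """Return the IPv4 addresses a hostname resolves to."""
    return socket.gethostbyname_ex(hostname)[2]


def is_verified_crawler(
    ip: str,
    suffixes: Iterable[str] = ASSUMED_SUFFIXES,
    reverse: Callable[[str], str] = _reverse_lookup,
    forward: Callable[[str], List[str]] = _forward_lookup,
) -> bool:
    """Forward-confirmed reverse DNS check.

    The PTR hostname must end in a trusted suffix AND resolve back
    to the original IP; either check failing rejects the client.
    """
    try:
        hostname = reverse(ip).rstrip(".").lower()
    except OSError:  # covers socket.herror and socket.gaierror
        return False
    if not any(hostname.endswith(suffix) for suffix in suffixes):
        return False
    try:
        return ip in forward(hostname)
    except OSError:
        return False
```

Checking the suffix alone is not enough, because anyone who controls the reverse zone for their own IP can set an arbitrary PTR record; the forward lookup is what ties the hostname back to the operator's DNS.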

Why is Parse.ly crawling my site?

It is most likely crawling because a Parse.ly customer referenced or embedded your content, or because it needs to validate metadata (titles, canonical tags, authors, publish dates), discover updates, and map relationships between pages for analytics. Potential downsides: incremental crawl load that consumes bandwidth and CPU and churns caches; competition for crawl budget that could delay more critical bots; noise in logs and security telemetry that can skew baselines and trigger false positives in bot-management or WAF rules; inflated pageview counts if you rely on naïve server-side metrics; exposure of URL patterns, staging pages, or orphaned pages that are discoverable via links or sitemaps; accidental access to dynamic or API endpoints if routing isn’t constrained; and a minor cost impact if you pay per request on CDNs or APIs.
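To gauge how much of this load and pageview inflation actually applies, crawler hits can be separated from other traffic in the access log. The sketch below assumes logs in the common "combined" format (user agent as the final quoted field) and assumes the crawler's user agent contains "parsely" or "parse.ly"; both are assumptions to adjust for your environment.

```python
import re
from collections import Counter
from typing import Iterable, Tuple

# Assumption: the crawler's user agent contains "parsely" or "parse.ly".
BOT_UA = re.compile(r"parse\.?ly", re.IGNORECASE)

# Captures the request path and the last quoted field (the user agent)
# of a combined-format access log line.
LINE = re.compile(
    r'"(?:GET|HEAD|POST) (?P<path>\S+) [^"]*" .* "(?P<ua>[^"]*)"$'
)


def split_hits(lines: Iterable[str]) -> Tuple[Counter, Counter]:
    """Return (other_hits, bot_hits), each a per-path request count."""
    other, bot = Counter(), Counter()
    for line in lines:
        match = LINE.search(line.strip())
        if not match:
            continue  # skip lines that are not in the expected format
        target = bot if BOT_UA.search(match.group("ua")) else other
        target[match.group("path")] += 1
    return other, bot
```

Comparing `sum(bot.values())` with `sum(other.values())` gives a rough share of requests attributable to the crawler, and the per-path counts show which URLs it touches.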

Verified Bot

Verified

Robots.txt Compliance

Not respected

Identification Strength

High

Traffic origins

Top 15 countries by bot traffic

US 100.0%

Most used autonomous system (AS)

Top 5 by traffic share

Amazon.com, Inc. 100.0%

Traffic Occupancy

<0.1%

On average, this bot accounts for <0.1% of the traffic from bots in the directory

Authorization Rate

100%

Businesses decide to authorize this bot 100% of the time

How to block Parse.ly?

1) User-Agent filtering at the web server
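Nginx (inside a server or location block):
if ($http_user_agent ~* "parse\.?ly") { return 403; }
Apache (mod_rewrite):
RewriteEngine On
RewriteCond %{HTTP_USER_AGENT} "parse\.?ly" [NC]
RewriteRule .* - [F]
The optional escaped dot lets the pattern match both “Parse.ly” and “ParselyBot”; an unescaped “Parse.ly” requires one extra character between “Parse” and “ly”, so it would not match the “ParselyBot” token.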
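Before deploying a user-agent rule, it can help to try the pattern against sample strings. The sketch below uses Python's `re` module, which behaves the same as the PCRE engines in Nginx and Apache for patterns this simple; the helper name `blocks` and the sample strings are illustrative only.

```python
import re


def blocks(pattern: str, user_agent: str) -> bool:
    """Return True if a case-insensitive search for pattern hits the UA."""
    return re.search(pattern, user_agent, re.IGNORECASE) is not None


# An unescaped dot means "any single character", so "Parse.ly" needs
# one character between "Parse" and "ly" and misses "ParselyBot".
unescaped_hits_token = blocks(r"Parse.ly", "ParselyBot")

# Making the escaped dot optional covers both spellings.
optional_dot_hits_token = blocks(r"parse\.?ly", "ParselyBot")
optional_dot_hits_name = blocks(r"parse\.?ly", "Parse.ly crawler")
```

Running the same check against a handful of real user agents from your own logs is a cheap way to catch both missed matches and accidental blocks of unrelated clients.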

2) IP/ASN/network blocking
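If you have identified the IP ranges or hosting ASNs that Parse.ly uses and do not want the traffic, block them at the firewall or CDN. Because the observed traffic originates from Amazon’s network, blocking the entire ASN would also block unrelated services hosted there, so prefer specific ranges.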
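Where blocking is done in application code rather than at the firewall, a range check can be sketched with the standard `ipaddress` module. The CIDR blocks in `ASSUMED_BLOCKLIST` are reserved documentation ranges used as placeholders, not real Parse.ly addresses.

```python
import ipaddress
from typing import Iterable, List, Union

Network = Union[ipaddress.IPv4Network, ipaddress.IPv6Network]

# Placeholder documentation ranges (RFC 5737 / RFC 3849),
# NOT real crawler addresses. Replace with ranges you have verified.
ASSUMED_BLOCKLIST = ["192.0.2.0/24", "2001:db8::/32"]


def load_networks(cidrs: Iterable[str]) -> List[Network]:
    """Parse CIDR strings into network objects once, at startup."""
    return [ipaddress.ip_network(cidr, strict=False) for cidr in cidrs]


def is_blocked(ip: str, networks: Iterable[Network]) -> bool:
    """Return True if ip falls inside any of the given networks."""
    try:
        addr = ipaddress.ip_address(ip)
    except ValueError:
        return False  # not a valid IP literal; let other layers decide
    return any(
        addr.version == net.version and addr in net for net in networks
    )
```

For example, `is_blocked("192.0.2.55", load_networks(ASSUMED_BLOCKLIST))` evaluates to `True`, while an address outside both placeholder ranges evaluates to `False`.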

3) Rate limiting and dynamic banning
Use Nginx limit_req (or an equivalent in your server or CDN) to throttle high-frequency requests from this bot and automatically ban repeat offenders.
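The throttle-then-ban idea can be illustrated with a small in-memory sliding-window limiter. This is a sketch of the general technique, not a description of how Nginx `limit_req` works internally; the class name, thresholds, and injectable clock are all illustrative choices.

```python
import time
from collections import defaultdict, deque
from typing import Callable, Deque, Dict


class RateLimiter:
    """Sliding-window limiter that temporarily bans clients over the limit."""

    def __init__(
        self,
        max_requests: int = 10,
        window: float = 1.0,
        ban_seconds: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._max = max_requests
        self._window = window
        self._ban = ban_seconds
        self._clock = clock
        self._hits: Dict[str, Deque[float]] = defaultdict(deque)
        self._banned_until: Dict[str, float] = {}

    def allow(self, client: str) -> bool:
        """Record a request from client; return False if it is refused."""
        now = self._clock()
        if self._banned_until.get(client, float("-inf")) > now:
            return False
        hits = self._hits[client]
        # Drop timestamps that have fallen out of the window.
        while hits and now - hits[0] >= self._window:
            hits.popleft()
        if len(hits) >= self._max:
            # Over the limit: start a ban and reset the window.
            self._banned_until[client] = now + self._ban
            hits.clear()
            return False
        hits.append(now)
        return True
```

The `client` key could be an IP address or an IP plus user-agent pair. State lives in process memory here, so a multi-worker deployment would need a shared store instead.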

4) JavaScript token + honeypot traps
Require a JS-generated signed cookie/token for normal pages and add hidden honeypot URLs; block IPs that fail the JS check or touch honeypots.
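The server side of the token-and-honeypot approach can be sketched as follows. It assumes a browser-side script (not shown) obtains the token and sends it back as a cookie; the secret, the honeypot path, and all function names are hypothetical.

```python
import hashlib
import hmac
import time
from typing import Optional

SECRET = b"replace-with-a-random-secret"      # hypothetical key
HONEYPOT_PATHS = {"/__trap/do-not-follow"}    # hypothetical hidden URL


def _sign(client_ip: str, ts: str, secret: bytes) -> str:
    """HMAC-SHA256 over the client IP and issue timestamp."""
    message = f"{client_ip}|{ts}".encode()
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def issue_token(client_ip: str, now: Optional[float] = None,
                secret: bytes = SECRET) -> str:
    """Create a "<timestamp>.<signature>" token bound to the client IP."""
    ts = str(int(time.time() if now is None else now))
    return f"{ts}.{_sign(client_ip, ts, secret)}"


def token_is_valid(token: str, client_ip: str, max_age: int = 3600,
                   now: Optional[float] = None,
                   secret: bytes = SECRET) -> bool:
    """Check the signature and that the token is not older than max_age."""
    current = time.time() if now is None else now
    ts, sep, sig = token.partition(".")
    if not sep or not ts.isdigit():
        return False
    fresh = 0 <= current - int(ts) <= max_age
    expected = _sign(client_ip, ts, secret)
    # Constant-time comparison to avoid leaking signature prefixes.
    return fresh and hmac.compare_digest(sig.encode(), expected.encode())


def should_block(path: str, token: Optional[str], client_ip: str,
                 now: Optional[float] = None) -> bool:
    """Block honeypot visitors and clients without a valid token."""
    if path in HONEYPOT_PATHS:
        return True
    return token is None or not token_is_valid(token, client_ip, now=now)
```

Binding the token to the client IP stops a token harvested in one place from being replayed elsewhere, at the cost of invalidating tokens for users whose IP changes mid-session.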

Related Monitoring & Analytics

Bot Name	Operator	Category
OhDear uptime	Oh Dear BV	Monitoring & Analytics
Splunk Synthetics	Splunk Inc.	Monitoring & Analytics
New Relic	New Relic, Inc.	Monitoring & Analytics
StatusCake	TrafficCake Limited	Monitoring & Analytics
Uptrends	ITRS Group	Monitoring & Analytics