Headless Chrome: Why it’s used and how to detect it.

What is headless Chrome?

Headless Chrome is the regular Chrome web browser running without a graphical user interface (GUI). It can only be controlled from a command-line interface or by an automated script.

Most internet users today are familiar with web browsers. We know what they look like and understand that we can use them to access the web. But familiar browsers are not the only way to visit websites or apps. You can also go online with a headless browser.

This article explains what a headless browser is, why anyone would use one (not always for good reasons), and how you can browse with one. Most importantly, it covers how you can detect headless browsers and protect your business and customers against the threats that come from them.

What is a headless browser?

A headless browser works just like a regular web browser, except that it doesn’t have a GUI: no tabs, URL bar, input fields, buttons, bookmarks, etc. It still loads pages and runs their JavaScript, but nothing is drawn in a window on screen. Because there’s no GUI to navigate, you can only control a headless browser through an automated script or manually through a command-line interface.

Why use a headless browser?

Navigating the web with a headless browser isn’t particularly convenient or easy, so you might wonder why anyone would use it. The biggest benefit of a headless browser is that you can programmatically access content from a web page.

There are a few harmless reasons someone might want to programmatically access web content, for example:

Developers can use headless browsers to write a script that looks for bugs in their websites and apps. The script can be set up to click links, type data into fields, and simulate user behavior—all of which can be done routinely and automatically, greatly speeding up a developer’s workflow.
SEO marketers use headless browsers, Headless Chrome in particular, to understand what Googlebot sees. This lets them spot URLs that return 404 errors, broken images, or anything else that looks “off” when the site is rendered headlessly. That way, they can optimize what Google sees and avoid SEO penalties.

But headless browsers can also be used for malicious purposes. Attackers might use Headless Chrome, for example, to programmatically scrape website content. Headless browsers are particularly useful against dynamic websites, which are harder to scrape because the data is rendered by JavaScript or sits behind forms. Because a headless browser executes the page just as a regular browser does, it can be programmed to scrape almost any content a human visitor could see, however complex the website or app.

Attackers also use headless browsers to inflate user numbers, generate fake ad impressions, and probe your website, app, or API for exploitable vulnerabilities.

How can I use a headless browser?

In 2017, Google’s Chrome developers built a headless mode into the browser itself, making it easier to run headless Chrome from the command line. However, many people still prefer to drive a headless browser through one of the many headless browser libraries and automation tools. Popular options include Puppeteer, Selenium, Playwright, and Splash. These tools allow users to automatically:

Crawl Pages
Click Elements
Download Data
Use Proxies
Submit Forms
Fill in Fields
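
The command-line route mentioned above can be sketched in a few lines of Python. This is a minimal illustration, not a recommended setup: the function names `build_headless_command` and `dump_dom` are hypothetical, and the binary name `google-chrome` is an assumption that varies by platform and installation. The `--headless` and `--dump-dom` switches ask Chrome to run without a window and print the rendered page to standard output.

```python
import subprocess
from typing import List


def build_headless_command(url: str, chrome_binary: str = "google-chrome") -> List[str]:
    """Build the argument list for a one-shot headless Chrome run.

    chrome_binary is an assumption: the executable may be named differently
    on your system (for example "chromium" or a full path).
    """
    # --headless: run without a GUI window.
    # --dump-dom: print the DOM, after JavaScript has run, to stdout.
    return [chrome_binary, "--headless", "--dump-dom", url]


def dump_dom(url: str, chrome_binary: str = "google-chrome", timeout: float = 30.0) -> str:
    """Run headless Chrome and return the rendered DOM as text.

    Requires Chrome to be installed; raises CalledProcessError if Chrome
    exits with an error and TimeoutExpired if it runs too long.
    """
    result = subprocess.run(
        build_headless_command(url, chrome_binary),
        capture_output=True,
        text=True,
        timeout=timeout,
        check=True,
    )
    return result.stdout
```

Calling `dump_dom("https://example.com")` on a machine with Chrome installed would return the page’s HTML after scripts have executed. Libraries such as Puppeteer or Playwright wrap this kind of process control and add clicking, typing, and form submission on top.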

How can I detect a headless browser?

There are many ways to detect whether a request is coming from a headless browser, but how easy detection is depends greatly on how the headless browser is configured. When attackers use headless browsers for web scraping, they do their best to evade detection. They go over all the properties that would usually give a headless browser away (such as navigator.userAgent, navigator.language, and navigator.platform) and try to make them look like the properties of a real browser.
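
To make the property checks concrete, here is a minimal server-side sketch. It assumes a small client-side script has already collected a few `navigator` values and posted them as a dictionary; that collection step, the dictionary keys, and the function name `headless_signals` are all hypothetical. The checks are simple heuristics (the “HeadlessChrome” token in the default user agent, the `navigator.webdriver` automation flag, a missing language, and a platform that contradicts the user agent), and a carefully configured headless browser can spoof every one of them.

```python
from typing import List, Mapping


def headless_signals(fingerprint: Mapping[str, object]) -> List[str]:
    """Return the names of heuristic signals that suggest a headless browser.

    fingerprint is assumed to hold values read from the browser's
    navigator object: userAgent, language, platform, webdriver.
    """
    signals: List[str] = []
    user_agent = str(fingerprint.get("userAgent") or "")
    language = str(fingerprint.get("language") or "")
    platform = str(fingerprint.get("platform") or "")

    # Headless Chrome's default user agent contains "HeadlessChrome".
    if "HeadlessChrome" in user_agent:
        signals.append("headless_user_agent")

    # navigator.webdriver is true when the browser is under automation.
    if fingerprint.get("webdriver") is True:
        signals.append("webdriver_flag")

    # Regular browsers normally report a preferred language.
    if not language:
        signals.append("missing_language")

    # A user agent claiming Windows should come with a "Win..." platform.
    if "Windows" in user_agent and platform and not platform.startswith("Win"):
        signals.append("platform_mismatch")

    return signals
```

An empty result does not prove the visitor is human. It only means none of these particular giveaways were left in place, which is why property checks alone are not a reliable defense.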

The best way to stop automated threats coming from headless browsers is to use a specialized bot detection solution that blocks all malicious bot threats, regardless of whether or not they come from a headless browser. (Any automated request that is not on your “allowlist”, which will likely include Googlebot, has no business browsing your website or app.)

DataDome’s bot and online fraud management solution protects your websites, mobile apps, and APIs from all forms of attack, blocking them within milliseconds of their arrival. Malicious bot requests from headless browsers stand no chance against our award-winning bot detection algorithms and SOC experts.

If you want to know how many bots are currently browsing your website, start your free trial today. It only takes a few minutes to set up, and you don’t need a credit card. Otherwise, you can always contact us to request a demo of our software.