Cabells Stops Theft of Business-Critical Information With DataDome
Cabells is a subscription service designed to help scholars make confident academic journal evaluations and submissions. Research professionals can access lists of vetted journals, as well as tools to identify publishing scams. Bots were scraping the company’s database, reducing customers’ need for the service. By ensuring that only human users can access the data, DataDome protects Cabells’ intellectual property and gives its security staff peace of mind.
The Problem: Bots Were Scraping Business-Critical Information
Scraping is sometimes considered one of the more “innocent” bot threats. After all, scraper bots are just more efficient than humans at taking data that’s there for the taking, right?
Not so fast. When data is a company’s most valuable asset, scraping can be a serious threat to profitability and even survival.
“Our product is a digital database of academic journals,” explains Lucas Toutloff, CTO at Cabells. “We create value by analyzing all these publications to make sure they meet industry standards, and medical and scholarly research professionals use our information to make publishing decisions.”
As Cabells’ digital distribution system evolved from a basic website to more elaborate online tools, the company grew in both volume and customer sophistication. At the same time, scraping was becoming a serious issue.
“Some people don’t care about all the details that enrich the data; they just want a binary ‘is this publication on the list or is it not’. If they can easily scrape hundreds of pages of results, they are less interested in paying for a continued subscription. Scraping reduced certain customers’ need for our product,” Lucas elaborates.
“Furthermore, our database contains decades of cumulated knowledge, collated into insights you can’t get anywhere else. Exposed at scale, the entirety of our product could be gone—someone could freely distribute the product of our work. We needed to find a way to limit the access to our data en masse.”
The Solution: Efficient Protection Out of the Box
Putting a stop to the scraping turned out to be a complex task, however. Cabells’ subscribers are academic institutions. Most are IP authenticated, which means that anyone on campus can access the resources.
“In theory, we could go to an organization and tell them ‘somebody here is violating our terms of service, you need to figure out who it is’,” Lucas comments. “But in real life, it’s not realistic, and it wouldn’t be a great customer experience. In addition, some legitimate use cases—going rapidly through big lists, for example—could look like scraping to an unsophisticated system. We needed something more than a pure ‘is this session sending signals’ type of thing.”
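To make the point about unsophisticated systems concrete, here is a minimal sketch of a purely rate-based check. It is an illustration only, not Cabells’ or DataDome’s implementation: the function name, the 10-second window, and the 20-request limit are all hypothetical. It shows how such a check can flag a legitimate user paging quickly through a long list while missing a scraper that paces itself.

```python
from collections import deque


def naive_rate_flags(timestamps, window_seconds=10.0, max_requests=20):
    """Return True if any sliding window holds more than max_requests requests.

    Hypothetical rule for illustration; real bot detection uses many more signals.
    """
    recent = deque()
    for t in sorted(timestamps):
        recent.append(t)
        # Drop requests that fell out of the sliding window.
        while recent and t - recent[0] > window_seconds:
            recent.popleft()
        if len(recent) > max_requests:
            return True
    return False


# A legitimate user paging quickly through a long list: 30 requests, 3 per second.
fast_human = [i / 3 for i in range(30)]
# A scraper pacing itself: one request every 2 seconds across 200 pages.
slow_bot = [i * 2.0 for i in range(200)]

print(naive_rate_flags(fast_human))  # True: the human is flagged
print(naive_rate_flags(slow_bot))    # False: the scraper slips through
```

Both outcomes are wrong for the business, which is why request rate alone is not a sufficient signal.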
While Cabells might come across as a software company from the outside, most of its efforts are focused on evaluating journals and creating the information behind the tools. The company’s digital presence primarily serves that purpose, and its systems were not set up with advanced security or monitoring tools.
Lucas and his team decided that bot protection wasn’t an in-house job, and started to look around for solutions. DataDome quickly stood out as a good fit.
“We liked that it was a kind of overlay that we didn’t have to build, and that we didn’t need to receive alerts and take any action to stop or slow down suspicious visitors,” Lucas comments. “It didn’t require a huge team of experts to get it to work, it was useful out of the box, and that was really attractive to us.”
The Results: Protected Data, Peace of Mind
Since Cabells activated DataDome protection in 2019, data scraping has no longer been an issue for the company.
Thanks to advanced machine learning technologies, DataDome efficiently distinguishes bots from humans, and automatically blocks malicious visitors without intervention from Cabells’ security team: “I haven’t really had to do anything, it’s been ‘set it and forget it’,” Lucas attests.
Because large-scale content theft could have dramatic consequences for Cabells, the company has chosen to err on the side of safety with a relatively strict protection mode in which any request may be challenged. While this mode increases the risk of false positives, it ensures that only human users can access the data.
“We’re not beset on all sides by scraping attempts, but it only takes one. Since we implemented DataDome, we can rest easier, knowing the worst-case scenario is that we shut down a legitimate user,” Lucas says. “We’ve set it up so that if there’s a doubt, the user will be blocked and we’ll deal with it later. I really like DataDome’s capacity to tune the protection that way.”
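The trade-off behind “if there’s a doubt, the user will be blocked” can be sketched as a threshold choice. The example below is a simplified assumption, not a description of how DataDome works: the bot-likelihood score, the `decide` function, and both threshold values are invented for illustration.

```python
def decide(bot_score, strict=True):
    """Map a hypothetical bot-likelihood score in [0, 1] to an action."""
    if not 0.0 <= bot_score <= 1.0:
        raise ValueError("bot_score must be between 0 and 1")
    # Assumed thresholds: strict mode blocks at a much lower score,
    # accepting more false positives in exchange for fewer missed bots.
    block_at = 0.3 if strict else 0.8
    return "block" if bot_score >= block_at else "allow"


ambiguous_visitor = 0.5  # could be a bot, could be a fast human
print(decide(ambiguous_visitor, strict=True))   # block
print(decide(ambiguous_visitor, strict=False))  # allow
```

Under the strict setting, the ambiguous visitor is stopped and any mistake is handled afterwards, which matches the preference Lucas describes.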
Going forward, the Cabells team hopes to extract even more value from the DataDome solution as it migrates to a brand-new web architecture.
“The dashboard contains a lot of information about how people are operating, so it’s almost like an analytics tool. As our needs change through our new build, it’s nice to have a solution that is human-understandable, that can grow with us, and that we can grow into,” Lucas concludes.