How KuantoKusta Stopped Price Scraping With DataDome

No more content scraping
User data is secure
Fast & stable website
DataDome
24 Feb, 2020
KuantoKusta is Portugal’s leading price comparison site. Created in 2004, the site has established itself as the front-runner in its sector. Today, the three million unique users who visit KuantoKusta each month can compare the prices of over two million products from 700 different stores.

In 2015, KuantoKusta created PriceBench, a specialized tool for collecting, processing and analyzing product data that makes the price management process smarter. PriceBench allows stores to monitor the prices of their direct competitors so they can adjust their own prices, and to detect new products offered by competitors.

The KuantoKusta Supermercados platform, launched in 2016, aims to help users save time and money and to better manage their family budget by allowing them to compare the prices of their favorite products in nearby supermarkets.

In 2018, KuantoKusta also became a marketplace. The company now allows its users not only to compare the prices of the products they are looking for, but also to purchase them. In a single transaction with a single payment, they can buy a cell phone, a pair of shoes and a bottle of perfume.

The Problem: Cleaning up traffic & stopping the leakage of price data.

“The first sign that we had bots on our site was the rather rapid updating of certain merchants’ prices,” recalls Paulo Pimenta, founder and CEO of KuantoKusta.

“Whenever one of their competitors reduced a price to 99 Euros, these merchants immediately offered the same product at 98.99 Euros. It was too automatic, and we figured it was impossible for humans to monitor thousands of prices in real time. These merchants had to be using bots.”

At the same time, KuantoKusta was preparing to launch its PriceBench solution, which offers a competitive intelligence service to the merchants represented on the site.

“We understood that unauthorized scraping jeopardized the success of our new solution,” explains Paulo. “PriceBench being a paid solution, we would not be able to sell it if merchants could simply send bots to extract prices directly from the site.”

In addition, the overly automated price updates generated embarrassing messes: If a merchant made a data entry error when changing their price, the error also spread among the copycats, resulting in consumer dissatisfaction and complaints.

To better understand what was going on, the team examined the site logs. The logs did indeed reveal a very large volume of automated traffic, and therefore an additional concern:

“It alerted us to the problem of bandwidth consumption,” says Paulo. “At times, often at the same time of day, the site became much slower for no apparent reason. We discovered that it was due to bots that came at specific times and overloaded the site.”

At the time, KuantoKusta’s architecture was less optimized than it is today: the database was not yet separated from the site, so the traffic spikes caused by the bots slowed the site down considerably, especially when they occurred during peak hours.
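The kind of log review described above can be sketched in a few lines of Python. This is only an illustration, not KuantoKusta’s actual tooling: the log format (ISO timestamp, client IP, path), the sample addresses and the function name `requests_per_hour_and_ip` are all assumptions made for the example.

```python
from collections import Counter
from datetime import datetime


def requests_per_hour_and_ip(log_lines):
    """Count requests per (hour of day, client IP).

    Assumes each line looks like '2015-03-01T12:00:05 203.0.113.7 /path',
    a hypothetical format chosen for this sketch.
    """
    counts = Counter()
    for line in log_lines:
        timestamp, ip, _path = line.split(maxsplit=2)
        hour = datetime.fromisoformat(timestamp).hour
        counts[(hour, ip)] += 1
    return counts


# Made-up sample data: one address is far busier than the other at noon.
sample = [
    "2015-03-01T12:00:05 203.0.113.7 /product/1",
    "2015-03-01T12:00:06 203.0.113.7 /product/2",
    "2015-03-01T12:00:07 203.0.113.7 /product/3",
    "2015-03-01T12:10:00 198.51.100.4 /product/1",
    "2015-03-01T15:30:00 198.51.100.4 /product/9",
]

hourly = requests_per_hour_and_ip(sample)
top = hourly.most_common(1)  # busiest (hour, IP) pair and its request count
```

Sorting the counts this way makes recurring spikes visible: the same addresses showing up at the same hour every day is a pattern human visitors rarely produce.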

Like many other companies, KuantoKusta first tried to solve the problem by developing an in-house solution. This effectively blocked less sophisticated bots, for example those that generated too many requests per minute or hour from a single IP address. Unfortunately, because this basic detection mechanism was based solely on the IP, it generated false positives.

“To give an example,” Paulo smiles, “the Portuguese police implemented a VPN. Every police station in the country therefore had the same public IP address. If 50 police officers spent their lunch break comparing prices on KuantoKusta, the request threshold was reached and they were blocked.”

Fortunately for the peace officers’ family budgets, their IP address could be manually unblocked. Nevertheless, Paulo and his team concluded that it was time to bring in experts.
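To see why detection based solely on the IP address produces this kind of false positive, consider a minimal sliding-window rate limiter. It is a sketch under stated assumptions, not KuantoKusta’s in-house code: the class name, the threshold of 100 requests per 60 seconds and the sample IP address are all hypothetical.

```python
from collections import defaultdict, deque


class PerIpRateLimiter:
    """Reject an IP once it exceeds max_requests within window_seconds."""

    def __init__(self, max_requests, window_seconds):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._hits = defaultdict(deque)  # ip -> timestamps of recent requests

    def allow(self, ip, now):
        hits = self._hits[ip]
        # Drop timestamps that have fallen out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True


# Hypothetical threshold: 100 requests per minute per IP.
limiter = PerIpRateLimiter(max_requests=100, window_seconds=60)
shared_ip = "192.0.2.10"  # hypothetical public address shared through a VPN

# 50 people behind the same address each load 3 pages within one minute.
decisions = [
    limiter.allow(shared_ip, now=second)
    for second in range(50)
    for _ in range(3)
]
blocked = decisions.count(False)  # requests rejected despite human users
```

Each individual visitor behaves normally, but the limiter only sees one address making 150 requests in under a minute, so every request past the threshold is rejected. Telling these users apart from a bot requires signals beyond the IP address.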

The Solution: Effective, low-latency, self-managed protection.

In 2015, anti-bot solutions were not all that common. KuantoKusta initially partnered with the American leader in this emerging market. The new solution blocked unwanted traffic effectively enough, but the team soon ran into a series of technical limitations.

“The architecture of this first solution required us to redirect our DNS to the supplier, which increased the loading time of our pages by more than a second,” explains Paulo. “And in e-commerce, when it comes to improving conversion rates, every millisecond counts.”

“Plus,” he continues, “we couldn’t manage the solution independently. Whenever I wanted to change something, I had to send an email asking the supplier to intervene. As their technical support was in another time zone, and they were not available on Saturdays, it was too complicated.”

For their second attempt, therefore, the specifications were clear: In addition to effectively protecting the site, the anti-bot solution should not affect the site’s performance, and KuantoKusta needed to keep control over the traffic. DataDome matched these criteria perfectly.

“After the first experience, we wanted to be sure that the new solution was technically up to date,” says Paulo. “DataDome was still a young company, but the technology seemed robust, and the installation was really very simple. One of my requirements was that I didn’t want to make any changes to our site, or to have to go through a complicated integration process.”

Promise kept: With a small technical team, assisted by the DataDome Customer Success team, the implementation and configuration took only a few hours.

As advised by DataDome, the protection was not activated for the first two weeks after installation. This period of simple observation, where all traffic is allowed to pass, helps establish a baseline to better understand the nature of the traffic.

“We were very surprised at the number of different bots that crawled the site,” says Paulo. “I knew there were many, but seeing thousands a day was quite an eye-opener!”

While some bots made excessive demands on the site and consumed a lot of bandwidth, others discreetly scanned only 10 or 50 pages. The team also found a surprising number of foreign bots, including some from China, which they suspect were operated by competitors using foreign services or IPs.

“After this analysis phase, helped by DataDome’s advice and experience, we created our allow-list and defined attack responses for other bots. Since they are clearly identified in the dashboard, we were able to start managing the traffic ourselves very quickly,” concludes Paulo.

 

The Results: Data security & stable performance.

Before the protection was activated, bots represented around 70% of the total traffic on KuantoKusta.pt. Today, unwanted bots are efficiently blocked by DataDome. As a result, many bots have simply abandoned the site, so that unwanted access attempts have greatly decreased.

“As soon as the protection was activated, certain merchants suddenly became interested in PriceBench,” chuckles Paulo. “Without telling us that their methods no longer worked or that their bots were being blocked, all of a sudden they requested the service on their own initiative. The correlation between the activated protection and their decision to subscribe to PriceBench was pretty obvious.”

What other benefits does KuantoKusta gain from the DataDome protection?

“The main advantage of DataDome is to protect price data, which is an essential asset for us,” says Paulo. “This is the main reason for having an anti-bot solution: to prevent bots from stealing the fruits of our daily work.”

“With the introduction of our Marketplace, it has also become important to secure our users’ personal data,” he continues. “We now manage login data and information on what people are buying, and to comply with the GDPR, we have to protect this data. DataDome won’t stop a human from trying to force it, but we don’t have to worry anymore about hackers who launch bots to try to steal passwords.”

“And finally, we no longer have traffic spikes that slow down or crash the site. Before DataDome, the site was sometimes 2, 3, or 4 seconds slower for no apparent reason. It was only after the fact that we could see that some IP addresses had consumed a lot of bandwidth. Today, these peaks wouldn’t have the same impact since the database is now disconnected from the site, but to optimize performance, it’s still important to eliminate unwanted traffic.”

In closing, does KuantoKusta’s founder have a favorite feature?

“In addition to knowing exactly what is happening on my site, what I particularly appreciate is being able to make decisions myself,” he concludes. “I can easily allow-list IPs, change the rules for a domain… in short, maintain control, without having to ask DataDome to activate this or change that. For example, if we change our ISP and our IP address changes, I can very easily allow-list the new address myself.”

In practice, however, Paulo spends little time on the DataDome dashboard.

“When all is well, I don’t need it! I receive a daily report by email, and if something intrigues me, I can go take a closer look. But the less I have to deal with bots, the better. If I don’t stick my nose in it, it’s because it’s working. And that’s exactly what we’re looking for in our partners: solid teamwork at the outset, whereafter the tools can run on their own.”
