Web mining is the application of data mining techniques to discover data or patterns on the web (including mobile apps). It uses automated methods to extract both structured and unstructured data from web pages, link structures, and many other publicly available sources on the internet. We have provided web mining services to 1000+ clients globally. Our mission is to help clients discover useful and compliant information from the web, and to use AI-based technology to carry out the data harvesting process.
Having worked on thousands of web mining and web data harvesting projects for almost a decade, we have witnessed how the public web has grown more and more complex in its structure and design. In the course of web harvesting, people have to go through the same tiresome steps again and again: finding proxies, avoiding CAPTCHAs, and solving networking challenges. The field knowledge required to gather public web data keeps getting deeper.
Our objective is to help customers design and implement the web mining process so that they don't have to deal with complicated websites and the underlying networking protocols themselves. With our AI-based technology, we have built 5000+ datasets for our customers, and every week we deliver 100M+ rows of data. We offer three types of services based on customers' different needs.
Fully Managed Web Mining and Web Data Harvesting Services
Many customers know little about web mining technology, and most of them are unfamiliar with networking protocols. They only need to tell us which data they want to extract from the public web, and we take care of everything else. Customers don't need to worry about any technical details. The entire data harvesting process is fully managed on our side.
Fully managed projects start from only $39 per month per project.
Some customers prefer to develop web crawlers themselves, but if the extraction is not done the right way, they can easily get blocked by webmasters. Many people choose UI automation tools such as Selenium WebDriver, Puppeteer, Playwright, or headless Chrome to extract the data, but this can generate 10-100 times more web traffic, which increases the load on the target web server. This is something we should really avoid because it is extremely unfriendly to the target website in terms of traffic. Not only does it place an unnecessary traffic burden on the target site, it also consumes a lot of CPU and memory on your own server.
At barkingdata, we build crawler services that let customers harvest data with ease and efficiency and, most importantly, without using Selenium, Puppeteer, Playwright, or similar tools. Our crawler service allows us to quickly build a customized dataset for each customer, with great efficiency and consistency.
If you are currently extracting data with WebDriver, Puppeteer, or Playwright, we can convert the job to a non-browser approach on our side, and you can save 50%-80% of the cost.
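As a rough illustration of what a non-browser approach can look like, the Python sketch below fetches a page with a single plain HTTP request and parses the returned HTML directly, so no scripts, images, or stylesheets are downloaded. It is only a minimal sketch under stated assumptions, not a description of our production crawlers: the names `LinkExtractor`, `fetch_html`, `extract_links`, the `example-crawler/0.1` user agent, and the sample HTML are all hypothetical, and it assumes the wanted data is present in the raw HTML rather than rendered by JavaScript.

```python
from html.parser import HTMLParser
from urllib.request import Request, urlopen


class LinkExtractor(HTMLParser):
    """Collect (href, text) pairs for every <a> tag that has an href."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None   # href of the <a> tag currently open, if any
        self._text = []     # text fragments seen inside that tag

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def fetch_html(url, timeout=10.0):
    """Issue one plain HTTP GET and return the decoded body (hypothetical helper)."""
    request = Request(url, headers={"User-Agent": "example-crawler/0.1"})
    with urlopen(request, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


def extract_links(html):
    """Parse an HTML string and return the links it contains."""
    parser = LinkExtractor()
    parser.feed(html)
    parser.close()
    return parser.links


# Parsing works on any HTML string, so it can be tried without network access.
SAMPLE = '<ul><li><a href="/p/1">Widget</a></li><li><a href="/p/2"> Gadget </a></li></ul>'
print(extract_links(SAMPLE))  # [('/p/1', 'Widget'), ('/p/2', 'Gadget')]
```

In real use, `extract_links(fetch_html(url))` would replace an entire browser session with one request per page, which is where the traffic and CPU savings described above come from.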
Smart Proxy Manager Service
For many generic scraping purposes, customers can use our smart proxy to harvest the data. Our proxy service forwards each request through a randomly rotating IP address drawn from a pool of over a million proxy IPs. Customers also have the flexibility to choose country-specific IPs for their extraction tasks.
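The idea of random rotation with an optional country filter can be sketched in a few lines of Python. This is only an illustrative client-side model, not the actual implementation or API of our proxy service: the pool contents, the field names `url` and `country`, and the helpers `pick_proxy` and `proxies_for` are hypothetical, and the IP addresses come from ranges reserved for documentation.

```python
import random

# Hypothetical pool; a real pool would be far larger and managed by the service.
PROXY_POOL = [
    {"url": "http://203.0.113.10:8080", "country": "US"},
    {"url": "http://203.0.113.11:8080", "country": "US"},
    {"url": "http://198.51.100.20:8080", "country": "DE"},
    {"url": "http://198.51.100.21:8080", "country": "JP"},
]


def pick_proxy(pool, country=None, rng=random):
    """Return a random proxy, optionally restricted to one country code."""
    candidates = [p for p in pool if country is None or p["country"] == country]
    if not candidates:
        raise LookupError(f"no proxy available for country {country!r}")
    return rng.choice(candidates)


def proxies_for(proxy):
    """Build a scheme-to-proxy mapping, the shape urllib's ProxyHandler accepts."""
    return {"http": proxy["url"], "https": proxy["url"]}


# Each call may return a different exit IP; the country filter narrows the pool.
german_proxy = pick_proxy(PROXY_POOL, country="DE")
print(proxies_for(german_proxy))
```

Choosing a fresh proxy for every request spreads traffic across many exit addresses, while the country filter keeps requests inside a chosen region when the target site serves region-specific content.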