Portia is another great open source project from ScrapingHub. It is a visual abstraction layer on top of the Scrapy framework, which means you can create Scrapy spiders with a visual tool, without writing a single line of code. Portia is a web application written in Python, and it is easy to run thanks to its Docker image. Simply run the following command:

docker run -v ~/portia_projects:/app/data/projects:rw -p 9001:9001 scrapinghub/portia

WebHarvy is a desktop application that scrapes websites locally (it runs on your computer, not on a cloud server). Its visual scraping feature lets you define extraction rules, just like Octoparse and Parsehub. The difference is that you pay for the software only once; there is no monthly billing.
DataMiner is one of the most famous Chrome extensions for web scraping (186k installations and counting). What is unique about DataMiner is that it has a lot of features compared to other extensions. Generally, Chrome extensions are easier to use than desktop apps like Octoparse or Parsehub, but they lack a lot of features. DataMiner fits right in the middle. It can handle infinite scroll, pagination, and custom JavaScript execution, all inside your browser. One of the great things about DataMiner is its public recipe list, which you can search to speed up your scraping. A recipe is a list of steps and rules to scrape a website. For big websites like Amazon or eBay, you can scrape the search results with a single click, without having to manually click and select the elements you want. It is by far the most expensive tool on our list ($200/mo for 9,000 pages scraped per month).
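To make the idea of a recipe (a list of steps and rules to scrape a website) more concrete, here is a minimal Python sketch. It is not DataMiner's actual recipe format, which is not described here. The rule layout (`field`, `tag`, `css_class`), the `RecipeExtractor` class, the `run_recipe` helper and the sample HTML are all hypothetical names made up for this illustration, and the sketch assumes the matched elements contain plain text with no nested tags.

```python
from html.parser import HTMLParser

# Hypothetical recipe: each rule names an output field and the tag/class to capture.
# This is NOT DataMiner's real recipe format, only an illustration of the idea.
RECIPE = [
    {"field": "title", "tag": "h2", "css_class": "title"},
    {"field": "price", "tag": "span", "css_class": "price"},
]


class RecipeExtractor(HTMLParser):
    """Collects the text of every element that matches a rule in the recipe."""

    def __init__(self, recipe):
        super().__init__()
        self.recipe = recipe
        self.results = {rule["field"]: [] for rule in recipe}
        self._current_field = None  # field currently being captured, if any

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        for rule in self.recipe:
            if tag == rule["tag"] and rule["css_class"] in classes:
                self._current_field = rule["field"]
                self.results[self._current_field].append("")

    def handle_data(self, data):
        if self._current_field is not None:
            self.results[self._current_field][-1] += data

    def handle_endtag(self, tag):
        # Stop capturing at the next closing tag (assumes no nested elements).
        self._current_field = None


def run_recipe(recipe, html):
    """Apply every rule of the recipe to the HTML and return the captured text."""
    extractor = RecipeExtractor(recipe)
    extractor.feed(html)
    extractor.close()
    return {
        field: [value.strip() for value in values]
        for field, values in extractor.results.items()
    }


# Made-up sample page standing in for a list of search results.
SAMPLE_HTML = """
<div class="item"><h2 class="title">Blue mug</h2><span class="price">$8</span></div>
<div class="item"><h2 class="title">Red mug</h2><span class="price">$9</span></div>
"""

print(run_recipe(RECIPE, SAMPLE_HTML))
```

The point of the design is that the recipe is plain data, separate from the code that applies it. That separation is what would let a recipe be shared and reused across pages with the same layout, which is roughly the benefit a public recipe list offers.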