ACHE Crawler!

You will need version 4 or higher of `firefox` https://www.mozilla.org/firefox
to run Pencil as a Firefox Extension. Linux users will need version 4 or higher of either `firefox`, `iceweasel` or `xulrunner` https://developer.mozilla.org/en-US/docs/Mozilla/Projects/XULRunner,
or version 25 or higher of `palemoon` https://www.palemoon.org/.
The Windows installer and OS X archive have everything you need built in. Windows, Linux, OS X and Firefox packages are available for download from the Releases Page https://github.com/prikhi/pencil/releases.
You can also install the Firefox Add-on from the Mozilla Add-on Repository https://addons.mozilla.org/en-US/firefox/addon/pencil-prototyping/.
To install the OS X package, unzip the archive and copy the `Pencil.app` folder t

ACHE Crawler alternatives

  • Scrapy

  • Scrapy is an open source and collaborative framework for extracting the data you need from websites in a fast, simple, yet extensible way.

    tags: framework data-mining web-scraping
  • Heritrix

  • Heritrix is the Internet Archive's open-source, extensible, web-scale, archival-quality web crawler project.

    tags: web-crawler web-crawling web-data-crawling
  • Mixnode

  • Mixnode is a fast, flexible and massively scalable web crawler in the cloud. Using Mixnode eliminates the need for upfront investment in infrastructure, hardware, software and labour that would be required if you built or ran your own web crawler.

    tags: crawling web-crawler web-crawling web-scraper web-scraping
  • Apache Nutch

  • Apache Nutch is an open source, extensible and scalable web crawler.

    tags: web-crawler web-crawling web-scraper
  • StormCrawler

  • StormCrawler is an open source SDK for building distributed web crawlers with Apache Storm. The project is released under the Apache License v2 and consists of a collection of reusable resources and components, written mostly in Java.

    tags: web-crawler