
🕸️ Scrapers
8 repositories
Scrapy, a fast high-level web crawling & scraping framework for Python.
Apache Nutch is an extensible and scalable web crawler.
Heritrix is the Internet Archive's open-source, extensible, web-scale, archival-quality web crawler project.
Crawlee—A web scraping and browser automation library for Node.js to build reliable crawlers. In JavaScript and TypeScript. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, an…
Gospider - Fast web spider written in Go