Code for scraping communities and other interesting projects from public websites.
- data: extracted datasets
- project_scraper: Scrapy project for data harvesting
- Make sure you have Git and Git LFS installed.
- Clone this repo:
git clone https://github.com/dsrp/project-scraper.git
- Make sure you have pipenv available:
pip install pipenv
- Change into the project directory and install the dependencies locally:
pipenv install
- Activate the virtual environment and run one of the spiders (fic or gen):
pipenv shell
scrapy crawl <fic|gen>
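
The steps above can also be scripted. The following is a minimal sketch, not part of this repository: the helper names `missing_tools`, `crawl_command`, and `run_spider` are hypothetical, and the executable names `git`, `git-lfs`, and `pipenv` are assumed to be how those tools appear on your PATH. It uses `pipenv run` in place of an interactive `pipenv shell`, and it must be run from the project directory.

```python
import shutil
import subprocess

# Tools the setup steps rely on; the executable names are assumptions.
REQUIRED_TOOLS = ("git", "git-lfs", "pipenv")
# Spider names taken from `scrapy crawl <fic|gen>`.
SPIDERS = ("fic", "gen")


def missing_tools(tools=REQUIRED_TOOLS):
    """Return the tools from `tools` that are not found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


def crawl_command(spider):
    """Build the command that runs one spider in the pipenv environment."""
    if spider not in SPIDERS:
        raise ValueError(f"unknown spider {spider!r}; expected one of {SPIDERS}")
    # `pipenv run` executes a command inside the project's virtualenv
    # without opening an interactive `pipenv shell` first.
    return ["pipenv", "run", "scrapy", "crawl", spider]


def run_spider(spider):
    """Check prerequisites, then run the spider from the current directory."""
    missing = missing_tools()
    if missing:
        raise RuntimeError("missing tools: " + ", ".join(missing))
    subprocess.run(crawl_command(spider), check=True)
```

Calling `run_spider("fic")` or `run_spider("gen")` then performs the same crawl as the last command above, after confirming that the prerequisites are installed.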