A Python scraper that extracts cell phone information, such as brand, model, and price, from tudocelular.com and saves it to JSON files.
- Clone the repository to your local machine:
git clone https://github.com/Ruy-Araujo/
- Install the dependencies:
cd tudo_celular
pip install -r requirements.txt
- Run the scraper:
scrapy crawl tudo_celular -o tudo_celular.json
The scraper extracts data for the cell phones listed on tudocelular.com and saves it to a JSON file in the project directory.
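Once the crawl finishes, the exported file can be inspected from Python. The following is a minimal sketch, assuming the file is a Scrapy JSON feed export (a single JSON array of items) and that each item has a `brand` key; the helper names `load_phones` and `count_by_brand` are hypothetical and not part of this project.

```python
import json
from collections import Counter
from pathlib import Path


def load_phones(path):
    """Load the list of scraped items from a JSON feed export."""
    with Path(path).open(encoding="utf-8") as f:
        return json.load(f)


def count_by_brand(phones):
    """Count items per brand.

    The "brand" key is an assumed field name; items without it
    are grouped under "unknown".
    """
    return Counter(phone.get("brand", "unknown") for phone in phones)
```

Calling `count_by_brand(load_phones("tudo_celular.json")).most_common(5)` would list the five brands with the most scraped items. Note that `-o` appends to an existing file, and appending to a JSON file produces invalid JSON, so delete any previous `tudo_celular.json` before running the crawl again.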
The scraper uses the Scrapy framework to parse the HTML of each phone's technical specification and description pages, extracting fields such as brand, model, and release year.
The raw data is available here
If you would like to contribute to this project, feel free to open an issue or submit a pull request.