shine-jobs-scraping

Data Science Jobs Scraper

Overview

This script automates the process of scraping job listings from Shine.com for data science positions. It navigates through the job listings, extracting detailed information including job title, company name, experience required, job type, and job description.
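The extracted fields can be pictured as one record per listing, written out as CSV. The following is a minimal sketch, not the script's actual code: the `JobListing` class, its field names, the `write_page_csv` helper, and the output file name are all hypothetical and chosen for illustration.

```python
import csv
from dataclasses import asdict, dataclass, fields
from pathlib import Path


@dataclass
class JobListing:
    """One scraped listing; field names are illustrative, not taken from the script."""
    title: str
    company: str
    experience: str
    job_type: str
    description: str


def write_page_csv(listings, page_number, out_dir="."):
    """Write one results page to its own CSV file and return the file path."""
    # Hypothetical naming scheme: one file per results page.
    path = Path(out_dir) / f"shine_output_page_{page_number}.csv"
    with path.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=[f.name for f in fields(JobListing)])
        writer.writeheader()
        for listing in listings:
            writer.writerow(asdict(listing))
    return path
```

Writing one file per page means a failure on a later page does not discard the pages that were already saved.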

Features

Targets recent job listings (posted within the last 7 days).
Handles pagination up to a configurable number of pages (MaxPages).
Uses Selenium to extract dynamically loaded content.
Logs operations and errors in detail.
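The pagination and logging behaviour listed above could look roughly like the sketch below. It assumes a hypothetical `fetch_page` callable that takes a page number and returns a list of listings; in the real script that role would presumably be played by the Selenium-driven extraction. The logger name and the stop-on-empty-page rule are assumptions as well.

```python
import logging

logger = logging.getLogger("shine_scraper")  # hypothetical logger name


def scrape_pages(fetch_page, max_pages):
    """Collect listings page by page, stopping at max_pages or the first empty page.

    fetch_page is a hypothetical callable: page number -> list of listings.
    """
    results = []
    for page_number in range(1, max_pages + 1):
        try:
            listings = fetch_page(page_number)
        except Exception:
            # Record the failure and move on so one bad page does not end the run.
            logger.exception("Failed to scrape page %d", page_number)
            continue
        if not listings:
            logger.info("Page %d returned no listings; stopping", page_number)
            break
        logger.info("Page %d: %d listings", page_number, len(listings))
        results.extend(listings)
    return results
```

Passing the fetch step in as a parameter keeps the loop testable without a browser, since a plain function can stand in for the WebDriver.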

Requirements

To install the required packages, run the following command:

    pip install -r requirements.txt


Configuration

Edit the config.ini file to specify the maximum number of pages (MaxPages) and the path to your Selenium WebDriver (DriverPath).

Example:

    [DEFAULT]
    MaxPages = 10
    DriverPath = path/to/your/chromedriver.exe
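For reference, these two settings can be read with Python's standard `configparser` module. This is a minimal sketch under the assumption that the script reads the `[DEFAULT]` section this way; the `load_settings` function name is hypothetical.

```python
import configparser


def load_settings(config_text):
    """Parse config.ini-style text and return (max_pages, driver_path)."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    defaults = parser["DEFAULT"]
    # getint raises ValueError if MaxPages is not an integer.
    max_pages = defaults.getint("MaxPages")
    driver_path = defaults.get("DriverPath")
    return max_pages, driver_path
```

To load the file itself, pass its contents in, for example `load_settings(open("config.ini", encoding="utf-8").read())`. Option names are matched case-insensitively, so `maxpages` and `MaxPages` refer to the same setting.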


Running the Script

To run the script, use the following command:

    python scrape_script.py

Ensure that Python and all required packages are installed, and that you are in the directory containing the script.

Logging

Errors and information are logged to BharadwajKamepalli_Errors.log, which includes details about URL accesses, data extraction issues, and other runtime events.
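File logging of this kind can be set up with the standard `logging` module. The sketch below is an assumption about the general approach, not the script's actual configuration; the `configure_logging` function, the logger name, and the message format are hypothetical.

```python
import logging


def configure_logging(log_path):
    """Send INFO-and-above messages from a named logger to a log file."""
    logger = logging.getLogger("shine_scraper")  # hypothetical logger name
    logger.setLevel(logging.INFO)
    handler = logging.FileHandler(log_path, encoding="utf-8")
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s %(message)s")
    )
    # Calling this function more than once would attach duplicate handlers.
    logger.addHandler(handler)
    return logger
```

With this in place, `logger.info(...)` can record each URL access and `logger.error(...)` can record extraction failures, each with a timestamp and level.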

Limitations

The script depends on the structure of the website, so changes to the site's page layout may require updates to the script.
It is designed for websites without advanced anti-bot protections.

Author

Bharadwaj Kamepalli
