This is a lightweight yet powerful project designed to monitor websites for changes. It leverages Python and Selenium to check for updates and includes a GitHub Actions workflow that automates the process on a defined schedule.
I use this project to track changes on my university portal for grade updates, but it can be adapted to monitor any website where tracking changes is important.
- Monitors a specified website for changes.
- Sends email notifications when changes are detected.
- Includes a GitHub Actions workflow to automate the monitoring process on a schedule.
- Flexible and easy to set up for your own use cases.
- Monitoring a university portal for grade updates.
- Tracking updates to a blog or news page.
- Keeping an eye on price changes for a product.
- Monitoring any webpage for changes in content.
- You have Python installed.
- You have an email address with one of the following providers: Gmail, Outlook, iCloud, or Yahoo. If you use a different provider, follow step 10 to add it.
This project works without modification for all HSBI students who use LSF to receive their marks; in that case, just follow the steps below. Otherwise, you will need to change the part of the script that accesses the website.
- Click the Fork button at the top-right corner of this repository to create your own copy.
Open the terminal in your IDE and run:
git clone https://github.com/your-username/website-change-monitor.git
cd website-change-monitor
This file is regenerated automatically if it is deleted. If it was encrypted with a different key, however, the script returns an error and the file is not regenerated. So just delete the file, but leave content_file.txt.
- Navigate to Settings > Environments in your forked repository.
- Click New environment and name it Test.
Next, generate a secret key to encrypt the file that stores the website content, so that it is not publicly visible on the internet.
- Create and open a venv:
- For Windows (CMD):
python -m venv venv
venv\Scripts\activate
- For Mac/Linux:
python3 -m venv venv
source venv/bin/activate
- Install dnspython to be able to send emails:
pip3 install dnspython
- Generate the encryption secret. To do so, install the cryptography library:
pip3 install cryptography
And paste this code to a separate file and run it:
from cryptography.fernet import Fernet
key = Fernet.generate_key()
print(key.decode())
Alternatively, uncomment the equivalent lines at the end of script.py.
- Copy the generated key from the console; you will need it in step 4.
- Navigate to Settings > Environments > Test > Add environment secret.
- Add the following secrets:
| Secret Name | Description |
| --- | --- |
| EMAIL | Your email address |
| EMAIL_PASSWORD | Password or app-specific password for your email |
| LOGIN_USERNAME | Username for the website to monitor |
| LOGIN_PASSWORD | Password for the website to monitor |
| ENCRYPTION_KEY | Key generated in Step 1 |

- The workflow starts automatically and runs the script every 20 minutes (you can find the runs under Actions). The interval can be changed in .github/workflows/run_script.yml, line 5.
The workflow must be able to update content_file.txt on every run. To allow this, go to Settings > Actions > General, scroll down to 'Workflow permissions', and select 'Read and write permissions'.
- Go to the Actions tab in your repository.
- Select the Run Python Script workflow at the top left and click Run workflow to start the monitoring process immediately. This step is optional: the workflow also starts on its schedule without it.
By default, the recipient address is the same address that sends the email. You can change the recipient in script.py, line 32 (the TO_EMAIL variable).
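To illustrate the sender/recipient relationship described above, here is a minimal sketch that uses only the standard library. It is not the code from script.py: the function names `build_notification` and `send_notification`, the subject line, and the message text are all assumptions made for this example.

```python
import smtplib
from email.message import EmailMessage


def build_notification(sender: str, recipient: str, url: str) -> EmailMessage:
    """Build a plain-text alert for a changed page (hypothetical helper)."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient  # by default this would be the same as the sender
    msg["Subject"] = "Website change detected"
    msg.set_content(f"A change was detected on {url}.")
    return msg


def send_notification(msg: EmailMessage, host: str, port: int, password: str) -> None:
    """Send the message over STARTTLS; host and port depend on your provider."""
    with smtplib.SMTP(host, port) as server:
        server.starttls()
        server.login(msg["From"], password)
        server.send_message(msg)
```

Building the message separately from sending it makes it easy to check the recipient before any email leaves your machine.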
- Create a Virtual Environment: Run
python -m venv venv
to set up a virtual environment.
- Install Dependencies: Install the required packages by running
pip install -r requirements.txt
- Add a .env File: Create a file named .env and add your secrets in the following format:
EMAIL='your-email@example.com'
EMAIL_PASSWORD='password from email address to send emails'
LOGIN_USERNAME='login from the website'
LOGIN_PASSWORD='password from the website'
- Run the Script: Execute the script to test its functionality.
- Encryption is not used when running locally, so content_file.txt contains the tracked content in plain text (it is used for comparison on later runs). Be careful not to push it to a public repository.
- Adapt to Your Needs: Modify the script according to the documentation provided below.
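Before running the script locally, it can help to confirm that all four secrets are actually set. The sketch below is an assumption-laden illustration, not part of script.py: `parse_env_text` and `missing_secrets` are hypothetical helpers, and the parser only handles the simple KEY='value' format shown above (how the script itself loads the file is not covered here).

```python
import os

# Names taken from the .env format above.
REQUIRED_SECRETS = ("EMAIL", "EMAIL_PASSWORD", "LOGIN_USERNAME", "LOGIN_PASSWORD")


def parse_env_text(text):
    """Parse simple KEY='value' lines; blank lines and # comments are skipped."""
    result = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        # Remove surrounding single or double quotes from the value.
        result[key.strip()] = value.strip().strip("'\"")
    return result


def missing_secrets(env=None):
    """Return the required names that are absent or empty in env."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_SECRETS if not env.get(name)]
```

For example, `missing_secrets(parse_env_text(open(".env").read()))` returns an empty list when the file is complete.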
- Add a new email provider to the SMTP_PROVIDERS constant in script.py (line 35), using the same structure as the existing providers.
- In the get_smtp_settings method, make your provider available by adding it to the if-else structure in lines 121-128.
- Feel free to merge and push your changes to this repository!
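To give a rough idea of what adding a provider involves, here is a simplified sketch. It reuses the names SMTP_PROVIDERS and get_smtp_settings from the steps above, but the structure is an assumption: the real script uses an if-else chain, while this example uses a dictionary lookup. The host names and ports are commonly documented values; verify them against your provider's documentation.

```python
# Assumed structure: provider key -> (SMTP host, STARTTLS port).
SMTP_PROVIDERS = {
    "gmail": ("smtp.gmail.com", 587),
    "outlook": ("smtp-mail.outlook.com", 587),
    "icloud": ("smtp.mail.me.com", 587),
    "yahoo": ("smtp.mail.yahoo.com", 587),
    # A new provider would be added here in the same shape, e.g.
    # "example": ("smtp.example.com", 587),  # hypothetical entry
}


def get_smtp_settings(email):
    """Pick SMTP settings from the first label of the address domain."""
    domain = email.rsplit("@", 1)[-1].lower()
    provider = domain.split(".")[0]
    if provider not in SMTP_PROVIDERS:
        raise ValueError(f"Unsupported email provider: {domain}")
    return SMTP_PROVIDERS[provider]
```

With a lookup like this, supporting a new provider only requires one new dictionary entry.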
The main script, script.py, orchestrates the entire process. Here's an outline of its functionality:
- Fetch Website Content
- The script logs into a website and retrieves the required elements.
- For websites requiring authentication, login steps are included. If the site needs no login or preparatory steps, you can remove lines 50–71 in the script.
- Customize the steps based on your needs. You can add more URLs, steps, fields, or anything else necessary to extract the desired data. Remove 'options=options' from line 42 to disable headless mode and watch the steps in a browser window if needed.
- For element selection, the script uses XPath. To find an XPath:
- Open developer tools in your browser (F12).
- Select the desired element.
- Right-click and choose "Copy > XPath".
- Update XPATH_TRACK_AREA to match the parent element containing the information you want to track (e.g., a price field, a feed, or a table with university marks).
- Adjust .github/workflows/run_script.yml for your requirements:
- Modify check intervals (line 5).
- Update the GitHub environment name (line 11).
- Configure environment variables (lines 79-82). If login isn't needed, remove those variables or add new ones if required.
- Keep .env listed in .gitignore to prevent exposing sensitive data.
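The fetch step described above can be sketched roughly as follows. This is an illustration rather than the code in script.py: `fetch_tracked_html` is a hypothetical helper, the XPath value is a made-up example, and the login steps are omitted. The function accepts any Selenium WebDriver instance.

```python
# Hypothetical XPath; replace it with the one copied from your browser.
XPATH_TRACK_AREA = "//table[@id='grades']"


def fetch_tracked_html(driver, url, xpath=XPATH_TRACK_AREA):
    """Open url and return the outerHTML of the tracked parent element."""
    driver.get(url)
    # "xpath" is the string value of selenium's By.XPATH locator strategy.
    element = driver.find_element("xpath", xpath)
    return element.get_attribute("outerHTML")


# Possible usage with Selenium installed (not executed here):
#   from selenium import webdriver
#   options = webdriver.ChromeOptions()
#   options.add_argument("--headless=new")
#   driver = webdriver.Chrome(options=options)
#   html = fetch_tracked_html(driver, "https://example.com")
#   driver.quit()
```

Passing the driver in as a parameter keeps the function easy to try out with or without headless mode.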
- Compare Content
- The script compares the current website content with the previous content stored locally.
- Save Content
- Any changes are saved to content_file.txt. The format is the outerHTML of the parent element (XPATH_TRACK_AREA).
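The compare and save steps above could look roughly like the sketch below. It is a simplified assumption of the logic, not the script's actual code: `update_if_changed` is a hypothetical name, encryption is left out, and the first run (when no file exists yet) is treated as "no change" so that no notification is triggered.

```python
from pathlib import Path


def update_if_changed(new_html, path="content_file.txt"):
    """Store new_html and return True only if it differs from a previous run."""
    file = Path(path)
    old_html = file.read_text(encoding="utf-8") if file.exists() else None
    if old_html == new_html:
        return False  # nothing changed, leave the file untouched
    file.write_text(new_html, encoding="utf-8")
    # On the first run there is nothing to compare against, so report no change.
    return old_html is not None
```

A return value of True is the point at which an email notification would be sent.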
Contributions are welcome! If you have ideas for improvements, feel free to:
- Fork the repository.
- Create a new branch for your changes.
- Submit a pull request.
This project is open-source and available under the MIT License.
Feel free to adapt this project to your own needs and use it for tracking changes on your favorite websites! If you find it helpful or have suggestions, I'd love to hear from you. 😊