KTB Monthly Scraper

This project extracts monthly tourism statistics for Türkiye and Istanbul from a PDF available on a specified webpage.
The extracted data is structured and ready for further analysis or storage in a database.

Features

- Web scraping: Automatically fetches the latest PDF from the specified website.
- PDF processing: Extracts relevant data (date, location, and visitor count) from the PDF.
- Data structuring: Outputs the extracted information as a structured pandas.DataFrame.
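The web scraping step can be illustrated with a minimal sketch that collects PDF links from a page using only the standard library. This is not the project's actual code: the names PdfLinkParser and find_pdf_links, the example.com URLs, and the sample HTML are all hypothetical, and how the project decides which PDF is the "latest" is not shown here.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class PdfLinkParser(HTMLParser):
    """Collect the absolute URL of every <a href> that points to a .pdf file."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.pdf_links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        # Ignore any query string and compare the extension case-insensitively.
        if href and href.split("?")[0].lower().endswith(".pdf"):
            self.pdf_links.append(urljoin(self.base_url, href))


def find_pdf_links(html, base_url):
    """Return all PDF links found in the HTML, in document order."""
    parser = PdfLinkParser(base_url)
    parser.feed(html)
    return parser.pdf_links


# Hypothetical page fragment, used only to exercise the parser.
SAMPLE_HTML = """
<ul>
  <li><a href="/files/report-a.pdf">Report A</a></li>
  <li><a href="/about.html">About</a></li>
  <li><a href="https://example.com/files/report-b.PDF">Report B</a></li>
</ul>
"""

links = find_pdf_links(SAMPLE_HTML, "https://example.com/stats/")
print(links)
```

In a real run the HTML would come from the target webpage rather than a string literal, and the chosen link would then be downloaded and passed to the PDF processing step.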

Installation

Install the dependencies listed in requirements.txt:

pip install -r requirements.txt

Usage

Run the script: Call the main_check() function to process the latest available data.
Output: The script outputs a structured DataFrame with the following columns:
- tarih (Date)
- ist_tr (Location: "Türkiye" or "İstanbul")
- ziyaretci_sayisi (Visitor Count)
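The column layout above can be checked with a small validation helper. The sketch below is only an illustration: validate_rows is a hypothetical name, the sample values are placeholders rather than real statistics, and it assumes the visitor count is a non-negative integer. The format of tarih is not specified here, so it is not checked.

```python
from numbers import Integral

EXPECTED_COLUMNS = ("tarih", "ist_tr", "ziyaretci_sayisi")
ALLOWED_LOCATIONS = {"Türkiye", "İstanbul"}


def validate_rows(rows):
    """Return a list of problems; an empty list means every row looks valid."""
    problems = []
    for i, row in enumerate(rows):
        missing = [col for col in EXPECTED_COLUMNS if col not in row]
        if missing:
            problems.append(f"row {i}: missing columns {missing}")
            continue
        if row["ist_tr"] not in ALLOWED_LOCATIONS:
            problems.append(f"row {i}: unexpected location {row['ist_tr']!r}")
        count = row["ziyaretci_sayisi"]
        # Assumption: visitor counts are non-negative integers.
        if not isinstance(count, Integral) or count < 0:
            problems.append(f"row {i}: invalid visitor count {count!r}")
    return problems


# Placeholder rows (not real data) showing one valid and two invalid records.
sample_rows = [
    {"tarih": "2024-01", "ist_tr": "Türkiye", "ziyaretci_sayisi": 100},
    {"tarih": "2024-01", "ist_tr": "Ankara", "ziyaretci_sayisi": 50},
    {"tarih": "2024-01", "ist_tr": "İstanbul"},
]

for problem in validate_rows(sample_rows):
    print(problem)
```

With a pandas DataFrame named df, the same check could be run as validate_rows(df.to_dict("records")).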

If you need to install a Python library that isn't available in Homebrew, use a virtual environment. A virtual environment lets you manage Python packages independently of the system-wide Python installation. To create and activate one:

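python3.13 -m venv venv
# The virtual environment is created at /Users/user_name/your_folder/ktb-monthly-scraper/venv

source /Users/user_name/your_folder/ktb-monthly-scraper/venv/bin/activate

# Install the library inside the activated environment (replace xyz with the package name)
python3 -m pip install xyz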
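To confirm that the virtual environment is active before installing packages, a quick check from Python can help. This is a small sketch, not part of the project; in_virtualenv is a hypothetical helper name. It relies on the fact that inside a venv sys.prefix points at the environment while sys.base_prefix points at the base installation.

```python
import sys


def in_virtualenv():
    """Return True when the interpreter is running inside a venv."""
    # In a venv, sys.prefix is the environment directory and
    # sys.base_prefix is the Python installation it was created from.
    return sys.prefix != sys.base_prefix


print("virtual environment active:", in_virtualenv())
print("interpreter path:", sys.executable)
```

If the first line prints False, the activate script has probably not been sourced in the current shell.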

To see which process, if any, is using port 5432 (the default PostgreSQL port):

lsof -i :5432
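A similar check can be done from Python when lsof is not available. The sketch below is an assumption-laden alternative, not part of the project: port_in_use is a hypothetical helper, it only tests whether something accepts TCP connections on the local machine, and unlike lsof it does not identify the owning process.

```python
import socket


def port_in_use(port, host="127.0.0.1", timeout=0.5):
    """Return True if a TCP connection to host:port succeeds."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success and an error code otherwise.
        return sock.connect_ex((host, port)) == 0


print("port 5432 in use:", port_in_use(5432))
```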
