Web scraper for https://trademe.co.nz/property
- Python v3.9.13
- Flask
- Docker
- Helm chart & Kubernetes
To get a local copy up and running, follow these simple example steps.
- Python v3.9.13
- Docker (optional)
- Helm chart and Kubernetes (optional)
# create a virtual environment
# assumes "python3 --version" reports "Python 3.9.13" in the current terminal session
python3 -m venv ./venv
# activate virtual environment
# for macos or linux
source ./venv/bin/activate
# for windows
.\venv\Scripts\activate
# upgrade pip
python -m pip install --upgrade pip
# install python dependencies
pip install -r requirements.txt
# lint python code
pylint ./search
- Run as cli:
# argument 1 = city
# argument 2 = total number of pages
# argument 3 = true or false (default) to do search 'with detail' or 'without detail'
# argument 4 = output file (default = result.psv)
# argument 5 = true or false (default) to enable debug mode logging
# e.g. python -m search.main <city> <total pages> <true or false> <file.psv> <true or false>
python -m search.main auckland 1 false result.psv
- Or, run as web api server:
FLASK_ENV=development python3 -m search.app
- Or, build and run in docker container:
# build docker image
docker build -t webscrappe-trademe-co-nz-property:mvp .
# run docker container
docker run -d -p 8080:8080 --name webscrappe webscrappe-trademe-co-nz-property:mvp
# stop and remove docker container
docker stop webscrappe
docker rm webscrappe
- Or, run in kubernetes cluster:
# upgrade the helm chart, or install it if not present
cd .deploy/helm
helm upgrade -i webscrappe-trademe-co-nz-property webscrappe-trademe-co-nz-property \
-n webscrappe --create-namespace
# stop and remove helm chart and namespace
helm uninstall webscrappe-trademe-co-nz-property -n webscrappe
kubectl delete namespace webscrappe
When running as a command line interface (cli):
# argument 1 = city
# argument 2 = total number of pages
# argument 3 = true or false (default) to do search 'with detail' or 'without detail'
# argument 4 = output file (default = result.psv)
# argument 5 = true or false (default) to enable debug mode logging
# e.g. python -m search.main <city> <total pages> <true or false> <file.psv> <true or false>
python -m search.main auckland 1 false result.psv
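The same command line can also be assembled from Python, which may help when scripting several runs. The sketch below only builds and launches the documented `python -m search.main` call. The helper names `build_cli_command` and `run_scraper` are hypothetical, the check that the page count is at least 1 is an assumption, and `sys.executable` is used in place of `python` so that the active virtual environment is picked up.

```python
import subprocess
import sys


def build_cli_command(city, total_pages, with_detail=False,
                      output_file="result.psv", debug=False):
    """Return the argument list for `python -m search.main ...`.

    Defaults mirror the documented ones: without detail, result.psv, no debug.
    """
    if total_pages < 1:
        # Assumption: a page count below 1 is not meaningful for the scraper.
        raise ValueError("total_pages must be at least 1")
    return [
        sys.executable, "-m", "search.main",
        city,                                # argument 1 = city
        str(total_pages),                    # argument 2 = total number of pages
        "true" if with_detail else "false",  # argument 3 = with or without detail
        output_file,                         # argument 4 = output file
        "true" if debug else "false",        # argument 5 = debug mode logging
    ]


def run_scraper(city, total_pages, **options):
    """Run the scraper as a child process; raises CalledProcessError on failure."""
    command = build_cli_command(city, total_pages, **options)
    return subprocess.run(command, check=True)
```

For example, `run_scraper("auckland", 1)` would launch the same search as the command above, provided it is started from the project root with the virtual environment active.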
NOTE: the output result.psv file is pipe-separated. When importing it, choose a custom Delimiter or Separator type of | in the import CSV wizard.
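Outside a spreadsheet, the file can also be read with Python's standard `csv` module by setting the delimiter to `|`. This is a minimal sketch: the helper name `read_psv` is hypothetical, and UTF-8 encoding is an assumption about how the file is written.

```python
import csv


def read_psv(path):
    """Read a pipe-separated file and return its rows as lists of strings."""
    # newline="" lets the csv module handle line endings itself.
    with open(path, newline="", encoding="utf-8") as handle:
        return list(csv.reader(handle, delimiter="|"))
```

Calling `read_psv("result.psv")` returns one list per line in the file, with every value kept as a string.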
When running as an api, use the following endpoints:
- 'http://{domain name}:{port}/search-without-detail/{city}/{total number of pages}' searching without property detail.
# example
curl http://localhost:8080/search-without-detail/auckland/1
- 'http://{domain name}:{port}/search-with-detail/{city}/{total number of pages}' searching with property detail.
# example
curl http://localhost:8080/search-with-detail/auckland/1
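The two endpoints differ only in their first path segment, so a small client can build either URL from the same inputs. The sketch below uses only the standard library. The function names are hypothetical, the default host and port are taken from the curl examples, and because the response format is not described here the body is returned as raw text (UTF-8 is assumed).

```python
from urllib.parse import quote
from urllib.request import urlopen


def build_search_url(city, total_pages, with_detail=False,
                     host="localhost", port=8080):
    """Build the URL for a search with or without property detail."""
    endpoint = "search-with-detail" if with_detail else "search-without-detail"
    # quote() percent-encodes characters in the city that are unsafe in a path.
    return f"http://{host}:{port}/{endpoint}/{quote(city, safe='')}/{total_pages}"


def fetch_search(city, total_pages, with_detail=False, timeout=60):
    """Call the API and return the response body as text (format not assumed)."""
    url = build_search_url(city, total_pages, with_detail)
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

With the server running locally, `fetch_search("auckland", 1)` requests the same URL as the first curl example.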
For a property listed on a property search page, the fields in the PSV are mapped as follows:
- title
- address
- number_of_bedrooms
- number_of_bathrooms
- number_of_parking_lots
- number_of_living_areas
- floor_area_sqm
- land_area_sqm
- asking_price
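One way to picture a row of the PSV is as a record with the fields listed above, joined by `|`. This is an illustrative sketch, not the project's own data model: the class name `PropertyListing` and both helpers are hypothetical, every field is kept as a string because the real types are not stated here, and whether the actual output file includes a header line is an assumption.

```python
import csv
import io
from dataclasses import astuple, dataclass, fields


@dataclass
class PropertyListing:
    # Field names and order follow the list above; types are an assumption.
    title: str
    address: str
    number_of_bedrooms: str
    number_of_bathrooms: str
    number_of_parking_lots: str
    number_of_living_areas: str
    floor_area_sqm: str
    land_area_sqm: str
    asking_price: str


def header_line():
    """Return the field names joined by the pipe separator."""
    return "|".join(field.name for field in fields(PropertyListing))


def to_psv_line(listing):
    """Serialize one listing as a pipe-separated line, quoting where needed."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="|", lineterminator="\n")
    writer.writerow(astuple(listing))
    return buffer.getvalue().rstrip("\n")
```

Using `csv.writer` rather than a plain `"|".join(...)` means a value that itself contains `|` is quoted instead of silently splitting into two columns.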
A property detail page shows more information about a property, and these additional fields are also mapped into the PSV.
# run unit tests
pytest -v --cov=search
👤 Ankur Soni
Give a ⭐️ if you like this project!
This project is MIT licensed.