This project is an ETL (Extract, Transform, Load) exercise to demonstrate data modeling and API development skills. It involves processing event data, defining a data model, and creating APIs.
- Data Cleansing and Transformation:
  - Cleanses venue event data from a CSV file.
  - Splits event categories by semicolon (;).
  - Formats time ranges for venue operation hours.
- API Endpoints:
  - Query venues by event category.
  - Query venues by day of the week.
  - Query venues with events happening at the current time.
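The two transformation steps listed above (splitting categories and formatting time ranges) could be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function names `split_categories` and `format_time_range` are hypothetical, and the input time format ("9:00 AM - 5:30 PM") is an assumption, since the CSV layout is not described here.

```python
from datetime import datetime


def split_categories(raw: str) -> list[str]:
    """Split a semicolon-delimited category string into clean names."""
    # Strip whitespace and skip empty fragments such as those in "Music;;Art;".
    return [part.strip() for part in raw.split(";") if part.strip()]


def format_time_range(raw: str) -> tuple[str, str]:
    """Convert an assumed 12-hour range like "9:00 AM - 5:30 PM" to 24-hour strings."""
    # Split once on the dash separating the start and end times.
    start_text, end_text = (piece.strip() for piece in raw.split("-", 1))
    start = datetime.strptime(start_text, "%I:%M %p")
    end = datetime.strptime(end_text, "%I:%M %p")
    return start.strftime("%H:%M"), end.strftime("%H:%M")
```

If the real CSV uses a different time format, only the `strptime` pattern would need to change.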
Ensure you have the following installed on your machine:
- Python 3.10+
- Docker and Docker Compose
- Clone the Repository:

      git clone <repository-url>
      cd etl_assessment
- Set Up the Environment: Create a .env file in the root directory with the following variables:

      POSTGRES_DB=your_db_name
      POSTGRES_USER=your_db_user
      POSTGRES_PASSWORD=your_db_password
- Docker Compose:
  - The project includes a Dockerfile and docker-compose.yml for containerizing the FastAPI app and the Postgres database.
  - To build and run the services, use the following command:

        docker-compose up --build
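As an illustration of how the three .env variables might be consumed, the sketch below assembles them into a Postgres connection URL. The helper name `build_database_url` is hypothetical, and the default host `db` and port `5432` are assumptions; the actual service name and port depend on what docker-compose.yml defines.

```python
import os
from urllib.parse import quote


def build_database_url(env=None, host="db", port=5432):
    """Assemble a Postgres URL from POSTGRES_DB, POSTGRES_USER and POSTGRES_PASSWORD."""
    # Fall back to the process environment when no mapping is passed in.
    env = os.environ if env is None else env
    # Percent-encode credentials so characters like "@" or ":" do not break the URL.
    user = quote(env["POSTGRES_USER"], safe="")
    password = quote(env["POSTGRES_PASSWORD"], safe="")
    return f"postgresql://{user}:{password}@{host}:{port}/{env['POSTGRES_DB']}"
```

A missing variable raises `KeyError`, which surfaces a misconfigured .env file early instead of producing a malformed URL.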
- Query Businesses by Category
  - URL: /businesses/category/{category_name}
  - Method: GET
  - Description: Returns businesses hosting events in the specified category.
  - Example: curl http://localhost:8000/businesses/category/Music
- Query Businesses by Day of the Week
  - URL: /businesses/day/{day_of_week}
  - Method: GET
  - Description: Returns businesses that have events scheduled on the specified day.
  - Example: curl http://localhost:8000/businesses/day/Tuesday
- Query Businesses Hosting Events Now
  - URL: /businesses/open-now
  - Method: GET
  - Description: Returns businesses hosting events at the current time.
  - Example: curl http://localhost:8000/businesses/open-now
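The same three endpoints can also be called from Python. The sketch below uses only the standard library; the helper names `endpoint_url` and `fetch` are hypothetical, the base URL is taken from the cURL examples, and the assumption that responses are JSON follows from FastAPI's default behavior rather than anything stated here.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

BASE_URL = "http://localhost:8000"  # host and port as used in the cURL examples


def endpoint_url(kind: str, value: str = "") -> str:
    """Build the URL for one of the three documented endpoints."""
    if kind == "open-now":
        return f"{BASE_URL}/businesses/open-now"
    if kind in ("category", "day"):
        # Percent-encode the path parameter so values with spaces stay valid.
        return f"{BASE_URL}/businesses/{kind}/{quote(value, safe='')}"
    raise ValueError(f"unknown endpoint kind: {kind!r}")


def fetch(kind: str, value: str = ""):
    """Send a GET request and decode the body, assuming it is JSON."""
    with urlopen(endpoint_url(kind, value), timeout=10) as response:
        return json.load(response)
```

With the services running, `fetch("category", "Music")` would issue the same request as the first cURL example.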
- Business Model:
  - id: Primary Key
  - timezone: Timezone of the business
  - rating: Business rating
  - max_rating: Maximum possible rating
  - review_count: Total reviews
  - Relationships:
    - categories: Relationship to Category
    - hours: Relationship to BusinessHours
- Category Model:
  - id: Primary Key
  - category: Category of the event
  - business: ForeignKey to Business
- BusinessHours Model:
  - id: Primary Key
  - business: ForeignKey to Business
  - day: Day of the week (e.g., "Monday")
  - shift1_start, shift1_end, shift2_start, shift2_end: Operation hours of the business
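To show how the BusinessHours fields could support the "open now" query, here is a plain-Python sketch that mirrors the model with a dataclass. It is not the project's ORM code: the `is_open` helper is hypothetical, the shift columns are assumed to hold `datetime.time` values, and `local_now` is assumed to be already converted to the business's `timezone`. Shifts that run past midnight are not handled.

```python
from dataclasses import dataclass
from datetime import datetime, time
from typing import Optional

DAY_NAMES = ("Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday")


@dataclass
class BusinessHours:
    day: str                              # e.g. "Monday"
    shift1_start: Optional[time] = None
    shift1_end: Optional[time] = None
    shift2_start: Optional[time] = None   # second shift is optional
    shift2_end: Optional[time] = None


def is_open(hours: list[BusinessHours], local_now: datetime) -> bool:
    """Return True if local_now falls inside any shift recorded for that weekday."""
    today = DAY_NAMES[local_now.weekday()]  # weekday() returns 0 for Monday
    current = local_now.time()
    for row in hours:
        if row.day != today:
            continue
        shifts = ((row.shift1_start, row.shift1_end),
                  (row.shift2_start, row.shift2_end))
        for start, end in shifts:
            # Skip shifts with missing bounds; the end time is treated as exclusive.
            if start is not None and end is not None and start <= current < end:
                return True
    return False
```

Keeping two explicit shift pairs per row matches the model above and covers a midday break without needing a separate shifts table.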
Once the services are up and running, you can test the API using the following methods:
- Using cURL: Examples are provided in the API section for each endpoint.
- Swagger UI: FastAPI automatically generates interactive API documentation. Navigate to the following URL in your browser to test endpoints: http://localhost:8000/docs
- Postman: You can also import the endpoints into Postman and test the API.
- Implement error handling for more edge cases (e.g., CSV format changes).
- Expand API capabilities to include filtering by multiple categories and flexible time ranges.
- Add tests.