Homework assignment for the Data Source API Analyst role.
- /Content: API documentation, Python code, and troubleshooting guides
  - api_documentation.md: Detailed documentation of the GitHub API endpoints used
  - github_api_client.py: Python implementation of the GitHub API client using requests
  - troubleshooting_guide.md: Guides for troubleshooting and data-cleaning approach
- /Postman_Collection: Contains the Google Colab notebook
  - github_api_test.ipynb: Notebook with API testing implementation
This solution uses the GitHub API to search public repositories, retrieve commit information, and fetch repository contents.
The requests, pandas, and logging libraries are used to interact with the API. Key features are:
- Rate limit handling
- Error handling
- Pagination support using Link headers
- Data extraction and cleaning
- Results are stored in JSON
- Logging system
- Results display
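
As an illustration of the rate limit handling and Link-header pagination listed above, here is a minimal sketch. It is not the code from github_api_client.py: the function names (`parse_next_link`, `seconds_until_reset`, `fetch_all_pages`) and the `max_requests` cap are hypothetical, and it assumes the endpoint returns a JSON list (as the commits endpoint does). It relies on GitHub's `Link`, `X-RateLimit-Remaining`, and `X-RateLimit-Reset` response headers, and takes the session as a parameter so it can be tried without network access.

```python
import re
import time


def parse_next_link(link_header):
    """Return the URL marked rel="next" in a Link header, or None."""
    if not link_header:
        return None
    # Each entry looks like: <https://api.github.com/...?page=2>; rel="next"
    for match in re.finditer(r'<([^>]+)>\s*;\s*rel="([^"]+)"', link_header):
        url, rel = match.groups()
        if "next" in rel.split():
            return url
    return None


def seconds_until_reset(headers, now=None):
    """Return how long to wait if the rate limit is exhausted, else 0.0."""
    remaining = int(headers.get("X-RateLimit-Remaining", 1))
    if remaining > 0:
        return 0.0
    # X-RateLimit-Reset is the reset time in UTC epoch seconds.
    reset_at = float(headers.get("X-RateLimit-Reset", 0))
    now = time.time() if now is None else now
    return max(0.0, reset_at - now)


def fetch_all_pages(session, url, params=None, max_requests=20, sleep=time.sleep):
    """Collect items from every page, following rel="next" links.

    `session` is any object with a requests-style `get(url, params=...)`,
    for example `requests.Session()`. `max_requests` is an assumed safety
    cap on the total number of HTTP calls, including retries.
    """
    results = []
    for _ in range(max_requests):
        if not url:
            break
        response = session.get(url, params=params)
        wait = seconds_until_reset(response.headers)
        if response.status_code in (403, 429) and wait > 0:
            # Rate limited: wait for the reset, then retry the same URL.
            sleep(wait)
            continue
        response.raise_for_status()
        results.extend(response.json())
        url = parse_next_link(response.headers.get("Link"))
        params = None  # the "next" URL already carries the query string
    return results
```

With requests installed, a call might look like `fetch_all_pages(requests.Session(), "https://api.github.com/repos/OWNER/REPO/commits", params={"per_page": 100})`, where OWNER and REPO are placeholders. requests also exposes the parsed header as `response.links`, which could replace `parse_next_link`.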
Many points of the assignment seemed ambiguous, and it took extra time to work out what specific outcome was expected.