data-engineering-interview

2023 Data Engineering Interview

Task: Retrieve and process data from the RandomUser API

Instructions:

  1. You are required to write a Python script that retrieves data from the RandomUser API.
  2. The RandomUser API (https://randomuser.me/) provides a RESTful interface for generating random user data.
  3. The API documentation can be found at https://randomuser.me/documentation.
  4. Your script should retrieve the data, process it, and store it in a local file in a suitable format.
  5. Your script should include appropriate error checking and exception handling.
  6. Your script should be well-structured, modular, and include appropriate comments.
  7. Use any Python libraries or frameworks you deem necessary to accomplish the task.
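
As a rough illustration of instructions 1 and 5, the sketch below retrieves one batch of users and converts network and parsing failures into a single exception type. It is only one possible approach, not a reference solution. The endpoint `https://randomuser.me/api/` and the `results`, `page`, and `seed` query parameters are taken from the public API documentation as best understood here and should be verified against it. `fetch_users`, `ApiError`, and the injectable `opener` argument are hypothetical names introduced for this example.

```python
import json
import urllib.parse
import urllib.request

# Assumed endpoint; confirm against https://randomuser.me/documentation.
API_URL = "https://randomuser.me/api/"


class ApiError(RuntimeError):
    """Hypothetical exception type for any failed or malformed API response."""


def fetch_users(results=10, page=1, seed="interview",
                opener=urllib.request.urlopen, timeout=10):
    """Return the list of raw user records for a single page.

    `opener` is injectable so the function can be exercised without
    touching the network (for example, in unit tests).
    """
    query = urllib.parse.urlencode(
        {"results": results, "page": page, "seed": seed}
    )
    url = f"{API_URL}?{query}"
    try:
        with opener(url, timeout=timeout) as response:
            payload = json.load(response)
    except OSError as exc:
        # URLError, HTTPError, and socket timeouts all derive from OSError.
        raise ApiError(f"request to {url} failed: {exc}") from exc
    except ValueError as exc:
        # Covers json.JSONDecodeError and undecodable bytes.
        raise ApiError(f"response from {url} was not valid JSON") from exc

    if not isinstance(payload, dict):
        raise ApiError("unexpected response shape")
    if "error" in payload:
        # Assumption: the API reports failures in an "error" field.
        raise ApiError(str(payload["error"]))
    return payload.get("results", [])
```

Passing a fixed `seed` is a design choice rather than a requirement: it should make repeated runs return the same users, which simplifies testing.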

Requirements:

  1. Your script should retrieve a list of random users from the API.
  2. For each user, extract the following information:
    • First name
    • Last name
    • Gender
    • Email address
    • Date of birth
    • Phone number
    • Nationality
  3. Store the extracted information in a local file in a suitable format (e.g., CSV, JSON, or Parquet).
  4. Your script should handle pagination if the API response is paginated.
  5. Create a separate Markdown file that includes:
    • Overview
    • Setup Details & Instructions
    • Known Limitations or Inefficiencies

Evaluation Criteria:

  1. Ability to interact with a public API and retrieve data.
  2. Correct extraction and processing of the required information.
  3. Proper error handling and exception management.
  4. Suitable storage of the data in a local file.
  5. Code structure, organization, and comments.
  6. Efficient and effective handling of pagination (if applicable).
  7. Overall code quality, readability, and adherence to best practices.
  8. Code designed with extensibility in mind.
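
For the pagination criterion, one way to keep paging logic independent of the HTTP layer is a small generator like the sketch below. It assumes the API does not report a total page count, so the caller supplies an upper bound; that assumption should be verified against the documentation. `iter_users` and the `fetch_page` callable are hypothetical names, where `fetch_page(page, page_size)` is expected to return a list of records for that page.

```python
from typing import Callable, Iterator, List


def iter_users(fetch_page: Callable[[int, int], List[dict]],
               max_pages: int,
               page_size: int = 100) -> Iterator[dict]:
    """Yield records page by page until a page is empty or the cap is hit.

    Pages are 1-indexed. Records are yielded lazily, so a caller can
    stream them to disk without holding every page in memory.
    """
    for page in range(1, max_pages + 1):
        batch = fetch_page(page, page_size)
        if not batch:
            # An empty page is treated as the end of the data.
            break
        yield from batch
```

Because the generator only depends on a callable, it can be tested with a stub and later wired to whatever function performs the real request.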

Submission Guidelines:

  1. Fork this repository.
  2. Create a new branch in your forked repository to work on the assignment.
  3. Commit your code changes to the branch.
  4. Include any necessary instructions or setup details in a separate Markdown file.
  5. Once you have completed the task, submit a pull request from your branch to the main repository.
  6. You have five (5) full business days to complete the assignment (e.g., if you receive it on Monday at 12pm, submit by the following Monday at 12pm).

Some Notes on Time:

  • You're free to use as much time as you deem necessary to work on this within the assignment window. Please note, however, a marquee skill in software engineering is being aware of diminishing marginal returns, i.e. knowing when to stop.
  • Additionally, please note that the amount of time spent working on this does not necessarily correlate with an increased probability of moving on in the interview process. Your work will be judged on the criteria listed above.

About

2023 Data Engineering Interview Chris Fraleigh
