From 4eee66fa44fc3e55d2581ade1523690ed108aff9 Mon Sep 17 00:00:00 2001
From: Dazhong Xia
Date: Thu, 24 Oct 2024 15:49:57 -0400
Subject: [PATCH] Add questions/objectives/keypoints + first challenge

---
 config.yaml                                   |  6 +--
 .../downloading-files-programmatically.md     | 45 +++++++++++++++++++
 2 files changed, 48 insertions(+), 3 deletions(-)
 create mode 100644 episodes/downloading-files-programmatically.md

diff --git a/config.yaml b/config.yaml
index c376faf..e848107 100644
--- a/config.yaml
+++ b/config.yaml
@@ -50,9 +50,9 @@ contact: 'hello@catalyst.coop'
 #
 # Example -------------
 #
-# episodes:
-# - introduction.md
-# - first-steps.md
+episodes:
+- introduction.md
+- downloading-files-programmatically.md
 #
 # learners:
 # - setup.md
diff --git a/episodes/downloading-files-programmatically.md b/episodes/downloading-files-programmatically.md
new file mode 100644
index 0000000..e3fc311
--- /dev/null
+++ b/episodes/downloading-files-programmatically.md
@@ -0,0 +1,45 @@
+---
+title: Downloading files programmatically
+teaching: 0
+exercises: 0
+---
+
+:::::::: questions
+
+* How can I get data when there isn't a 'download' link?
+* How can I work with data that is behind an API with access restrictions?
+
+::::::::
+
+:::::::: objectives
+
+* request data from a REST API using `requests`
+* authenticate to a REST API using HTTP basic auth, bearer tokens, etc.
+* explore directories in AWS cloud buckets
+* download data from AWS using pandas
+
+::::::::
+
+:::::::: challenge
+
+Make a `GET` request to `https://www.catalyst.coop` using `requests`!
+
+:::: solution
+
+```python
+import requests
+
+requests.get("https://www.catalyst.coop")
+```
+
+::::
+
+::::::::
+
+:::::::: keypoints
+
+* `pandas.read_*` can read tabular data from remote servers & cloud storage as if it was on your local computer
+* `requests` can get data from APIs, though you'll have to do the translation from their response format into `pandas.DataFrame` yourself
+* To get access to access-restricted APIs, you will usually pass in an API key, as a request *header* or as a request *parameter*. Both `requests` and `pandas.read_*` have the ability to do this.
+
+::::::::
\ No newline at end of file
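
Note on the last keypoint: the header-versus-parameter distinction may be easier to see in code. The following is a minimal sketch, not part of the patch. The endpoint URL, the key value, the `api_key` parameter name, and both helper functions are hypothetical; a real API documents its own header or parameter name. The requests are only prepared, not sent, so the sketch runs without network access.

```python
import requests

# Hypothetical endpoint and key, used only for illustration.
API_URL = "https://api.example.com/v1/data"
API_KEY = "my-secret-key"


def prepare_with_header(url: str, key: str) -> requests.PreparedRequest:
    """Build a GET request that carries the key as a bearer token header."""
    return requests.Request(
        "GET", url, headers={"Authorization": f"Bearer {key}"}
    ).prepare()


def prepare_with_param(url: str, key: str) -> requests.PreparedRequest:
    """Build a GET request that carries the key as a URL query parameter."""
    # "api_key" is an assumed parameter name; check the API's own docs.
    return requests.Request("GET", url, params={"api_key": key}).prepare()


header_request = prepare_with_header(API_URL, API_KEY)
param_request = prepare_with_param(API_URL, API_KEY)

# The key travels in the headers in one case and in the URL in the other.
print(header_request.headers["Authorization"])  # Bearer my-secret-key
print(param_request.url)  # https://api.example.com/v1/data?api_key=my-secret-key
```

In everyday use the same `headers=` and `params=` keyword arguments can be passed straight to `requests.get`. A key sent as a query parameter becomes part of the URL, so it is more likely to end up in logs than one sent in a header.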