Add incomplete files and visualization lesson
rlbarter committed Jan 30, 2024
1 parent dcafb39 commit c0117af
Showing 34 changed files with 4,271 additions and 256 deletions.
Empty file added .Rhistory
Empty file.
8 changes: 8 additions & 0 deletions content/complete/01_variables.ipynb
@@ -13,6 +13,8 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"### Simple computations\n",
"\n",
"We can use Python to do simple computations, like this:"
]
},
@@ -41,6 +43,8 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"### Defining variables/objects\n",
"\n",
"If we want to use the \"output\" of this code, we need to assign it to a variable/object."
]
},
@@ -166,6 +170,8 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"### Overwriting variables\n",
"\n",
"You can overwrite variables by reassigning them:"
]
},
@@ -204,6 +210,8 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"### The `+=` shortcut\n",
"\n",
"There is a shortcut that will let you add a number to a variable *and* update its value: `+=`"
]
},
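The headings added to this notebook cover assignment, overwriting, and the `+=` shortcut. A minimal sketch of the three ideas together follows; the variable name `total` and the numbers are illustrative assumptions, not taken from the notebook's code cells.

```python
# Assign the result of a computation to a variable so it can be reused.
total = 2 + 3
print(total)  # 5

# Reassigning the same name overwrites the previous value.
total = 10
print(total)  # 10

# `+=` adds to the variable and stores the result back in it,
# equivalent to `total = total + 4`.
total += 4
print(total)  # 14
```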
2 changes: 2 additions & 0 deletions content/complete/02_types.ipynb
@@ -65,6 +65,8 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"### The `type()` function\n",
"\n",
"We can check the type of `y` using the `type()` function:"
]
},
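As a short illustration of `type()`, the sketch below checks a few built-in types. The value assigned to `y` here is an assumption, since the notebook's own definition of `y` is not visible in this diff.

```python
# Hypothetical value for `y`; the notebook's actual value is not shown in the diff.
y = 4.5

print(type(y))        # <class 'float'>
print(type(3))        # <class 'int'>
print(type("hello"))  # <class 'str'>
print(type(True))     # <class 'bool'>
```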
6 changes: 6 additions & 0 deletions content/complete/03_type_conversions.ipynb
@@ -62,6 +62,8 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"### Converting to a string using `str()`\n",
"\n",
"The `str()` function will convert whatever value it is given to a string (whose shorthand is `str`). \n",
"\n",
"Below, we convert the integer `4` to a string, assign it to a variable called `a` and then we check the type of `a` (which is `str`):"
@@ -130,6 +132,8 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"### Converting to an integer using `int()`\n",
"\n",
"Converting the float `3.0` to an integer removes the decimal point:"
]
},
@@ -215,6 +219,8 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"### Converting to a boolean using `bool()`\n",
"\n",
"When you convert a number to a boolean using `bool()`, it is always converted to `True`, unless the number is equal to `0` (this is the only number that is converted to `False`):"
]
},
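The three conversion functions named in these headings can be sketched side by side. The variable names other than `a` are assumptions chosen for illustration.

```python
# str(): convert the integer 4 to a string and check its type.
a = str(4)
print(a, type(a))   # 4 <class 'str'>

# int(): converting a float drops everything after the decimal point.
b = int(3.0)
c = int(3.9)        # truncates rather than rounds
print(b, c)         # 3 3

# bool(): every number converts to True except zero.
print(bool(7))      # True
print(bool(-2))     # True
print(bool(0))      # False
```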
15 changes: 15 additions & 0 deletions content/complete/04_boolean_operations.ipynb
@@ -30,6 +30,8 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"### Asking if two things are equal with `==`\n",
"\n",
"To ask a question of equality, we use two equal signs `==`"
]
},
@@ -86,6 +88,8 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"### Asking if two things are not equal with `!=`\n",
"\n",
"The \"not equal to\" operator is written `!=`. The following question asks if the `age` variable is \"not equal\" to 10:"
]
},
@@ -114,6 +118,8 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"### Less than or greater than with `<` and `>`\n",
"\n",
"Next, to ask questions of greater than or less than, we use the `<` and `>` operators:"
]
},
@@ -219,6 +225,8 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"### Less than or greater than for strings\n",
"\n",
"Strings are compared alphabetically, so `'apple'` is \"less\" than `'banana'` because the first letter of apple, \"a\", comes before the first letter of banana, \"b\", in the alphabet:"
]
},
@@ -264,6 +272,13 @@
"'carrot' < 'banana'"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
},
{
"cell_type": "markdown",
"metadata": {},
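The comparison operators from this notebook can be tried in one place. In this sketch the value of `age` is an assumption; the string comparisons reuse the `'apple'`, `'banana'`, and `'carrot'` examples from the notebook text.

```python
age = 12  # hypothetical value for illustration

print(age == 10)  # False: is age equal to 10?
print(age != 10)  # True: is age not equal to 10?
print(age < 18)   # True
print(age > 18)   # False

# Strings are compared character by character.
print('apple' < 'banana')   # True: "a" comes before "b"
print('carrot' < 'banana')  # False: "c" comes after "b"

# Caveat: uppercase letters sort before lowercase ones.
print('Zebra' < 'apple')    # True
```

The last line shows why "alphabetical" is only an approximation: Python compares the underlying character codes, so it can help to lowercase strings before comparing them.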
14 changes: 13 additions & 1 deletion content/complete/05_numpy.ipynb
@@ -20,6 +20,8 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"### Installing numpy\n",
"\n",
"Just as you need to download and install an application before you can use it on your computer, you need to download and install a Python library before you can use it. \n",
"\n",
"The way that you will install Python libraries depends on your Python installation. \n",
@@ -66,7 +68,15 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"You *don't* need to include this `pip install numpy` code in your notebook.\n",
"You *don't* need to include this `pip install numpy` code in your notebook.\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
"### Importing numpy\n",
"\n",
"Once you've successfully installed the numpy library, you can import it (make its functions available) using the `import <libraryname> as <nickname>` command below. \n",
"\n",
@@ -87,6 +97,8 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"### Using numpy functions\n",
"\n",
"Let's take a look at some of the functions that the numpy library provides.\n",
"\n",
"First, let's define a variable `x` that contains the value `2`:"
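A minimal sketch of the import-then-use pattern described in this notebook is given below. It assumes numpy is already installed; the particular functions shown (`np.sqrt`, `np.log`, `np.exp`) are a plausible selection rather than the notebook's confirmed list.

```python
import numpy as np  # `np` is the conventional nickname for numpy

x = 2

# Functions from the library are accessed through the nickname.
root = np.sqrt(x)     # square root
log_x = np.log(x)     # natural logarithm
exp_x = np.exp(x)     # e raised to the power x

print(root, log_x, exp_x)
```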
8 changes: 8 additions & 0 deletions content/complete/06_pandas_dataframes.ipynb
@@ -41,6 +41,8 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"### Loading a data file into a pandas DataFrame\n",
"\n",
"To load a .csv data file into our space, we need to use the `read_csv()` function from the pandas library. Make sure that you have saved the `gapminder.csv` file in a `data` subfolder that lives in the same place where this notebook is saved.\n",
"\n",
"Let's load the gapminder dataset:"
@@ -618,6 +620,8 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"### The shape attribute\n",
"\n",
"To extract an attribute from an object in Python, we use the `object.attribute` syntax. So if we want to extract the `shape` attribute from the `gapminder` DataFrame object, we can do so as follows:"
]
},
@@ -653,6 +657,8 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"### The head() method\n",
"\n",
"The `head()` function typically prints out the first few rows of a DataFrame. However, `head()` is not a regular function. If `head()` were a regular function, we would be able to apply it like this:"
]
},
@@ -794,6 +800,8 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"### Arguments\n",
"\n",
"You can provide additional arguments to the `head()` method inside the parentheses. For example, if you want to print 10 rows instead of 5, you can do so as follows:"
]
},
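The difference between an attribute (`shape`) and a method (`head()`) can be sketched on a small stand-in DataFrame. The toy data below is an assumption: it borrows gapminder-style column names, but the values are made up and are not the real dataset.

```python
import pandas as pd

# With the real file in place, the load step would be:
#   gapminder = pd.read_csv("data/gapminder.csv")
# Toy stand-in: gapminder-style column names, made-up values.
gapminder = pd.DataFrame({
    "country": ["Australia"] * 3 + ["Brazil"] * 3,
    "year": [1997, 2002, 2007, 1997, 2002, 2007],
    "lifeExp": [78.0, 80.0, 81.0, 69.0, 71.0, 72.0],
})

# `shape` is an attribute (no parentheses): (rows, columns).
print(gapminder.shape)  # (6, 3)

# `head()` is a method, so it is called on the object with parentheses.
first_five = gapminder.head()   # default: first 5 rows
first_two = gapminder.head(2)   # the argument sets the number of rows
print(first_two)
```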
2 changes: 2 additions & 0 deletions content/complete/07_index.ipynb
@@ -150,6 +150,8 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"### Changing the index\n",
"\n",
"You can change the index using the `set_index()` method and providing, for example, a column name as a string."
]
},
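A short sketch of `set_index()` with a column name follows. The DataFrame is a toy stand-in with made-up values, and the name `gapminder_country` is used here on the assumption that it matches the later notebooks.

```python
import pandas as pd

# Toy stand-in with gapminder-style column names; values are made up.
gapminder = pd.DataFrame({
    "country": ["Australia", "Australia", "Brazil", "Brazil"],
    "year": [2002, 2007, 2002, 2007],
    "lifeExp": [80.0, 81.0, 71.0, 72.0],
})

# set_index() returns a new DataFrame; the original is left unchanged.
gapminder_country = gapminder.set_index("country")

print(gapminder_country.index)  # the country values are now the row labels
print(gapminder.index)          # the original still has the default integer index
```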
12 changes: 11 additions & 1 deletion content/complete/08_series.ipynb
@@ -227,7 +227,11 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"There are several ways to extract a column from a DataFrame. The first involves writing the name of the DataFrame object followed by square brackets, inside which you provide the name of the column you want to extract as a string:"
"There are several ways to extract a column from a DataFrame. \n",
"\n",
"### Method 1: Using square brackets\n",
"\n",
"The first involves writing the name of the DataFrame object followed by square brackets, inside which you provide the name of the column you want to extract as a string:"
]
},
{
@@ -266,6 +270,8 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"### Method 2: Using the column attribute with `.`\n",
"\n",
"Another way to do the same thing is to use the `.` syntax to extract the named column attribute from the DataFrame object, such as:"
]
},
@@ -428,6 +434,8 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"### The Series index\n",
"\n",
"They do, however, have an `index` (row name) attribute, which is inherited from the DataFrame from which the Series came:"
]
},
@@ -456,6 +464,8 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"### The vectorized nature of Series objects\n",
"\n",
"The nice thing about Pandas Series objects is that they are **vectorized**. \n",
"\n",
"This means that when you apply simple mathematical operations to them, the operation will be applied to *every* entry in the Series. For example, if we add `5` to the `year` Series object, `5` will be added to *every* value in the `year` Series object:"
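The two column-extraction methods and the vectorized behavior of a Series can be sketched together. The DataFrame below is a toy stand-in with made-up values; only the column name `year` comes from the notebook text.

```python
import pandas as pd

# Toy stand-in with gapminder-style column names; values are made up.
gapminder = pd.DataFrame({
    "country": ["Australia", "Australia", "Brazil", "Brazil"],
    "year": [2002, 2007, 2002, 2007],
})

# Method 1: square brackets with the column name as a string.
year = gapminder["year"]

# Method 2: the column accessed as an attribute with `.`.
year_attr = gapminder.year

print(type(year))   # a pandas Series
print(year.index)   # row labels inherited from the DataFrame

# Vectorized: adding 5 applies to every entry of the Series.
year_plus_5 = year + 5
print(year_plus_5.tolist())  # [2007, 2012, 2007, 2012]
```

Method 1 works for any column name, while Method 2 only works when the name is a valid Python identifier and does not clash with an existing DataFrame attribute.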
15 changes: 7 additions & 8 deletions content/complete/09_subsetting.ipynb
@@ -4,7 +4,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"## Working with DataFrames\n",
"# Extracting subsets of data frames\n",
"\n",
"In this notebook, we will learn how to manipulate pandas DataFrame objects, starting with extracting subsets."
]
@@ -120,13 +120,8 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"## Extracting subsets of data frames"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Extracting multiple columns\n",
"\n",
"Suppose that you want to extract multiple columns at once from your DataFrame object. You might imagine that you can do this by providing two column names inside the square brackets that follow the object name, as follows:"
]
},
@@ -362,6 +357,8 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"### Using `:` with `.loc` to select all rows/columns\n",
"\n",
"If you want to extract all rows (or columns), you can replace the corresponding index entry with `:`. So the following code will extract all rows for the `gdpPercap` column:"
]
},
@@ -702,6 +699,8 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"### Using `.loc` with non-numeric indexes\n",
"\n",
"Note that we can index the rows using `.loc` with integers only because the row index consists of integers. If, instead, the row index corresponded to the `country` values, such as in `gapminder_country`, we would not be able to use integers to subset the rows, and we would instead need to use the country names. \n",
"\n",
"Let's create `gapminder_country`, whose row index corresponds to the country variable:"
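The subsetting patterns in this notebook can be sketched on a toy DataFrame. The data values are made up, and building `gapminder_country` with `set_index("country")` is an assumption about how the notebook creates it.

```python
import pandas as pd

# Toy stand-in with gapminder-style column names; values are made up.
gapminder = pd.DataFrame({
    "country": ["Australia", "Australia", "Brazil", "Brazil"],
    "year": [2002, 2007, 2002, 2007],
    "gdpPercap": [100.0, 110.0, 50.0, 60.0],
})

# Two bare column names inside the brackets do not work.
try:
    gapminder["country", "year"]
except KeyError as err:
    print("KeyError:", err)

# Multiple columns need a list of names inside the brackets.
two_cols = gapminder[["country", "year"]]

# `.loc[rows, columns]`: `:` selects everything along that axis.
all_gdp = gapminder.loc[:, "gdpPercap"]

# With an integer index, `.loc` slices by label and includes both ends.
first_rows = gapminder.loc[0:1, ["country", "year"]]

# With a country index, rows are selected by country name instead.
gapminder_country = gapminder.set_index("country")
brazil_rows = gapminder_country.loc["Brazil", :]
print(brazil_rows)
```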
11 changes: 10 additions & 1 deletion content/complete/10_filtering_logical.ipynb
@@ -4,7 +4,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"## Filtering using logical operations and `.loc`"
"# Filtering using logical operations and `.loc`"
]
},
{
@@ -114,6 +114,13 @@
"gapminder.head()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Filtering with `.loc` using a boolean series\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
@@ -159,6 +166,8 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
"\n",
"We can use this boolean series to subset/filter the rows of our DataFrame by providing it in the row indexing position of the `.loc` indexer. The following will filter the `gapminder` DataFrame just to the rows where the `country` value equals `'Australia'`:"
]
},
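Filtering with a boolean Series in the row position of `.loc` might look like the sketch below. The DataFrame is a toy stand-in with made-up values, and the name `is_australia` is an assumption introduced for readability.

```python
import pandas as pd

# Toy stand-in with gapminder-style column names; values are made up.
gapminder = pd.DataFrame({
    "country": ["Australia", "Australia", "Brazil", "Brazil"],
    "year": [2002, 2007, 2002, 2007],
    "lifeExp": [80.0, 81.0, 71.0, 72.0],
})

# Comparing a column to a value gives one True/False per row.
is_australia = gapminder["country"] == "Australia"
print(is_australia.tolist())  # [True, True, False, False]

# Placing the boolean Series in the row position keeps only the True rows.
australia = gapminder.loc[is_australia, :]
print(australia)
```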
6 changes: 6 additions & 0 deletions content/complete/11_filtering_query.ipynb
@@ -473,6 +473,8 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"### Filtering using `.query()`\n",
"\n",
"The `.query()` method does the same thing, but the syntax is a bit different. Since `query` is a method, it is followed by round parentheses `()` rather than square brackets `[]`. Unlike in the above examples, where we need to explicitly create a boolean Series object from the `country` column, e.g., `gapminder['country'] == \"Australia\"`, we instead provide a string (text) argument in which we just write the name of the column that we are using to filter, `country`, followed by the condition `== \"Australia\"`."
]
},
@@ -653,6 +655,8 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"### External variables in the `.query()` method\n",
"\n",
"Note that if you want to use an \"external\" variable in your filtering query, you need to access it within the argument using `@variable_name`. For example, if we have defined an external variable, `selected_country`, that contains the name of the country that we want to filter to, we need to write `@selected_country` inside our query argument. The `@` symbol substitutes the value stored in `selected_country` when the query is executed."
]
},
@@ -835,6 +839,8 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"### Combining `.query()` with `.loc`\n",
"\n",
"Note that since `gapminder.query()` outputs a DataFrame itself, you can follow a query method call with further subsetting which will then apply to the outputted DataFrame. The code below filters to just the country rows equal to \"Brazil\", and then uses the `.loc` indexer to subset just the \"year\" and \"lifeExp\" columns for the eventual output:"
]
},
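The three `.query()` patterns from this notebook can be sketched on a toy DataFrame. The values are made up for illustration; the column names and the `selected_country` variable follow the notebook text.

```python
import pandas as pd

# Toy stand-in with gapminder-style column names; values are made up.
gapminder = pd.DataFrame({
    "country": ["Australia", "Australia", "Brazil", "Brazil"],
    "year": [2002, 2007, 2002, 2007],
    "lifeExp": [80.0, 81.0, 71.0, 72.0],
})

# The condition is written as a string; column names appear unquoted inside it.
australia = gapminder.query('country == "Australia"')

# External Python variables are referenced with `@`.
selected_country = "Brazil"
selected = gapminder.query('country == @selected_country')

# query() returns a DataFrame, so further subsetting can be chained onto it.
brazil_life = gapminder.query('country == "Brazil"').loc[:, ["year", "lifeExp"]]
print(brazil_life)
```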