From 34e6a84d706196647db9d77d72c082e5022b031f Mon Sep 17 00:00:00 2001
From: Casey Youngflesh
Date: Mon, 6 Jan 2025 15:34:37 -0500
Subject: [PATCH] Add data in place of `feline-data_v2.csv`, closes #717

---
 episodes/04-data-structures-part1.Rmd | 52 +++++++++++----------------
 1 file changed, 21 insertions(+), 31 deletions(-)

diff --git a/episodes/04-data-structures-part1.Rmd b/episodes/04-data-structures-part1.Rmd
index 38bc2f96c..1d131c13b 100644
--- a/episodes/04-data-structures-part1.Rmd
+++ b/episodes/04-data-structures-part1.Rmd
@@ -164,34 +164,26 @@ No matter how complicated our analyses become, all data in R is interpreted as
 one of these basic data types. This strictness has some really important
 consequences.
 
-A user has added details of another cat. This information is in the file
-`data/feline-data_v2.csv`.
+A user has added details of another cat. We can add an additional row to our cats `data.frame` using `rbind`.
 
-```{r, eval=FALSE}
-file.show("data/feline-data_v2.csv")
-```
-
-```{r, eval=FALSE}
-coat,weight,likes_catnip
-calico,2.1,1
-black,5.0,0
-tabby,3.2,1
-tabby,2.3 or 2.4,1
+```{r}
+additional_cat <- data.frame(coat = "tabby", weight = "2.3 or 2.4", likes_catnip = 1)
+cats2 <- rbind(cats, additional_cat)
+cats2
 ```
 
-Load the new cats data like before, and check what type of data we find in the
-`weight` column:
+Let's check what type of data we find in the
+`weight` column of our new object:
 
 ```{r}
-cats <- read.csv(file="data/feline-data_v2.csv")
-typeof(cats$weight)
+typeof(cats2$weight)
 ```
 
 Oh no, our weights aren't the double type anymore! If we try to do the same math
 we did on them before, we run into trouble:
 
 ```{r}
-cats$weight + 2
+cats2$weight + 2
 ```
 
 What happened?
@@ -206,7 +198,7 @@ csv file, it is stored as a data frame. We can recognize data frames by the firs
 is written by the `str()` function:
 
 ```{r}
-str(cats)
+str(cats2)
 ```
 
 *Data frames* are composed of rows and columns, where each column has the
@@ -389,8 +381,7 @@ Create a new script in RStudio and copy and paste the following code. Then
 move on to the tasks below, which help you to fill in the gaps (\_\_\_\_\_\_).
 
 ```
-# Read data
-cats <- read.csv("data/feline-data_v2.csv")
+Using the object `cats2`:
 
 # 1. Print the data
 _____
@@ -402,15 +393,15 @@ _____(cats)
 # The correct data type is: ____________.
 
 # 4. Correct the 4th weight data point with the mean of the two given values
-cats$weight[4] <- 2.35
+cats2$weight[4] <- 2.35
 # print the data again to see the effect
 cats
 
 # 5. Convert the weight to the right data type
-cats$weight <- ______________(cats$weight)
+cats2$weight <- ______________(cats2$weight)
 
 # Calculate the mean to test yourself
-mean(cats$weight)
+mean(cats2$weight)
 
 # If you see the correct mean value (and not NA), you did the exercise
 # correctly!
@@ -420,7 +411,7 @@ mean(cats$weight)
 
 #### 1\. Print the data
 
-Execute the first statement (`read.csv(...)`). Then print the data to the
+Print the data to the
 console
 
 ::::::::::::::: solution
@@ -435,8 +426,8 @@ Show the content of any variable by typing its name.
 Two correct solutions:
 
 ```
-cats
-print(cats)
+cats2
+print(cats2)
 ```
 
 :::::::::::::::::::::::::
@@ -445,7 +436,7 @@ print(cats)
 
 The data type of your data is as important as the data itself. Use a
 function we saw earlier to print out the data types of all columns of the
-`cats` table.
+`cats2` `data.frame`.
 
 ::::::::::::::: solution
 
@@ -462,7 +453,7 @@ here.
 > ### Solution to Challenge 1.2
 >
 > ```
-> str(cats)
+> str(cats2)
 > ```
 
 #### 3\. Which data type do we need?
@@ -470,7 +461,6 @@ here.
 The shown data type is not the right one for this data (weight of
 a cat). Which data type do we need?
 
-- Why did the `read.csv()` function not choose the correct data type?
 - Fill in the gap in the comment with the correct data type for cat weight!
 
 ::::::::::::::: solution
@@ -549,8 +539,8 @@ auto-complete function: Type "`as.`" and then press the TAB key.
 > There are two functions that are synonymous for historic reasons:
 >
 > ```
-> cats$weight <- as.double(cats$weight)
-> cats$weight <- as.numeric(cats$weight)
+> cats2$weight <- as.double(cats2$weight)
+> cats2$weight <- as.numeric(cats2$weight)
 > ```
 
 ::::::::::::::::::::::::::::::::::::::::::::::::::