diff --git a/docs/assignments/HW2.Rmd b/docs/assignments/HW2.Rmd
new file mode 100644
index 0000000..fbfc7bd
--- /dev/null
+++ b/docs/assignments/HW2.Rmd
@@ -0,0 +1,50 @@
+---
+title: "Homework 2"
+---
+
+```{r global_options, include=FALSE}
+library(knitr)
+library(tidyverse)
+opts_chunk$set(fig.align="center", fig.height=4, fig.width=5.5)
+
+# data prep:
+txhouse <- txhousing %>%
+  filter(city %in% c('Austin', 'Houston', 'San Antonio', 'Dallas')) %>%
+  filter(year %in% c('2000', '2005', '2010', '2015')) %>%
+  group_by(city, year) %>%
+  summarize(total_sales = sum(sales))
+```
+
+**This homework is due on Feb. 1, 2024 at 11:00pm. Please submit as a pdf file on Canvas.**
+
+**Problem 1: (6 pts)** For this Problem you will be working with the `iris` dataset built into R. This data set contains measurements of flowers (sepal length, sepal width, petal length, petal width) for three different *Iris* species (*I. setosa*, *I. versicolor*, *I. virginica*).
+
+```{r}
+head(iris)
+```
+
+Use ggplot to make a histogram of the `Sepal.Length` column. Manually choose appropriate values for `binwidth` and `center`. Explain your choice of values in 2-3 sentences.
+
+```{r}
+# Your code goes here.
+```
+
+*Your explanation goes here.*
+
+**Problem 2: (6 pts)** For this problem you will work with the dataset `txhouse` that has been derived from the `txhousing` dataset provided by **ggplot2**. See here for details of the original dataset: https://ggplot2.tidyverse.org/reference/txhousing.html. `txhouse` contains three columns: `city` (listing four Texas cities), `year` (containing four years between 2000 and 2015) and `total_sales` indicating the total number of sales for the specified year and city.
+
+```{r}
+txhouse
+```
+
+Use ggplot to make a bar plot of the total housing sales (column `total_sales`) for each `year`, color the bar borders "gray34", and fill the bars by `city`.
+
+```{r}
+# Your code goes here.
+```
+
+**Problem 3: (8 pts)** Modify the plot from Problem 2 by placing `city` bars side-by-side, rather than stacked. See Slide 35 from the lecture on visualizing amounts. Next, reorder the bars for each `year` by `total_sales` in descending order. See Slide 25 from the lecture on visualizing amounts.
+
+```{r}
+# Your code goes here.
+```
diff --git a/docs/assignments/HW2.html b/docs/assignments/HW2.html
new file mode 100644
index 0000000..5edbfeb
--- /dev/null
+++ b/docs/assignments/HW2.html
@@ -0,0 +1,462 @@
+This homework is due on Feb. 1, 2024 at 11:00pm. Please
+submit as a pdf file on Canvas.
+Problem 1: (6 pts) For this Problem you will be
+working with the iris dataset built into R. This data set
+contains measurements of flowers (sepal length, sepal width, petal
+length, petal width) for three different Iris species (I.
+setosa, I. versicolor, I. virginica).
head(iris)
+##   Sepal.Length Sepal.Width Petal.Length Petal.Width Species
+## 1          5.1         3.5          1.4         0.2  setosa
+## 2          4.9         3.0          1.4         0.2  setosa
+## 3          4.7         3.2          1.3         0.2  setosa
+## 4          4.6         3.1          1.5         0.2  setosa
+## 5          5.0         3.6          1.4         0.2  setosa
+## 6          5.4         3.9          1.7         0.4  setosa
+Use ggplot to make a histogram of the Sepal.Length
+column. Manually choose appropriate values for binwidth and
+center. Explain your choice of values in 2-3 sentences.
# Your code goes here.
+Your explanation goes here.
+Problem 2: (6 pts) For this problem you will work
+with the dataset txhouse that has been derived from the
+txhousing dataset provided by ggplot2. See
+here for details of the original dataset: https://ggplot2.tidyverse.org/reference/txhousing.html.
+txhouse contains three columns: city (listing
+four Texas cities), year (containing four years between
+2000 and 2015) and total_sales indicating the total number
+of sales for the specified year and city.
txhouse
+## # A tibble: 16 × 3
+## # Groups:   city [4]
+##    city         year total_sales
+##    <chr>       <int>       <dbl>
+##  1 Austin       2000       18621
+##  2 Austin       2005       26905
+##  3 Austin       2010       19872
+##  4 Austin       2015       18878
+##  5 Dallas       2000       45446
+##  6 Dallas       2005       59980
+##  7 Dallas       2010       42383
+##  8 Dallas       2015       36735
+##  9 Houston      2000       52459
+## 10 Houston      2005       72800
+## 11 Houston      2010       56807
+## 12 Houston      2015       48109
+## 13 San Antonio  2000       15590
+## 14 San Antonio  2005       24034
+## 15 San Antonio  2010       18449
+## 16 San Antonio  2015       16455
+Use ggplot to make a bar plot of the total housing sales (column
+total_sales) for each year, color the bar
+borders “gray34”, and fill the bars by city.
# Your code goes here.
+Problem 3: (8 pts) Modify the plot from Problem 2 by
+placing city bars side-by-side, rather than stacked. See
+Slide 35 from the lecture on visualizing amounts. Next, reorder the bars
+for each year by total_sales in descending
+order. See Slide 25 from the lecture on visualizing amounts.
# Your code goes here.