Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Review HW2 #2

Open
CanyonFoot opened this issue Feb 15, 2018 · 4 comments
Open

Review HW2 #2

CanyonFoot opened this issue Feb 15, 2018 · 4 comments

Comments

@CanyonFoot
Copy link
Collaborator

@wxindu sorry I'm tagging you so late

@wxindu
Copy link

wxindu commented Feb 15, 2018

[I think the format for the previous paragraph was massed up a little bit so I have it deleted.]
It's fine.
I think overall your work looks great, and I have only few suggestions for you.

Your codes look great, all codes are good, except that maybe you should consider organizing your code a bit differently. When you have those long codes with multiple pipes, you do it as a single long line. It doesn't affect the result you get, but it can be hard to read and organize. If you ever want to insert some steps in between, it can be a bit hard to do.
For example in 4.5, you did:
planes %>% filter(year > 1960, !is.na(manufacturer)) %>% group_by(manufacturer, year) %>% summarise(total=n_distinct(tailnum)) %>% mutate(manufacturer2 = if_else(!(manufacturer %in% c("AIRBUS", "AIRBUS INDUSTRIE", "BOEING", "BOMBARDIER INC", "MCDONNELL DOUGLAS")), "Other", manufacturer)) %>%ggplot(aes(x = year, y = total, color = manufacturer2)) + geom_line()

The code looks good, but it's kind of hard to read and organize, I think it would be better if you can do:
'planes %>%
filter(year > 1960, !is.na(manufacturer)) %>%
group_by(manufacturer, year) %>%
summarise(total=n_distinct(tailnum)) %>%
mutate(manufacturer2 = if_else(!(manufacturer %in% c("AIRBUS", "AIRBUS INDUSTRIE", "BOEING", "BOMBARDIER INC", "MCDONNELL DOUGLAS")), "Other", manufacturer)) %>%
ggplot(aes(x = year, y = total, color = manufacturer2)) + geom_line()'
So that each step is listed individually, and can be organized better.

For 4.19, I really like how you insert the names and ends of presidency in your graph. It presents the relationship between presidency and popularity of the names even better. Nice work.

For 4.4, I personally don't think a full join is needed here. But both ways should work.

Good Work

@wxindu
Copy link

wxindu commented Feb 15, 2018

4.4 Oops. I just reread the question. I think your idea is right, we do need to join two tables together, but I don't think it should be full join. The question asked:
"What is the oldest plane (speci ed by the tailnum variable) that few from PNW airports in 2014?"
I think we need an inner_join here instead, since we need to eliminate those planes which did not flew from PNW.

@CanyonFoot
Copy link
Collaborator Author

Thanks. I think I misunderstood what the planes dataset represented, which is why I used the full join. Re: line breaks I agree and I forgot that what the code looks like in Rstudio isn't quite how it comes out when knitted. I will definitely make sure to do a new line for every pipe from now on.

@CanyonFoot
Copy link
Collaborator Author

I just went and fixed the line breaks because it was annoying me.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants