EDA Checkpoint Feedback #17

Open
ShanEllis opened this issue Mar 12, 2024 · 0 comments

ShanEllis commented Mar 12, 2024

EDA Checkpoint Feedback

Score (out of 5 pts)

Score = 5

EDA Checkpoint Feedback

| | Quality | Reasons |
|---|---|---|
| EDA Relevance | D | Good overall, but not sure how much a linear regression counts as EDA (it would depend on what your main analysis is). |
| EDA Analysis and Description | D | Mostly good, but please separate your text blocks so that the text associated with the analysis in a code block is next to that code block (this makes it a lot easier to read and go between the two). For your regression plot with accident rate, population, and transit, it might be clearer to have two separate y-axes (one on the left and one on the right) reflecting each variable. Also, normalized rates are often presented as, e.g., accidents per 10,000/100,000 population (or something that makes sense). Finally, it's probably worth plotting the distribution of all your variables and looking to see if any others should be log-transformed (population often is, for example; and it's hard to tell from the plots, but accident rate may be something that should also be log-transformed). |
| EDA Figures | P | |
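
As a rough illustration of two of the suggestions above (reporting rates per 100,000 population and checking whether a variable looks better on a log scale), here is a minimal sketch using only the Python standard library. The numbers, the helper names `rate_per` and `skewness`, and the choice of `log10` are all assumptions made for illustration; they are not taken from the project's notebook or data.

```python
import math
import statistics


def rate_per(count, population, per=100_000):
    """Return events per `per` residents (hypothetical helper)."""
    return count * per / population


def skewness(values):
    """Return the mean cubed z-score (a simple, biased skewness estimate)."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum(((v - mean) / sd) ** 3 for v in values) / len(values)


# Made-up illustrative values, NOT the project's data.
populations = [12_000, 45_000, 150_000, 900_000, 3_200_000]
accidents = [30, 95, 410, 2_100, 9_900]

# Accidents per 100,000 population for each area.
rates = [rate_per(a, p) for a, p in zip(accidents, populations)]

# Compare skewness before and after a log10 transform of population.
log_populations = [math.log10(p) for p in populations]
print("skew (raw population):  ", round(skewness(populations), 2))
print("skew (log10 population):", round(skewness(log_populations), 2))
```

If the skewness drops sharply toward zero after the transform, as it does for these made-up populations, that is one hint that the variable may be easier to model and plot on a log scale. A histogram of each variable remains the more direct check.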

Comments

Regrade Feedback

Data Checkpoint

| | Quality | Reasons |
|---|---|---|
| Data relevance | P | |
| Data description | D | It can be good to explain what you're doing as you do it. The Data section also says things like "In terms of pre-processing, we cleaned the data as we entered it into our dataset. We checked that it included at least a year’s worth of data to observe trends for appropriate time frames. We checked that it was consistent with our other data set. We checked that it was a trustworthy source." How? What is the trustworthy source? Give details when saying things like this. |
| Data wrangling | P | |

Rubric

| | Unsatisfactory | Developing | Proficient | Excellent |
|---|---|---|---|---|
| EDA relevance | EDA is mostly neither relevant to the question nor helpful in figuring out how to address the question. Or the EDA does address the question, but many obviously relevant variables / analyses / figures were not included. EDA does not explore distributions of single variables, relationships between variables, or both. | EDA is partly irrelevant/unhelpful. Or some obviously relevant variables / analyses / figures were not included. EDA is missing a few distributions of single variables or relationships between variables. | EDA is almost all relevant / helpful in addressing the question. No obviously relevant variables / analyses / figures were omitted. | Thorough EDA addressed all aspects that are relevant to the question. |
| EDA analysis and description | Many of the analyses are poor choices (e.g., using means instead of medians for obviously skewed data), or are poorly described in the text, or do not aid understanding the data. | Some of the analyses are poor choices, or are poorly described in the text, or do not aid understanding the data. | All analyses are correct choices. Only one or two have minor issues in the text descriptions supporting them. Mostly they fit well with other elements of the EDA and support understanding the data. | All analyses are correct choices with clear text descriptions supporting them. The figures fit well with the other elements of the EDA, producing a clear understanding of the data. |
| EDA figures | Many of the figures are poor plot choices (e.g., using a bar plot to represent a time series where it would be better to use a line plot) or have poor aesthetics (including colormap, data point shape/color, axis labels, titles, annotations, text legibility) or do not aid understanding the data. | Some of the figures are poor plot choices or have poor aesthetics. Some figures do not aid understanding the data. | All figures are correct plot choices. Only one or two have minor questionable aesthetic choices. The figures mostly fit well with the other elements of the EDA and support understanding the data. | All figures are correct plot choices with beautiful aesthetics. The figures fit well with the other elements of the EDA, producing a clear understanding of the data. |

Grading Rules

Scoring: Out of 5 points

Each Developing => -1 pt
Each Unsatisfactory => -2 pts
Deductions stop once the score reaches 0.
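
The deduction rule above can be sketched as a small function. This is only an illustration of the arithmetic as stated here; the function name `checkpoint_score` and its parameters are hypothetical, and it does not model points earned back in a later checkpoint.

```python
def checkpoint_score(n_developing, n_unsatisfactory, max_points=5):
    """Apply -1 per Developing and -2 per Unsatisfactory, floored at 0."""
    deduction = 1 * n_developing + 2 * n_unsatisfactory
    return max(0, max_points - deduction)


# Example: one Developing and one Unsatisfactory rating.
print(checkpoint_score(n_developing=1, n_unsatisfactory=1))
```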

If students address the detailed feedback in a future checkpoint they will earn these points back

DETAILED FEEDBACK should be left in the data section AND anywhere the student addressed proposal feedback but did not do it to your satisfaction
