Compare results of run to previous day and fail if there's a large discrepancy #28

Open
jonespm opened this issue Mar 18, 2021 · 2 comments

Comments

jonespm commented Mar 18, 2021

A few times we've had an issue come up where the results from the day's run were 2x the previous day's run. We should either:

  • Store all of the runs in a database, something simple like 2 tables:
    job_run -> job_run_id | time | result
    job_run_property -> job_run_id | property | value
  • Store just the previous day's runs in a file on the filesystem, either SQLite or just JSON, and use that to compare.
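
A rough sketch of the two-table option using Python's built-in `sqlite3` module. The table and column names come from the list above; the column types, the helper names (`open_store`, `record_run`, `previous_value`), and the idea of storing each table's row count as a property are assumptions for illustration only.

```python
import sqlite3
from datetime import datetime, timezone

# Column names follow the issue; the types are an assumption.
SCHEMA = """
CREATE TABLE IF NOT EXISTS job_run (
    job_run_id INTEGER PRIMARY KEY AUTOINCREMENT,
    time       TEXT NOT NULL,
    result     TEXT NOT NULL
);
CREATE TABLE IF NOT EXISTS job_run_property (
    job_run_id INTEGER NOT NULL REFERENCES job_run(job_run_id),
    property   TEXT NOT NULL,
    value      TEXT NOT NULL,
    PRIMARY KEY (job_run_id, property)
);
"""


def open_store(path=":memory:"):
    """Open (or create) the run-history database (hypothetical helper)."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def record_run(conn, result, properties, when=None):
    """Insert one job_run row plus one job_run_property row per property."""
    when = when or datetime.now(timezone.utc)
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(
            "INSERT INTO job_run (time, result) VALUES (?, ?)",
            (when.isoformat(), result),
        )
        run_id = cur.lastrowid
        conn.executemany(
            "INSERT INTO job_run_property (job_run_id, property, value) "
            "VALUES (?, ?, ?)",
            [(run_id, name, str(value)) for name, value in properties.items()],
        )
    return run_id


def previous_value(conn, property_name, before_run_id):
    """Return the most recent stored value of a property before a given run."""
    row = conn.execute(
        "SELECT value FROM job_run_property "
        "WHERE property = ? AND job_run_id < ? "
        "ORDER BY job_run_id DESC LIMIT 1",
        (property_name, before_run_id),
    ).fetchone()
    return None if row is None else row[0]
```

Values are stored as text here so one table can hold any property; a caller comparing row counts would convert back with `int(...)`.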

We should set a difference threshold. I feel like it could be different per table and depend on whether it's an increase or a decrease, but +/- 25% seems like a good start?
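
A minimal sketch of what that check might look like, with separate limits for increases and decreases. The 25% default comes from the comment above; the `Threshold` class and the function names are hypothetical, and treating a jump from zero rows as an infinite increase is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Threshold:
    """Allowed relative change per direction (hypothetical structure)."""
    max_increase: float = 0.25  # +25%, the starting point suggested above
    max_decrease: float = 0.25  # -25%


DEFAULT_THRESHOLD = Threshold()


def relative_change(previous, current):
    """Fractional change from previous to current, e.g. 1.0 means doubled."""
    if previous == 0:
        # Assumption: zero to zero is no change, zero to anything is a spike.
        return 0.0 if current == 0 else float("inf")
    return (current - previous) / previous


def exceeds_threshold(previous, current, threshold=DEFAULT_THRESHOLD):
    """True when the change is larger than the limit for its direction."""
    change = relative_change(previous, current)
    if change > 0:
        return change > threshold.max_increase
    return -change > threshold.max_decrease
```

A per-table setup could pass a different `Threshold` for each table, so that a table expected to grow quickly gets a looser increase limit than one that should stay stable.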

jonespm commented Mar 25, 2021

The most common time we see this problem is when a table has double the results from the previous run. (Seen in the past in submission_dim.)

jonespm commented Jun 28, 2021

There was an issue today where course_dim had 0.5% fewer courses than the day before. This was mentioned as a case that would be worth notifying on, i.e. when something drops by a specified amount. But we'll need to store this data.
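
For the simpler "previous day in a JSON file" route, something like the following could cover the drop case. This is only a sketch: the file layout (a flat table-name-to-row-count mapping), the function names, and the `min_drop` parameter are all assumptions, and the actual drop amount to notify on is still undecided.

```python
import json
from pathlib import Path


def load_counts(path):
    """Read the previous run's table counts; empty dict if no file yet."""
    p = Path(path)
    return json.loads(p.read_text()) if p.exists() else {}


def save_counts(path, counts):
    """Overwrite the snapshot with the current run's table counts."""
    Path(path).write_text(json.dumps(counts, indent=2, sort_keys=True))


def find_drops(previous, current, min_drop):
    """Return {table: (before, after)} for tables that shrank by at least
    min_drop, expressed as a fraction of the previous count."""
    drops = {}
    for table, before in previous.items():
        after = current.get(table, 0)  # a missing table counts as fully dropped
        if before > 0 and (before - after) / before >= min_drop:
            drops[table] = (before, after)
    return drops
```

The job would call `load_counts` at the start, compare with `find_drops`, notify or fail if the result is non-empty, and call `save_counts` only after a run it considers good, so a bad day does not become the next day's baseline.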
