[ENG-6165] bugfixing followup #10746

Merged

@aaxelb aaxelb commented Sep 9, 2024

Purpose

fix bugs found while running the monthly institutional-user report on actual data

Changes

QA Notes

Please make verification statements inspired by your code and what your code touches.

  • Verify
  • Verify

What are the areas of risk?

Any concerns/considerations/questions that development raised?

Documentation

Side Effects

Ticket

@aaxelb aaxelb marked this pull request as ready for review September 9, 2024 18:17
Member

@mfraezz mfraezz left a comment

Generally looks good, a couple minor questions.

Ran some benchmarking against prod data; expected prod runtime is ~10min, which is acceptable for a once-monthly async job like this.

Pass complete :octocat:

@@ -20,7 +20,7 @@ def report(
     def run_and_record_for_month(self, report_yearmonth: YearMonth) -> None:
         reports = self.report(report_yearmonth)
         for report in reports:
-            assert report.report_yearmonth == str(report_yearmonth)
+            assert str(report.report_yearmonth) == str(report_yearmonth)
Member

Minor: The other MonthlyReporter casts report_yearmonth to str on report initialization -- should this one as well, instead of changing this assertion? It may not matter.

Contributor Author

hmm agree this is odd -- now looking at it again, seems better to set report.report_yearmonth here, instead of making the reporter do it (have updated)
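
A minimal sketch of that idea, not the code from this PR: `YearMonth`, `StubReport`, and `StubReporter` below are hypothetical stand-ins for the real classes, and appending to a list stands in for saving the report. It only illustrates setting `report_yearmonth` in `run_and_record_for_month` rather than asserting that each reporter already set it.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class YearMonth:
    """Hypothetical stand-in for the project's YearMonth type."""
    year: int
    month: int

    def __str__(self) -> str:
        return f'{self.year}-{self.month:02d}'


@dataclass
class StubReport:
    """Hypothetical report record (the real one is a metrics document)."""
    user_count: int
    report_yearmonth: Optional[str] = None


class StubReporter:
    def __init__(self) -> None:
        self.saved: List[StubReport] = []

    def report(self, report_yearmonth: YearMonth) -> Iterable[StubReport]:
        # The reporter builds reports without filling in report_yearmonth.
        return [StubReport(user_count=3), StubReport(user_count=5)]

    def run_and_record_for_month(self, report_yearmonth: YearMonth) -> None:
        for report in self.report(report_yearmonth):
            # Set the field in one place instead of asserting on it.
            report.report_yearmonth = str(report_yearmonth)
            self.saved.append(report)  # stands in for saving the report


reporter = StubReporter()
reporter.run_and_record_for_month(YearMonth(2024, 8))
```

With this shape, individual reporters cannot drift on how they format the month, since the assignment happens once in the shared method.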

@@ -37,7 +37,7 @@ class Meta:
 
 class YearmonthField(metrics.Date):
     def __init__(self, *args, **kwargs):
-        super().__init__(*args, **kwargs, format='strict_year_month', required=True)
+        super().__init__(*args, **kwargs, format='strict_year_month')
Member

Minor: This makes a lot of YearMonth fields no longer implicitly (sort of) required. Would it be preferable to default to required=True but allow required=False to be passed in? e.g. with required = kwargs.pop('required', True)
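
For illustration, the `kwargs.pop` suggestion could look roughly like the sketch below. `StubDateField` is a hypothetical stand-in for `metrics.Date` so the example runs on its own; its constructor signature is an assumption, not the real base class.

```python
class StubDateField:
    """Hypothetical stand-in for the real `metrics.Date` base class."""

    def __init__(self, *args, format=None, required=False, **kwargs):
        self.format = format
        self.required = required


class YearmonthField(StubDateField):
    def __init__(self, *args, **kwargs):
        # Default to required=True, but let callers opt out explicitly.
        required = kwargs.pop('required', True)
        super().__init__(
            *args,
            **kwargs,
            format='strict_year_month',
            required=required,
        )


default_field = YearmonthField()
optional_field = YearmonthField(required=False)
```

Popping the key before forwarding `**kwargs` avoids passing `required` twice to the parent constructor.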

Contributor Author

the default for elasticsearch_dsl.Field is required=False, and consistency with that seems like a good idea -- there was only one yearmonth field prior to this project (on the MonthlyReport base class), which now has explicit required=True, and the ones added for this project seem fine to make optional

Contributor Author

aaxelb commented Sep 9, 2024

(fyi, changed some field mappings after api work; cherry-picked to here to avoid changing mappings again after this pr, if we can)

@aaxelb aaxelb merged commit 4a2e997 into CenterForOpenScience:feature/insti-dash-improv Sep 10, 2024
6 checks passed
@aaxelb aaxelb deleted the eng-6165-followup branch September 10, 2024 19:16
2 participants