8 write create trends ensemble function #11
The lint check is only failing due to |
I reviewed this and it basically looks good. Made 1 or 2 minor suggestions. Two bigger comments:
- I had some questions around the temporal aggregation, but I thought we had discussed (or maybe I imagined discussing or planned to discuss and didn't follow through) that we were going to ditch the stuff in here that's related to temporal aggregation because the available data going forward in the near future will only be for the weekly scale. So I propose to essentially go ahead with what's here, regardless of any questions on this front. We just need to get something that handles a single temporal resolution of "weekly" in place.
- My main question is actually how we're going to deal with the sampling. Two sub-questions on this:
  - I think that we should add an `n_sim` argument to this top-level function to allow us to separate the concepts of (a) how many samples are generated from the predictive distribution for subsequent summarizing into predictive quantiles; and (b) how many samples are returned if we have a sample output type.
  - Because hubEnsembles::linear_pool only handles the simplest case where the number of samples for the ensemble is unrestricted, we need to figure out a way to deal with the requirement of getting to 100 samples for the ensemble that we submit. I see two options: (a) update hubEnsembles::linear_pool to allow for specification of a target number of samples for the ensemble, doing sampling if necessary. (b) within this function, pick the number of samples to output from each baseline model so that in total you end up with 100 samples. In our setting with a target of 100 samples for the ensemble and 8 baseline models, we would have 4 baselines generate 12 samples and 4 generate 13. I'm ok with going with option (b) in the short term if it's easier, but note that we will want to do option (a) in the near future as well.
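For reference, the arithmetic behind option (b) can be sketched in base R roughly as follows. This is only an illustration, not code from this PR: `allocate_samples` is a hypothetical helper name, and handing the extra samples to the first few models is an arbitrary choice.

```r
# Hypothetical helper (not part of this package or hubEnsembles):
# split a target number of ensemble samples as evenly as possible
# across the component baseline models.
allocate_samples <- function(n_total, n_models) {
  base <- n_total %/% n_models       # samples that every model contributes
  remainder <- n_total %% n_models   # number of models contributing one extra
  counts <- rep(base, n_models)
  if (remainder > 0) {
    # Assumption: the first `remainder` models get the extra sample.
    counts[seq_len(remainder)] <- base + 1
  }
  counts
}

counts <- allocate_samples(n_total = 100, n_models = 8)
print(counts)
# [1] 13 13 13 13 12 12 12 12
```

With 100 samples and 8 baselines this gives four models 13 samples and four models 12, matching the split described above. Which models receive the extra sample could just as well be randomized.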
R/aggregate_daily_to_weekly.R
Outdated
Ideally, it would be good to have some tests for this function. However, I am going to file an issue for that as something we could do later because: (a) it probably works! (b) I feel like we don't even know what our data will look like and if we will use this.
quantile_levels = c(.1, .5, .9),
n_samples = NULL,
return_baseline_predictions = FALSE) |>
expect_error(regex = "Currently `component_variations` may only contain one unique temporal resolution value",
Low priority -- is this true? I thought I saw stuff about splitting by the temporal resolution up above.
Yes. With the current ensembling for samples, having multiple temporal resolution values results in different numbers of samples per model. The splitting code is there, but I wanted to put the check closer to the top to avoid unnecessary calculations. This validation will be removed later on once support is added, but I figured I would just put it in until then.
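A minimal sketch of what such an early check might look like, for readers following along. This is not the code in the PR: the function name `validate_temporal_resolution` is hypothetical, `component_variations` is assumed to be a data frame with a `temporal_resolution` column, and the real error message may be longer than the prefix matched in the test above.

```r
# Hypothetical early validation, assuming `component_variations` is a
# data frame with a `temporal_resolution` column.
validate_temporal_resolution <- function(component_variations) {
  resolutions <- unique(component_variations$temporal_resolution)
  if (length(resolutions) != 1) {
    # Fail before any baseline models are fit.
    stop(
      "Currently `component_variations` may only contain one unique ",
      "temporal resolution value",
      call. = FALSE
    )
  }
  invisible(component_variations)
}

# A single resolution passes silently.
weekly_only <- data.frame(temporal_resolution = c("weekly", "weekly"))
validate_temporal_resolution(weekly_only)

# Mixed resolutions raise the error; here it is captured as a message string.
mixed <- data.frame(temporal_resolution = c("daily", "weekly"))
msg <- tryCatch(
  validate_temporal_resolution(mixed),
  error = function(e) conditionMessage(e)
)
```

Running the check first means the function fails fast, before any per-model sampling is done.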
lgtm!
No description provided.