Dev alex 2020 07 20 #220

Open
wants to merge 12 commits into base: dev_Ryan_2020-05-11

alexp205
Contributor

Important changes:

  • added relative error plots
  • adjusted linear fits in res-vs-error plots to be weighted according to counts
  • adjusted EnsembleRegressor to be more forgiving to poorly specified parameters

Some things I still need to look at:

  • change the default KerasRegressor verbosity back to self.verbose (I forgot that was something I could specify)
  • make sure the added relative error plots are functional in non-error metric cases (i.e. for models other than GPR, RF, and ensembles)
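
For reference, a minimal sketch of what a count-weighted linear fit could look like. This is an illustration under assumptions, not the code in this PR: the function name `weighted_linear_fit` and its arguments are hypothetical, and it assumes each point in the res-vs-error plot is a bin with a known count. Note that `np.polyfit` applies its `w` argument to the unsquared residuals, so passing the square root of the counts makes each bin's squared residual count in proportion to its count.

```python
import numpy as np


def weighted_linear_fit(bin_centers, binned_residuals, counts):
    """Fit y = slope * x + intercept, weighting each bin by its count.

    Hypothetical helper: names and inputs are assumptions for illustration.
    """
    x = np.asarray(bin_centers, dtype=float)
    y = np.asarray(binned_residuals, dtype=float)
    n = np.asarray(counts, dtype=float)

    # Empty bins carry no information, so leave them out of the fit.
    keep = n > 0

    # polyfit minimizes sum((w * (y - p(x)))**2), so w = sqrt(count)
    # weights each squared residual by the bin count.
    slope, intercept = np.polyfit(x[keep], y[keep], deg=1, w=np.sqrt(n[keep]))
    return float(slope), float(intercept)
```

With this weighting, a bin holding 100 points pulls the line 100 times harder than a bin holding a single point, which keeps sparsely populated high-error bins from dominating the fit.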

@alexp205
Contributor Author

Important changes:

  • updated res-vs-error plots to have adaptable axes and no visual indication of potential outliers (i.e. red points)
  • fixed a small bug in EnsembleRegressor where parameters were not processed correctly

Some things I still need to look at (forgot/didn't have time to do these):

  • change the default KerasRegressor verbosity back to self.verbose (I forgot that was something I could specify)
  • make sure the added relative error plots are functional in non-error metric cases (i.e. for models other than GPR, RF, and ensembles)

…s, relative error plots should now only be plotted if predicted error models are used
@alexp205
Contributor Author

alexp205 commented Aug 5, 2020

Important changes:

  • added tags and altered some parameters/paths to ensure that jupyter notebooks were constructed correctly for error plots
  • changed KerasRegressor verbosity back to self.verbose
  • made sure the added relative error plots were functional in non-error metric cases (i.e. not GPR, RF, and ensembles)

@alexp205
Contributor Author

alexp205 commented Aug 5, 2020

I set up a new conda env, cloned my branch's version of MASTML, and tested a fresh install to see what was necessary for a default Windows-functional setup. I made some changes to setup.py to facilitate this. Specifically, dlhub_sdk imports, through parsl, a package that is broken on Windows; parsl only works here at version 0.9.0 (not the latest, 1.0.0). I also re-added dlhub_sdk to the requirements list.
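
The exact change to setup.py is not shown in this thread. Assuming the fix takes the form of a version pin, the relevant part of the requirements list could look roughly like the sketch below; the variable name and the omission of the other entries are illustrative only.

```python
# Hypothetical excerpt of the requirements list passed to setup() in setup.py.
# Only the two entries discussed in this comment are shown; the other
# requirements are omitted.
install_requires = [
    "dlhub_sdk",     # re-added to the requirements list
    "parsl==0.9.0",  # assumed pin: 1.0.0 pulls in a package that breaks on Windows
]
```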

@alexp205
Contributor Author

alexp205 commented Aug 6, 2020

Summary of Important Changes:

  • added relative error plots
  • adjusted linear fits in res-vs-error plots to be weighted according to counts
  • adjusted EnsembleRegressor to be more forgiving to poorly specified parameters
  • updated res-vs-error plots to have adaptable axes and no visual indication of potential outliers (i.e. red points)
  • fixed small bug in EnsembleRegressor that would not process parameters correctly
  • added tags and altered some parameters/paths to ensure that jupyter notebooks were constructed correctly for error plots
  • changed KerasRegressor verbosity back to self.verbose
  • made sure the added relative error plots were functional in non-error metric cases (i.e. not GPR, RF, and ensembles)
  • fixed setup.py for correct out-of-the-box functionality for Windows and Linux

Note that I could not test on Mac, so I can't guarantee it works out of the box on those systems.

@alexp205
Contributor Author

More important changes:

  • also added some options in the conf file for adjusting the resolution of the error plots (i.e. number of bins, upper bin value, and the requisite logic)
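
To make the two settings concrete, here is a rough sketch of how a number of bins and an upper bin value could control the plot resolution. It assumes the error plot bins the predicted model errors and reports the RMS residual and count per bin; the function and parameter names (`binned_rms_residuals`, `number_of_bins`, `upper_bin_value`) are hypothetical and not necessarily the ones used in the conf file.

```python
import numpy as np


def binned_rms_residuals(model_errors, residuals, number_of_bins=15,
                         upper_bin_value=None):
    """Bin predicted errors and return (bin centers, RMS residuals, counts).

    Hypothetical helper: names and defaults are assumptions for illustration.
    """
    err = np.asarray(model_errors, dtype=float)
    res = np.asarray(residuals, dtype=float)
    if upper_bin_value is None:
        upper_bin_value = err.max()

    # Equal-width bins from 0 up to the chosen upper bin value.
    edges = np.linspace(0.0, upper_bin_value, number_of_bins + 1)

    # Drop points outside the binned range; a value equal to the upper
    # edge is placed in the last bin.
    inside = (err >= edges[0]) & (err <= edges[-1])
    idx = np.minimum(np.digitize(err[inside], edges) - 1, number_of_bins - 1)
    res_in = res[inside]

    centers, rms, counts = [], [], []
    for b in range(number_of_bins):
        in_bin = idx == b
        n = int(in_bin.sum())
        if n == 0:
            continue  # skip empty bins rather than plotting a NaN
        centers.append(0.5 * (edges[b] + edges[b + 1]))
        rms.append(float(np.sqrt(np.mean(res_in[in_bin] ** 2))))
        counts.append(n)
    return np.array(centers), np.array(rms), np.array(counts)
```

Raising the number of bins gives a finer plot at the cost of fewer points per bin, and lowering the upper bin value cuts off the sparsely populated tail.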

@alexp205
Contributor Author

More important changes:

  • added function header comments for the auto-documentation functionality; EnsembleRegressor, the jackknife method, and RvE plots in MASTML should now be documented properly

Some future work that Prof. Morgan wants done in MASTML (I've also described these in issue #221):

  • stats_check_models in mastml/legos/model_finder.py (under EnsembleRegressor) needs two more parameters, both settable in the conf file: 1) the alpha threshold for accepting models, and 2) a switch for whether or not to actually eliminate models and re-calculate
  • binning resolution can currently be specified by the number of bins and the value of the last bin, but Prof. Morgan also wants to be able to specify the bin width; the logic for using bin width to control the error plot resolution, and for handling the case where all three values are specified, is not implemented yet
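
As a starting point for the bin-width item, one possible way to reconcile the three settings is sketched below. Nothing here is implemented in MASTML; the function name `resolve_bins`, its parameter names, and the choice to raise on inconsistent input are all assumptions. The idea is that any two of the three values determine the third, since number of bins × bin width = upper bin value.

```python
import math


def resolve_bins(number_of_bins=None, upper_bin_value=None, bin_width=None):
    """Return (number_of_bins, upper_bin_value, bin_width) from any two of them.

    Hypothetical helper sketching one policy; not part of MASTML.
    """
    given = sum(v is not None
                for v in (number_of_bins, upper_bin_value, bin_width))
    if given < 2:
        raise ValueError("specify at least two of the three bin settings")

    if number_of_bins is None:
        # Round up to whole bins, then extend the upper value to match.
        ratio = upper_bin_value / bin_width
        nearest = round(ratio)
        number_of_bins = nearest if math.isclose(ratio, nearest) else math.ceil(ratio)
        upper_bin_value = number_of_bins * bin_width
    elif upper_bin_value is None:
        upper_bin_value = number_of_bins * bin_width
    elif bin_width is None:
        bin_width = upper_bin_value / number_of_bins
    elif not math.isclose(number_of_bins * bin_width, upper_bin_value):
        # All three were given but they disagree with each other.
        raise ValueError("number_of_bins * bin_width must equal upper_bin_value")

    if number_of_bins < 1 or bin_width <= 0:
        raise ValueError("bin settings must be positive")
    return int(number_of_bins), float(upper_bin_value), float(bin_width)
```

Raising an error when all three values disagree is only one option; silently preferring two of them (for example bin width and upper value) would also be defensible and is a design decision for the issue.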


2 participants