Not able to reproduce results. #257

Open
prateek-malhotra opened this issue Dec 17, 2021 · 1 comment
Comments

@prateek-malhotra

In the latest lolopy version (1.2.0), I set random_seed, but the results are still not reproducible (I have also fixed the NumPy random seed). Can you please fix this or explain the reason?

bfolie commented Dec 17, 2021

Hi Prateek. Lolo training is not entirely reproducible because the base learners are trained in parallel and we don't use splittable random numbers. This is a known deficiency, but I realize we don't have an issue for it, so I opened one: #259.
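To illustrate the point about splittable random numbers, here is a minimal Python sketch. It is not Lolo's code (Lolo runs on the JVM). The function names `train_shared` and `train_split` are hypothetical, and the `order` argument just stands in for whatever order a parallel scheduler happens to run the base learners in. With one shared generator, each learner's draws depend on that order. With one child stream per learner, derived from the root seed, they do not.

```python
import numpy as np

def train_shared(rng, order):
    # Hypothetical stand-in for training: every "learner" draws from one
    # shared generator, so its draws depend on the order learners run in.
    draws = {}
    for learner in order:
        draws[learner] = rng.integers(0, 1000, size=3).tolist()
    return draws

def train_split(seed, n_learners, order):
    # Each learner gets its own child stream spawned from the root seed,
    # so its draws are the same regardless of scheduling order.
    children = np.random.SeedSequence(seed).spawn(n_learners)
    draws = {}
    for learner in order:
        rng = np.random.default_rng(children[learner])
        draws[learner] = rng.integers(0, 1000, size=3).tolist()
    return draws

# Same seed, two different "scheduling" orders.
shared_a = train_shared(np.random.default_rng(42), [0, 1, 2])
shared_b = train_shared(np.random.default_rng(42), [2, 0, 1])
split_a = train_split(42, 3, [0, 1, 2])
split_b = train_split(42, 3, [2, 0, 1])

print("shared generator, same result:", shared_a == shared_b)
print("split streams, same result:   ", split_a == split_b)
```

In the shared case, whichever learner runs first consumes the first numbers in the stream, which is why a fixed seed alone does not pin down the result under parallel training.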

I also did a sweep to make sure random number generators were being used everywhere, and I found a bug that is corrected in #258. This bug would only have affected you if you were considering a subset of features at each split (the default for regression is to consider all features for each split, in which case this bug would not have affected you).

The parallelization issue prevents full reproducibility, but it should be an extremely small effect. If it's causing your predictions to vary significantly relative to the error bars, then please say so.
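One way to check this is to train several times with the same seed and compare the spread of the predictions across runs to the predicted uncertainty. The sketch below assumes you have already collected the predictions and their uncertainties into arrays of shape (n_runs, n_points). The helper name `run_to_run_ratio` and the numbers are made up for illustration.

```python
import numpy as np

def run_to_run_ratio(preds, stds):
    # preds, stds: shape (n_runs, n_points), from repeated training runs
    # that all used the same seed and the same data.
    preds = np.asarray(preds, dtype=float)
    stds = np.asarray(stds, dtype=float)
    spread = preds.std(axis=0)         # variation across runs, per point
    typical_error = stds.mean(axis=0)  # average error bar, per point
    return spread / typical_error

# Made-up numbers: three runs, two test points.
preds = [[1.00, 2.00], [1.02, 1.98], [0.98, 2.02]]
stds = [[0.5, 0.4], [0.5, 0.4], [0.5, 0.4]]
ratio = run_to_run_ratio(preds, stds)
print(ratio)
```

Ratios well below 1 mean the run-to-run variation is small compared to the error bars. Ratios near or above 1 would be worth reporting.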
