In a project where we combine XGBoost with TensorFlow within the same process, we ran into the following issue:
When the environment variable CUDA_VISIBLE_DEVICES is set to -1, the XGBoost predict step crashes after about a minute of predicting. Strangely enough, it seems to happen stochastically. The crash only occurs once prediction has been running for a while, which can be provoked either by setting nthread to a low value or by repeating the same predict step many times. Running the predict step once usually works without the crash, but not always.
The crash does not produce any error messages and only happens on Windows, as far as I can tell.
The environment variable is set by a dependency that uses TensorFlow, which runs before the dependency that uses XGBoost. In short, it's a pipeline that combines multiple machine learning predictors, each with its own purpose.
As a simple workaround we can definitely remove the environment variable before predicting with XGBoost. Nevertheless, it seemed sensible to report the issue.
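A minimal sketch of that workaround, assuming that removing the variable from os.environ at the Python level is enough to avoid the crash. The helper name env_var_removed and the booster and dmatrix names in the usage comment are hypothetical and not part of our pipeline:

```python
import os
from contextlib import contextmanager


@contextmanager
def env_var_removed(name):
    """Temporarily remove an environment variable, restoring it on exit."""
    # pop() returns None if the variable was not set in the first place.
    saved = os.environ.pop(name, None)
    try:
        yield
    finally:
        # Put the original value back so later TensorFlow code still sees it.
        if saved is not None:
            os.environ[name] = saved


# Usage (booster and dmatrix stand in for your own XGBoost objects):
# with env_var_removed("CUDA_VISIBLE_DEVICES"):
#     predictions = booster.predict(dmatrix)
```

Restoring the variable afterwards keeps the TensorFlow dependency's setting intact for any later steps in the same process. If the other dependency sets the variable again between predict calls, the wrapper has to be applied around each call.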
Here's a script to reproduce:
Commenting out the first two lines makes it work again.
pip freeze output:
And with the optional rich install for the progress bar (does not change the crash behavior):
Files used: https://1drv.ms/u/c/cc884c602a30d109/ET6oclsK3PpLqnj6p4W0h40BU2vIMXQzQnOWRLl5SfecFw?e=eCoexz