Thrashing in 2.TCGA-MLexample #86

Closed
jruhym opened this issue Mar 1, 2017 · 2 comments

jruhym commented Mar 1, 2017

When my MBP with 16 GB of RAM hits this line, `cv_pipeline = GridSearchCV(estimator=pipeline, param_grid=param_grid, n_jobs=-1, scoring='roc_auc')`, it thrashes. The `n_jobs` parameter causes multiple jobs to be created, and with them a higher demand on RAM. My MBP has an i7 processor, which is hyper-threaded. With `n_jobs=-1`, it spins up as many tasks as the machine reports cores, but it thinks my machine has 8 cores when it really has only 4 physical cores plus 4 virtual ones. Hyper-threading fills otherwise idle slots in a core's pipeline with work from a second thread, which shows up as a virtual core. This works fine for multitasking like browsing and editing, but not for processor-hungry tasks like GridSearchCV, which most likely already keep the real pipeline busy. So 8 tasks were swamping my RAM, and there is probably no benefit on my system to spinning up more than 4.
To fix this, I merely set `n_jobs=4`.
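
For anyone who would rather not hard-code the value, here is a minimal sketch of choosing `n_jobs` from the machine's resources. It is not part of the notebook: the helper names `physical_core_guess` and `pick_n_jobs` are hypothetical, the 2-threads-per-core figure is an assumption that holds for a hyper-threaded i7 but not for every CPU, and the 3 GB per-worker estimate is a made-up placeholder that you would need to measure for your own data. If `psutil` is installed, `psutil.cpu_count(logical=False)` reports physical cores directly and could replace the guess.

```python
import os


def physical_core_guess(logical_cores=None, threads_per_core=2):
    """Estimate physical cores from the logical count.

    Assumption: every physical core exposes `threads_per_core` logical
    cores (2 on a hyper-threaded i7). This is a guess, not a measurement.
    """
    if logical_cores is None:
        logical_cores = os.cpu_count() or 1
    return max(1, logical_cores // threads_per_core)


def pick_n_jobs(memory_gb, per_worker_gb, logical_cores=None, threads_per_core=2):
    """Cap the worker count by both physical cores and available RAM."""
    by_cpu = physical_core_guess(logical_cores, threads_per_core)
    by_ram = max(1, int(memory_gb // per_worker_gb))
    return min(by_cpu, by_ram)


# 16 GB of RAM and 8 logical cores, as on the machine described above.
# per_worker_gb=3 is a placeholder estimate, not a measured value.
n_jobs = pick_n_jobs(memory_gb=16, per_worker_gb=3, logical_cores=8)
print(n_jobs)  # 4
```

The result can then be passed in place of `-1`, as in `GridSearchCV(estimator=pipeline, param_grid=param_grid, n_jobs=n_jobs, scoring='roc_auc')`.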

jruhym commented Mar 1, 2017

[Image: thrashing]

jruhym commented Mar 2, 2017

Relates to #70

jruhym closed this as completed Mar 2, 2017