Hi there,
I am trying to run pySCENIC on a large dataset (151k cells × 16k genes) on an HPC cluster, with 200 GB of memory and 8 CPUs per task.
Currently I am computing the adjacency matrix with the arboreto_with_multiprocessing.py script using 20 workers, since Dask was giving me memory allocation warnings.
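For reference, the adjacency step is being launched roughly like this (the expression matrix and TF list file names below are placeholders, not my actual files):

```bash
# GRN inference via the multiprocessing wrapper shipped with pySCENIC,
# used here instead of the dask-based arboreto backend
arboreto_with_multiprocessing.py \
    expr_mat.loom \          # expression matrix (placeholder name)
    allTFs_hg38.txt \        # list of transcription factors (placeholder name)
    --method grnboost2 \
    --output adjacencies.tsv \
    --num_workers 20
```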
So far it seems to be working, though it has been running for around 24 hours and the computation is still only at 13%.
I was wondering if you have experience running pySCENIC on large datasets and could give me some advice on optimizing the computation (and, in your experience, how long you would expect the different steps to take for a dataset of this size).
Thanks!