Hi there,
I am trying to run pySCENIC on a large dataset (151k cells × 16k genes) on an HPC cluster, with 200 GB of memory and 8 CPUs per task.
Currently I am computing the adjacency matrix with the arboreto_with_multiprocessing.py script using 20 workers, since Dask was giving me memory allocation warnings.
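For reference, the adjacency step is being launched roughly like this (the expression matrix and TF list file names below are placeholders, not my actual files):

```bash
# GRN inference via the multiprocessing wrapper shipped with pySCENIC,
# used here instead of the dask-based arboreto backend
arboreto_with_multiprocessing.py \
    expr_mat.loom \          # expression matrix (placeholder name)
    allTFs_hg38.txt \        # list of transcription factors (placeholder name)
    --method grnboost2 \
    --output adjacencies.tsv \
    --num_workers 20
```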
So far it seems to be working, though it has been running for around 24 hours and the computation is still only at 13%.
I was wondering if you have experience running pySCENIC on large datasets and could give me some advice on optimizing the computation (and, in your experience, how long you would expect the different steps to take for a dataset of this size).
Thanks!