Basecalling/demux+modbases: std::bad_alloc #1213

Open
sklages opened this issue Jan 9, 2025 · 0 comments

sklages commented Jan 9, 2025

I am seeing a strange issue with dorado v0.8.3 when basecalling and demultiplexing in sup mode with modbases on an Nvidia A100 (40 GB).

[2025-01-07 07:44:46.218] [info] Running:
"basecaller" "sup,5mCG_5hmCG" "/dev/shm/mxqd/mnt/job/53410651"
"--device" "cuda:all" "--batchsize" "0" "--trim" "all" "--verbose"
"--kit-name" "SQK-NBD114-96" "--sample-sheet" "/path/to/samplesheet.csv"
[2025-01-07 07:44:46.566] [debug] set models directory to: 
'/path/to/v0.8.3-Release/models' from 'DORADO_MODELS_DIRECTORY' environment variable
[2025-01-07 07:44:46.657] [info] > Creating basecall pipeline

<..>

[2025-01-09 11:59:27.922] [info] > Simplex reads basecalled: 152677113
[2025-01-09 11:59:27.922] [info] > Simplex reads filtered: 3048
[2025-01-09 11:59:27.922] [info] > Basecalled @ Samples/s: 7.592271e+06
[2025-01-09 11:59:27.922] [debug] > Including Padding @ Samples/s: 1.051e+07 (72.25%)
[2025-01-09 11:59:27.922] [info] > 154785331 reads demuxed @ classifications/s: 8.230199e+02
[2025-01-09 11:59:27.922] [debug] Barcode distribution :
[2025-01-09 11:59:27.922] [debug] SQK-NBD114-96_barcode70 : 48370444
[2025-01-09 11:59:27.922] [debug] SQK-NBD114-96_barcode71 : 37403828
[2025-01-09 11:59:27.922] [debug] SQK-NBD114-96_barcode72 : 32852492
[2025-01-09 11:59:27.922] [debug] SQK-NBD114-96_barcode73 : 29870372
[2025-01-09 11:59:27.922] [debug] unclassified : 6288195
[2025-01-09 11:59:27.941] [debug] Classified rate 95.93747%
[2025-01-09 12:00:07.015] [info] > Finished
terminate called after throwing an instance of 'std::bad_alloc'
  what():  std::bad_alloc

The crash occurs directly after basecalling has finished (after approx. 53 h).

  • Free disk space: 76 TB
  • RAM usage stayed below 20 GB (system RAM is 384 GB; single user, single job)

This happened with two datasets, both short-insert libraries with many reads. I have never seen this with dorado before.

What caused dorado to crash immediately after basecalling finished?

The result files seem to be complete, though, e.g.:

# dorado reports
[info] > 154785331 reads demuxed

# samtools reports on BAM file
154785331 + 0 primary
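
The completeness check can be scripted. The following is a minimal sketch, not part of dorado or samtools: it assumes the log lines keep the format shown in the excerpt above and that the samtools line has the `samtools flagstat` shape (`<passed> + <failed> primary`). The function names (`demuxed_reads`, `barcode_counts`, `primary_reads`) are hypothetical helpers introduced only for illustration. It compares the demuxed read count, the sum of the per-barcode counts, and the primary read count in the BAM file.

```python
import re

# Log lines copied from the excerpt above (timestamps kept as logged).
DORADO_LOG = """\
[2025-01-09 11:59:27.922] [info] > 154785331 reads demuxed @ classifications/s: 8.230199e+02
[2025-01-09 11:59:27.922] [debug] Barcode distribution :
[2025-01-09 11:59:27.922] [debug] SQK-NBD114-96_barcode70 : 48370444
[2025-01-09 11:59:27.922] [debug] SQK-NBD114-96_barcode71 : 37403828
[2025-01-09 11:59:27.922] [debug] SQK-NBD114-96_barcode72 : 32852492
[2025-01-09 11:59:27.922] [debug] SQK-NBD114-96_barcode73 : 29870372
[2025-01-09 11:59:27.922] [debug] unclassified : 6288195
"""

# Assumed to be the "primary" line of `samtools flagstat` output.
FLAGSTAT_LINE = "154785331 + 0 primary"


def demuxed_reads(log: str) -> int:
    """Return the read count from the '> N reads demuxed' log line."""
    match = re.search(r"> (\d+) reads demuxed", log)
    if match is None:
        raise ValueError("no 'reads demuxed' line found")
    return int(match.group(1))


def barcode_counts(log: str) -> dict[str, int]:
    """Collect the 'name : count' lines that follow 'Barcode distribution'."""
    counts: dict[str, int] = {}
    in_block = False
    for line in log.splitlines():
        if "Barcode distribution" in line:
            in_block = True
            continue
        if in_block:
            match = re.search(r"\[debug\] (\S+) : (\d+)$", line)
            if match is None:
                break  # first line that is not a count ends the block
            counts[match.group(1)] = int(match.group(2))
    return counts


def primary_reads(flagstat_line: str) -> int:
    """Return the QC-passed primary read count from a flagstat-style line."""
    match = re.match(r"(\d+) \+ (\d+) primary", flagstat_line)
    if match is None:
        raise ValueError("not a flagstat 'primary' line")
    return int(match.group(1))


if __name__ == "__main__":
    demuxed = demuxed_reads(DORADO_LOG)
    counts = barcode_counts(DORADO_LOG)
    classified = sum(n for name, n in counts.items() if name != "unclassified")
    print("demuxed reads:     ", demuxed)
    print("sum over barcodes: ", sum(counts.values()))
    print("primary in BAM:    ", primary_reads(FLAGSTAT_LINE))
    print("classified rate %: ", 100.0 * classified / demuxed)
```

If all three counts agree, the output was written in full before the exception was thrown, which is what the numbers above suggest.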

Any idea what is going wrong here and what is causing the std::bad_alloc?
