
SHOGUN silently produces empty output alignment when BURST segfaults #18

Open
tanaes opened this issue Apr 20, 2018 · 19 comments

tanaes commented Apr 20, 2018

Hey guys,

We've been trying to track down a problem while adapting SHOGUN to Qiita, the symptom of which was finding this message when running integration tests in Travis:

+   File "/home/travis/build/qiita-spots/qp-shotgun/miniconda3/envs/qp-shotgun/lib/python3.5/site-packages/pandas/core/groupby.py", line 2934, in _get_grouper
+     raise KeyError(gpr)
+ KeyError: 'summary'

@antgonza was also hitting the same error on his OS X install, but neither I (on Barnacle) nor @semarpetrus (on his Linux box) could reproduce it.

Running SHOGUN directly using the following commands yielded a good alignment + downstream files on Barnacle:

aln_out=foo.align
database=/home/jgsanders/git_sw/qp-shotgun/qp_shotgun/shogun/databases/shogun
level=species
aligner=burst
threads=8
profile=profile.tsv
aln_out_fp=foo.align/alignment.burst.b6
redistributed="profile.${level}.tsv"
fun_output=functional

shogun align \
--aligner ${aligner} \
--threads ${threads} \
--database ${database} \
--input combined.fna \
--output ${aln_out}

shogun assign_taxonomy \
--aligner ${aligner} \
--database ${database} \
--input ${aln_out_fp} \
--output ${profile}

shogun redistribute \
--database ${database} \
--level ${level} \
--input ${profile} \
--output ${redistributed}

fun_level=$level
shogun functional \
--database ${database} \
--input ${profile} \
--output ${fun_output} \
--level ${fun_level}

where the test database is here and the input data are here

The same align command on an OS X box (using Gabe's supplied burst15 binary) ran for a bit and then produced an empty .b6 output file.

Running BURST directly on the OS X box produced the following output:

burst15 --references qp_shotgun/shogun/databases/shogun/burst/5min.edx --queries combined.fna  --output test.b6 --accelerator qp_shotgun/shogun/databases/shogun/burst/5min.acx
This is BURST [v0.99.7LL]
 --> Using accelerator file qp_shotgun/shogun/databases/shogun/burst/5min.acx
Using up to AVX-128 with 8 threads.
 --> [Accel] Accelerator found. Parsing...
 --> [Accel] Total accelerants: 805949 [bytes = 2106932]
 --> [Accel] Reading 0 ambiguous entries

EDB database provided. Parsing...
 --> EDB: Fingerprints are DISABLED
 --> EDB: Parsing compressed headers
 --> EDB: Sheared database (shear size = 515)
 --> EDB: 970 refs [970 orig], 61 clumps, 1030 maxR
Parsed 400000 queries (0.071752). Calculating minMax...
Found min 150, max 150 (0.000109).
Converting queries... Converted (0.007549)
Copying queries... Copied (0.002561)
Sorting queries... Sorted (0.088294)
Copying indices... Copied (0.001531)
Determining uniqueness... Done (0.007544). Number unique: 397338
Collecting unique sequences... Done (0.001721)
Creating data structures... Done (0.004528) [maxED: 4]
Determining query ambiguity... Determined (0.023589)
Creating bins... Created (0.011927); Unambig: 391663, ambig: 5675, super-ambig: 0 [5675,397338,397338]
Re-sorting... Re-sorted (0.194431)
Calculating divergence... Calculated (0.009815) [10.120026 avg div; 150 max]
Fingerprints not enabled
Setting QBUNCH to 16
Using ACCELERATOR to align 397338 unique queries...
Search Progress: [100.00%]
Search complete. Consolidating results...
Segmentation fault: 11

What do you think?

GabeAl commented Apr 20, 2018 via email

antgonza commented Apr 20, 2018

In Travis, we get between 4 GB and 7.5 GB of RAM. Note that we are using sudo-enabled builds (more info).

Locally, I have a MacBookPro14,3 with 16 GB of RAM.

bhillmann (Collaborator) commented

@tanaes SHOGUN doesn't pick up the failed signal from BURST? Python's subprocess call should log it.

tanaes (Author) commented Apr 20, 2018

Under default parameters, it gave no output to STDOUT or STDERR, just produced an empty alignment file.
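
One way to surface this kind of failure from the shell, independent of the Python wrapper, is to check BURST's exit status and the output file directly; under bash a segfault shows up as exit status 139 (128 + SIGSEGV). A minimal sketch, using the filenames from the runs above:

# Run the aligner and refuse to continue on a crash or an empty alignment.
burst15 --references 5min.edx --accelerator 5min.acx \
        --queries combined.fna --output test.b6
status=$?
if [ "$status" -ne 0 ] || [ ! -s test.b6 ]; then
    echo "BURST failed (exit $status) or produced an empty alignment" >&2
    exit 1
fi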

GabeAl commented Apr 20, 2018

What command was used to build the database? Also, does the attached Linux binary (compiled from the same code used to compile the Mac binary) work on your high-RAM Linux systems? I'm trying to rule out the database-creation commands, as well as code differences relative to the older existing Linux version.

burst15.zip

I ran
burst15 -r 5min.fna -a 5min.acx -o 5min.edx -d DNA -s

Then aligned with
burst15 -r 5min.edx -a 5min.acx -q combined.fna -o test.b6

According to my run with /usr/bin/time -v, this took 12 GB of RAM. Insufficient RAM might then explain the Travis failure, but it's unclear what's causing the Mac failure (unless you had over 4 GB consumed by other programs at runtime, leaving less than 12 GB for burst15).

BURST15 will always reserve ~8GB (the size of the index table in the "database15" mode, adjusted for number of threads) plus the size of the database itself (minimum 4GB), so it'll yank 12GB to run (burst12 can run in under 128MB so that's the one recommended for laptops!).
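
For reference, the peak-memory measurement described above can be reproduced with GNU time on Linux; the figure appears in the "Maximum resident set size" line (in kilobytes). On macOS, /usr/bin/time -l reports the equivalent in bytes. A sketch using the alignment command from this comment:

/usr/bin/time -v burst15 -r 5min.edx -a 5min.acx -q combined.fna -o test.b6 2> time.log
grep "Maximum resident set size" time.log   # peak RSS in kB (GNU time)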

antgonza commented

Thanks! I'll let @tanaes answer those specific questions. Just out of curiosity, will 15/12 yield the same results? Either way, what are the differences?

tanaes (Author) commented Apr 23, 2018

@GabeAl The attached binary does indeed segfault on our high-memory Linux machine. Here's the output (here, ./burst15 is the binary attached above):

☕  barnacle:qp-shotgun $ ./burst15 \
> --references qp_shotgun/shogun/databases/shogun/burst/5min.edx \
> --queries combined.fna  \
> --output test.b6 \
> --accelerator qp_shotgun/shogun/databases/shogun/burst/5min.acx
This is BURST [v0.99.7LL]
 --> Using accelerator file qp_shotgun/shogun/databases/shogun/burst/5min.acx
Using up to AVX-128 with 24 threads.
 --> [Accel] Accelerator found. Parsing...

 --> [Accel] Total accelerants: 805949 [bytes = 2106932]
 --> [Accel] Reading 0 ambiguous entries

EDB database provided. Parsing...
 --> EDB: Fingerprints are DISABLED
 --> EDB: Parsing compressed headers
 --> EDB: Sheared database (shear size = 515)
 --> EDB: 970 refs [970 orig], 61 clumps, 1030 maxR
Parsed 400000 queries (0.089528). Calculating minMax... 
Found min 150, max 150 (0.000125).
Converting queries... Converted (0.007726)
Copying queries... Copied (0.004054)
Sorting queries... Sorted (0.125254)
Copying indices... Copied (0.000616)
Determining uniqueness... Done (0.004894). Number unique: 397338
Collecting unique sequences... Done (0.001327)
Creating data structures... Done (0.006473) [maxED: 4]
Determining query ambiguity... Determined (0.012322)
Creating bins... Created (0.012095); Unambig: 391663, ambig: 5675, super-ambig: 0 [5675,397338,397338]
Re-sorting... Re-sorted (0.322825)
Calculating divergence... Calculated (0.007467) [10.120026 avg div; 150 max]
Fingerprints not enabled
Setting QBUNCH to 16
Using ACCELERATOR to align 397338 unique queries...
Search Progress: [100.00%]
Search complete. Consolidating results...
Segmentation fault (core dumped)
☕  barnacle:qp-shotgun $ ls
burst15  combined.fna  LICENSE  qp_shotgun  README.rst  scripts  setup.py  support_files  test  test.b6
☕  barnacle:qp-shotgun $ ~/miniconda/envs/oecophylla-shogun/bin/burst15 \
> --references qp_shotgun/shogun/databases/shogun/burst/5min.edx \
> --queries combined.fna  \
> --output test.b6 \
> --accelerator qp_shotgun/shogun/databases/shogun/burst/5min.acx
This is BURST [v0.99.7f]
 --> Using accelerator file qp_shotgun/shogun/databases/shogun/burst/5min.acx
Using up to AVX-128 with 24 threads.
 --> [Accel] Accelerator found. Parsing...
 --> [Accel] Total accelerants: 805949 [bytes = 2106932]
 --> [Accel] Reading 0 ambiguous entries

EDB database provided. Parsing...
 --> EDB: Fingerprints are DISABLED
 --> EDB: Parsing compressed headers
 --> EDB: Sheared database (shear size = 515)
 --> EDB: 970 refs [970 orig], 61 clumps, 1030 maxR
Parsed 400000 queries (0.085349). Calculating minMax... 
Found min 150, max 150 (0.000108).
Converting queries... Converted (0.007505)
Copying queries... Copied (0.004179)
Sorting queries... Sorted (0.131057)
Copying indices... Copied (0.006557)
Determining uniqueness... Done (0.006628). Number unique: 397338
Collecting unique sequences... Done (0.005024)
Creating data structures... Done (0.007195) [maxED: 4]
Determining query ambiguity... Determined (0.018151)
Creating bins... Created (0.016560); Unambig: 391663, ambig: 5675, super-ambig: 0 [5675,397338,397338]
Re-sorting... Re-sorted (0.340644)
Calculating divergence... Calculated (0.007354) [10.120026 avg div; 150 max]
Fingerprints not enabled
Setting QBUNCH to 16
Using ACCELERATOR to align 397338 unique queries...
Search Progress: [100.00%]
Search complete. Consolidating results...
CAPITALIST: Processed 329 investments

Alignment time: 42.566155 seconds

What's the difference, again, between burst12 and burst15? Does the database need to be reindexed for one vs the other?

GabeAl commented Apr 23, 2018

This is indeed interesting. Could you share the commandline that was used to make the burst database? It seems to differ from what I used here: burst15 -r 5min.fna -a 5min.acx -o 5min.edx -d DNA -s

In any case, there may be a combination bug that arises from some mix of DB commandline and the most recent changes to CAPITALIST (and/or tallying reads in general).

A couple of questions to help me home in:

  • Does it crash if you use "-m BEST"? (see the sketch after this list)
  • What commandline was used to generate the burst database (.acx and .edx)?
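
For the first of these, presumably one would re-run the failing alignment with the mode flag appended; whether -m combines with the long-form flags exactly like this is an assumption:

burst15 --references qp_shotgun/shogun/databases/shogun/burst/5min.edx \
        --queries combined.fna \
        --output test.b6 \
        --accelerator qp_shotgun/shogun/databases/shogun/burst/5min.acx \
        -m BEST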

GabeAl commented Apr 23, 2018

As for the difference between burst12 and burst15: burst12 is primarily intended for amplicon databases. It uses a much more RAM-friendly indexing scheme for small databases. For large (>4GB) databases, burst15 is recommended for speed.

As such, while the "edx" will work fine between the two versions, the "acx" is specific to one or the other (whichever version was used to make it).
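
In practice that means rebuilding the accelerator with whichever binary will do the alignment. A sketch, assuming burst12 accepts the same build flags as the burst15 command quoted earlier:

# Rebuild so the .acx matches the burst12 aligner; flag compatibility
# with burst15 is an assumption.
burst12 -r 5min.fna -a 5min.acx -o 5min.edx -d DNA -s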

tanaes (Author) commented Apr 23, 2018 via email

bhillmann (Collaborator) commented

Were you able to solve your problem by rebuilding the database?

H2CO3 commented Apr 29, 2020

I ran into a similar issue. I wasn't able to get SHOGUN working with burst, since the latest official release of burst, v0.99.8, didn't even compile on my Linux machine (the source release contains syntax errors!).

So I installed bowtie2 and I ran SHOGUN with --aligner bowtie2. It kept crunching for about 18 minutes (htop was showing that the bowtie2 process was running), then I got the KeyError: 'summary' exception from Python. I don't know if bowtie2 segfaulted though.

GabeAl commented Apr 30, 2020 via email

GabeAl commented Apr 30, 2020 via email

GabeAl commented Apr 30, 2020

Also, what were the commands run to produce the database itself? Databases aren't compatible across major BURST releases.

GabeAl commented Apr 30, 2020

> What's the difference, again, between burst12 and burst15? Does the database need to be reindexed for one vs the other?

Yes. DB15 and DB12 have fundamentally different database structures, and major releases of BURST (lettered releases are minor, numbered are major) may also have incompatibilities. This should be detected when an older database, or one made with a different DB version of BURST, is used. I believe later versions of BURST (i.e. newer than the 0.97 series) do this detection automatically, but perhaps SHOGUN should implement this check in the wrapper first, or warn when pointed at a DB it knows shipped with an earlier version.
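
A wrapper-side guard along those lines could be as simple as comparing version banners. This is purely a sketch: it assumes the banner ("This is BURST [v0.99.7LL]" in the logs above) is printed on startup even without valid arguments, and the VERSION file recording the build-time version is hypothetical:

# Hypothetical check: compare the aligner's banner against the version
# recorded when the database was built.
expected=$(cat databases/shogun/burst/VERSION)   # e.g. v0.99.7LL (hypothetical file)
actual=$(burst15 2>&1 | grep -o 'v[0-9][^]]*' | head -n 1)
if [ "$actual" != "$expected" ]; then
    echo "BURST $actual does not match database version $expected" >&2
    exit 1
fi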

DB12 is for low-RAM alignment. It is slower and primarily intended for amplicons. Burst15 is for higher-RAM alignment and intended for shotgun data. This is vaguely similar to the difference between bowtie2-align-s and bowtie2-align-l, which are also non-interchangeable, but the Python wrapper "bowtie2" sorts out which should be called with which.

H2CO3 commented Apr 30, 2020

@GabeAl Hey, no, thank you for getting back to this!

Just to put this in context: I'm familiar with building C code from source, and it's not an unsupported assembly extension. The particular syntax error I noticed was a missing closing curly brace here. After I added the closing curly on the next line, the compiler went ahead and complained about a type error here, which is an assignment of a QPod to a value of type QPod *; judging from the surrounding code, it's probably a missing dereference. Then there is the redeclaration of numBins, RefCache and StCache here; I could imagine that the latter is something the Intel compiler accepts. After I removed those, the code compiled just fine using -march=native with GCC 7.1. (I must admit, it might not behave the same as it does under the Intel cc, though.)

I have since tried SHOGUN with the Linux binary downloadable from the same release (which advertises itself as burst15), with no success, unfortunately. Based on what several others suggested above, it might very well be that I simply don't have enough RAM; I'll be able to check this possibility soon, once I have access to a beefier machine. I have 8 GB in my Linux box, which seems to be close but no cigar.

I didn't build the databases myself; I simply downloaded the pre-built ones, as suggested by the very last paragraph of this part of the README.

Cheers,
Árpád

GabeAl commented Apr 30, 2020

Thanks H2CO3!

Oh I see -- the current source indeed looks like it's for a WIP version, and updates stopped after that. Later versions (completing the WIP, going into the 0.99.8 series, etc.) must never have gotten pushed. I will push my local copy up.

Done. Let me know.

Cheerio,
Gabe

H2CO3 commented Apr 30, 2020

Awesome, thanks for that!
