Silent failure with nw_trace_scan_16 for larger sequences #46

Open
ifiddes opened this issue Nov 7, 2019 · 1 comment

Comments


ifiddes commented Nov 7, 2019

I tried aligning an 81 kb sequence to a 101 kb sequence with nw_trace_scan_16, and the resulting CIGAR is just 1X. If I repeat this with nw_trace_scan_32, it works fine. Is this intended behavior?

Are there any heuristics I can use to predict which bit width I can use based on the lengths of the sequences?

@jeffdaily
Owner

You might try nw_trace_scan_sat, where "sat" stands for "saturate". It will try the _8 first, then _16, then _32. It will try to detect integer overflow during the calculation, where the bits for an integer fully saturate (all 1's). Unfortunately, if saturation is detected near the end of the _16 attempt, that's a lot of wasted work. So your performance might suffer.
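
To illustrate the fallback idea, here is a toy sketch in Python. It is not parasail's code: it only mimics "try a narrow width, detect saturation, retry with a wider one" on a simple running sum rather than a real alignment. The names `saturating_sum` and `sum_with_fallback` are made up for this example, and treating "reached the limit" as "saturated" is an assumption chosen to keep the check conservative.

```python
def saturating_sum(values, bits):
    """Sum values with signed saturation at the given bit width.

    Returns (total, saturated). Reaching either limit counts as saturated,
    since a clamped value can no longer be told apart from a true one.
    """
    hi = 2 ** (bits - 1) - 1      # e.g. 32767 for 16 bits
    lo = -(2 ** (bits - 1))       # e.g. -32768 for 16 bits
    total = 0
    saturated = False
    for v in values:
        total += v
        if total >= hi:
            total, saturated = hi, True
        elif total <= lo:
            total, saturated = lo, True
    return total, saturated


def sum_with_fallback(values, widths=(8, 16, 32)):
    """Try each width in order; return (total, bits) for the first that fits."""
    for bits in widths:
        total, saturated = saturating_sum(values, bits)
        if not saturated:
            return total, bits
    raise OverflowError("result does not fit in any of the given widths")


# 101,000 increments of 1 saturate at 8 and 16 bits, then succeed at 32.
print(sum_with_fallback([1] * 101_000))
```

Note that the 8-bit and 16-bit passes each walk the whole input before the 32-bit pass succeeds, which is the wasted work mentioned above.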

Honestly, _32 should be the widest bit width you need for most alignments. The _64 will be quite slow because most CPU ISAs don't have a complete set of 64-bit integer vector operations, so I'm forced to emulate them with something slower.

A good heuristic would be to estimate what the biggest score would be if your two sequences were to align perfectly. A perfect alignment would score max(length(A), length(B)) * match score. If that score fits into a 16-bit integer, you're probably okay. A signed 16-bit integer can store up to 2^15 - 1 = 32767 (an unsigned one up to 2^16 - 1 = 65535). A signed 32-bit integer can store up to 2^31 - 1 = 2147483647. Your longest sequence was 101,000 characters, so even with a match score of 1 the perfect score is already larger than a 16-bit value can store.
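
The heuristic above could be sketched in Python roughly as follows. This is only an illustration: `smallest_bit_width` is a hypothetical helper, not part of parasail, and it assumes signed integers and a single positive match score (for a substitution matrix, the largest entry would be the natural stand-in). It only checks the best-case score, so it says nothing about how negative a score can get.

```python
def smallest_bit_width(len_a, len_b, match_score):
    """Pick the narrowest signed width that can hold the best-case score.

    Best case is estimated as max(len_a, len_b) * match_score.
    """
    best_case = max(len_a, len_b) * match_score
    for bits in (8, 16, 32, 64):
        if best_case <= 2 ** (bits - 1) - 1:   # signed maximum for this width
            return bits
    raise ValueError("best-case score does not fit in 64 bits")


# The sequences from this issue, assuming a match score of 1:
print(smallest_bit_width(81_000, 101_000, 1))   # 32
```

A result of 16 here would only mean the _16 variant is probably safe; when in doubt, going one width up costs some speed but avoids a silent overflow.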
