I tried aligning an 81kb sequence to a 101kb sequence with nw_trace_scan_16 and the resulting CIGAR is just 1X. If I repeat this with nw_trace_scan_32, it works fine. Is this intended behavior?
Are there any heuristics I can use to predict which bit size to use based on the lengths of the sequences?
You might try nw_trace_scan_sat, where "sat" stands for "saturate". It will try the _8 first, then _16, then _32. It will try to detect integer overflow during the calculation, where the bits for an integer fully saturate (all 1's). Unfortunately, if saturation is detected near the end of the _16 attempt, that's a lot of wasted work. So your performance might suffer.
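To illustrate the "try narrow, fall back to wider" idea, here is a toy sketch in Python. It is not parasail's implementation: the helper names `saturating_add` and `score_with_fallback` are hypothetical, signed integers are assumed, and a running sum of score increments stands in for the real dynamic-programming table. It only shows why a late saturation wastes the work done at the narrower width.

```python
def saturating_add(a, b, bits):
    """Add two values, clamping to the signed range of the given width."""
    hi = 2 ** (bits - 1) - 1
    lo = -(2 ** (bits - 1))
    return max(lo, min(hi, a + b))


def score_with_fallback(increments, widths=(8, 16, 32)):
    """Accumulate increments at each width in turn (hypothetical helper).

    If the running total ever hits the limit of the current width, the
    attempt is abandoned and everything is recomputed at the next width.
    Reaching the limit exactly is treated as saturation, which is
    conservative.
    """
    for bits in widths:
        hi = 2 ** (bits - 1) - 1
        lo = -(2 ** (bits - 1))
        total = 0
        saturated = False
        for inc in increments:
            total = saturating_add(total, inc, bits)
            if total == hi or total == lo:
                saturated = True  # all work at this width is thrown away
                break
        if not saturated:
            return total, bits
    raise OverflowError("score does not fit in any of the given widths")


# 101,000 matches at +1 each: the 8-bit and 16-bit attempts both saturate.
total, bits = score_with_fallback([1] * 101_000)
print(total, bits)
```

In this toy, the 16-bit attempt gets through 32,767 additions before it is discarded, which is the kind of wasted work described above.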
Honestly, _32 should be the widest bit width you need for most alignments. The _64 will be quite slow because most CPU ISAs don't have a complete set of 64-bit integer vector operations, so I'm forced to emulate them with something slower.
A good heuristic is to estimate the largest score you'd get if your two sequences aligned perfectly: max(length(A), length(B)) * match score. If that value fits into a 16-bit integer, you're probably okay. A 16-bit integer has 2^16 = 65536 distinct values, so a signed one tops out at 2^15 - 1 = 32767. A 32-bit integer has 2^32 = 4294967296 distinct values, with a signed maximum of 2^31 - 1 = 2147483647. Your longest sequence was 101,000 characters, so even with a match score of 1 the perfect score is already larger than a 16-bit value can store.
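The heuristic above can be sketched in a few lines of Python. The function name `min_bit_width` is hypothetical, and the sketch assumes scores are stored as signed integers. It only checks the best-case (upper) score, so treat the result as a rough guide rather than a guarantee: large negative scores from mismatches or gaps are not considered here.

```python
def min_bit_width(len_a, len_b, match_score, widths=(8, 16, 32, 64)):
    """Return the smallest signed width that can hold the best-case score.

    Best case follows the heuristic: max(len_a, len_b) * match_score.
    """
    best_case = max(len_a, len_b) * match_score
    for bits in widths:
        signed_max = 2 ** (bits - 1) - 1  # e.g. 32767 for 16 bits
        if best_case <= signed_max:
            return bits
    raise ValueError("best-case score does not fit in any given width")


# The sequences from the question, assuming a match score of 1.
print(min_bit_width(81_000, 101_000, match_score=1))
```

For the 81kb and 101kb sequences, the best-case score of 101,000 exceeds 32767, so the sketch suggests the _32 variant.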