-
Notifications
You must be signed in to change notification settings - Fork 16
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Calling coverage on a BAM re-fetches the header every time #56
Comments
@ah19 does this mean we just need to change that |
I believe so -- I verified it was making multiple requests by turning the htslib (and maybe curl) logging level up, and after making the change to I did only checked this was working against a single bam file, and I wasn't sure if there was a reason why it used |
Fair point. @rishidev is the best bet for knowing the answer if there was an intention to use |
Thanks for submitting and bumping this issue. It does raise an area which could be clarified. Firstly, header_read() over header was a definite decision that was made. The underlying issue is to do with BAM vs CRAM header objects being different HTSlib objects requiring different handling via a common mechanism. I believe the functions here need a fresh header object to work correctly. In HTS.xs, header_read() rewinds the header for BAM files (it is a file pointer, and the header is at the start of the file, so the structure is available by knowing the right place) but does a complete new header fetch for the CRAM files (it is an actual C structure). IIRC Using the cached version for the header caused issues for one, or both, of BAM and CRAM, and they needed resetting for correct functionality. The header_read() in fact used to be just header() - the name change was to reflect there was a matching header_write() and that in the CRAM case a definite read was made from the file. I'll try and go back over notes and provide further clarification. There is certainly scope for investigation here to make this work neater. |
A possible performance issue rather than a bug:
https://github.com/Ensembl/Bio-DB-HTS/blob/master/lib/Bio/DB/HTS.pm#L1951
Calling
$self->header
uses the cached versionThe text was updated successfully, but these errors were encountered: