suggested API #5
Comments
I agree with all, but I'll leave 2) as an exercise for someone else since it is a lot of work for what is likely to be a surprisingly small gain in speed (try it yourself with
I already support using flags. See here: https://github.com/pyranges/bamread/blob/master/bamread/src/bamread.pyx#L17
I should also support writing bams, but I use bams ever so seldom that I've never gotten around to it (and I am not completely sure how to do it either).
The current functions contain some extra stuff to make sure the data is valid for pyranges like
I guess I should create new similar ones to use for the standalone library so that it acts completely like pysam.
(Just writing notes to self here) I guess we can alias
Worst case we need to change this to not be binary I think:
@endrebak asked for suggestions on a possible API to make this a more general purpose read-my-bam tool.
My thought is a `read_bam` function that returns a pandas.DataFrame (similar to what is already here) but that takes some optional arguments:
- `fields`. For example, `fields=["Chromosome", "Start", "End", "Strand"]` would only read in the specified columns and return a DataFrame with only those columns, similar to `usecols` in `pandas.read_csv`.
- `regions`. For example, `regions=[("chr1", 100, 10000)]` would subselect to `chr1:100-10000`.
- `only_mapped`. Passing `only_mapped=True` would be the equivalent of passing `-F 4` to samtools. I think it is really helpful to use named parameters here rather than making the user do bit arithmetic with binary flag codes. Basically implement this as named arguments.
- `max_alignments`, so the user can read just the first 10 records by passing `max_alignments=10`.
I think one function that implements this would handle the majority of my use cases for reading BAMs in Python and provide a much simpler API to get started with and use than pysam.
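To make the intended semantics concrete, here is a minimal sketch of how these keyword arguments could behave. It is not bamread's actual API: `select_alignments` is a hypothetical helper that works on plain dict records instead of a BAM file, it returns a list of dicts rather than a DataFrame, and it assumes `Start`/`End` are 0-based half-open and that a region keeps any record overlapping it. The only fact taken from the SAM format is that flag bit `0x4` marks an unmapped read, which is what `-F 4` excludes.

```python
from itertools import islice

UNMAPPED = 0x4  # SAM flag bit for "read unmapped"; `samtools view -F 4` drops these


def select_alignments(records, fields=None, regions=None,
                      only_mapped=False, max_alignments=None):
    """Hypothetical helper: apply the proposed named arguments to dict records."""

    def keep(rec):
        # Named argument instead of asking the user for a raw flag mask.
        if only_mapped and rec["Flag"] & UNMAPPED:
            return False
        # Assumption: keep a record if it overlaps any requested region.
        if regions is not None:
            return any(
                rec["Chromosome"] == chrom and rec["Start"] < end and rec["End"] > start
                for chrom, start, end in regions
            )
        return True

    kept = (rec for rec in records if keep(rec))
    # islice with stop=None imposes no limit, so max_alignments is optional.
    kept = islice(kept, max_alignments)
    if fields is None:
        return list(kept)
    # Like usecols in pandas.read_csv: keep only the requested columns.
    return [{name: rec[name] for name in fields} for rec in kept]


# Made-up records for illustration only.
records = [
    {"Chromosome": "chr1", "Start": 150, "End": 250, "Strand": "+", "Flag": 0},
    {"Chromosome": "chr1", "Start": 300, "End": 400, "Strand": "-", "Flag": 16},
    {"Chromosome": "chr2", "Start": 50, "End": 150, "Strand": "+", "Flag": 0},
    {"Chromosome": "chr1", "Start": 500, "End": 600, "Strand": "+", "Flag": 4},
]

subset = select_alignments(
    records,
    fields=["Chromosome", "Start", "End", "Strand"],
    regions=[("chr1", 100, 10000)],
    only_mapped=True,
    max_alignments=10,
)
```

In a real implementation the filtering would presumably happen while iterating over the BAM, so that `max_alignments` and `only_mapped` avoid reading records that are never returned.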