Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Secondary alignment tracking for anvi-profile #2371

Open
wants to merge 12 commits into
base: master
Choose a base branch
from

Conversation

meren
Copy link
Member

@meren meren commented Nov 9, 2024

This PR adds functionality to track secondary alignments in BAM files for accurate reporting of coverages if the user requested from the mapping software for individual reads to be aligned to multiple places.

Partially addressing the feature request by @FlorianTrigodet in #2369.

In its current state, the PR is partially complete as there is a known bug in it that in some cases leads to the following error when profiling SNVs:

Traceback (most recent call last):
  File "/Users/meren/github/anvio/bin/anvi-profile", line 130, in <module>
    main(args)
  File "/Users/meren/github/anvio/anvio/terminal.py", line 926, in wrapper
    program_method(*args, **kwargs)
  File "/Users/meren/github/anvio/bin/anvi-profile", line 38, in main
    profiler.BAMProfiler(args)._run()
  File "/Users/meren/github/anvio/anvio/profiler.py", line 593, in _run
    self.profile_single_thread()
  File "/Users/meren/github/anvio/anvio/profiler.py", line 1066, in profile_single_thread
    contig = self.process_contig(bam_file, contig_name, contig_length)
  File "/Users/meren/github/anvio/anvio/profiler.py", line 1019, in process_contig
    split.auxiliary.process(bam_file)
  File "/Users/meren/github/anvio/anvio/contigops.py", line 205, in process
    self.run_SNVs_and_indels(bam)
  File "/Users/meren/github/anvio/anvio/contigops.py", line 485, in run_SNVs_and_indels
    aligned_sequence_as_ord, reference_positions = read.get_aligned_sequence_and_reference_positions()
  File "/Users/meren/github/anvio/anvio/bamops.py", line 557, in get_aligned_sequence_and_reference_positions
    return _get_aligned_sequence_and_reference_positions(
ValueError: cannot assign slice from input of different size

Solution for this requires insights into whether we should carry over more information into the read instance for the secondary alignment for the primary alignment when we update the 'sequence' information at these locations:

        (...)
        for read in self.fetch(contig_name, start, end):
            if self.secondary_alignments_tracked and read.is_secondary:
                read.query_sequence = self.primary_alignments[read.query_name]
        (...)

We update query sequence, but probably we need to update one or more things to address the SNVs error. Until then, using the flag --skip-SNV-profiling along with --track-secondary-alignments is the only option to avoid catastrophic failures during the profiling of secondary alignments.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant