ENH: Display ambiguous mutations in gene/nuc view tooltip #700

Closed
corneliusroemer opened this issue Jan 25, 2022 · 6 comments · Fixed by #1273
Labels
package: nextclade_web package: nextclade prio:medium Do this soon t:feat Type: request of a new feature, functionality, enchancement

Comments

@corneliusroemer
Member

Ambiguous nucleotides are currently displayed only in the tooltip of the corresponding column:
image

In gene view, ambiguous bases turn the corresponding AA into an X. It would be nice if the X tooltip showed the corresponding ambiguous nucleotides, the way AA mutation tooltips show the mutated nucs.

Current:
image

Something like this would be cool, potentially even with the range of possible amino acids that are compatible with the ambiguity.
image

@corneliusroemer corneliusroemer added t:feat Type: request of a new feature, functionality, enchancement package: nextclade package: nextclade_web labels Jan 25, 2022
@ivan-aksamentov
Member

ivan-aksamentov commented Jan 27, 2023

The translation of ambiguous nucs is a cool feature we discussed with Richard even in the early days of Nextclade. It can be implemented quite easily by modifying the decoding map. Someone needs to actually build the map and put the additional triplet-to-AA entries above the catch-all case, which resolves to AA X. It would be too error-prone to do manually, but perhaps one could take some known-good Python lib with translation capabilities and print the required Rust entries using a small script.
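To make the idea concrete, here is a minimal sketch of such a script. It uses only the standard library rather than an existing translation lib, it assumes the standard genetic code (NCBI translation table 1), and the printed entry format is a hypothetical placeholder, not the actual layout of Nextclade's decoding map. The names `possible_amino_acids` and `unambiguous_translations` are made up for illustration.

```python
from itertools import product

# IUPAC nucleotide codes and the bases each one can stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

# Standard genetic code (NCBI table 1), codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def possible_amino_acids(codon):
    """Return the set of amino acids compatible with a possibly ambiguous codon."""
    choices = [IUPAC[base] for base in codon.upper()]
    return {CODON_TABLE["".join(c)] for c in product(*choices)}


def unambiguous_translations():
    """Map each ambiguous codon that still encodes exactly one amino acid to it."""
    result = {}
    for codon in map("".join, product(IUPAC, repeat=3)):
        if codon in CODON_TABLE:
            continue  # plain ACGT codons are assumed to be in the map already
        aas = possible_amino_acids(codon)
        if len(aas) == 1:
            result[codon] = aas.pop()
    return result


if __name__ == "__main__":
    for codon, aa in sorted(unambiguous_translations().items()):
        # Hypothetical entry format; adapt to the real Rust decoding map.
        print(f'"{codon}" => {aa!r},')
```

Codons that are compatible with more than one amino acid are left out, so they would still fall through to the catch-all X. `possible_amino_acids` can also be called directly to list the candidate amino acids for such a codon.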

What I find cool about this is that it will reduce the number of Xs and salvage some useful info even from mediocre data. That is always nice and honorable, because sequencing data is precious.

Showing how ambiguous nucs and AAs are linked is trickier. We currently don't track why a codon becomes an X, and there are multiple reasons why it can. Also, Xs are tracked as ranges, and a single range can itself contain multiple reasons, so some deeper code changes will be needed.

@corneliusroemer
Member Author

I was about to open a duplicate and then found this. I was surprised that we don't indicate ambiguous nucleotides in the nuc view, which is quite unexpected. I think we don't need to overcomplicate things: just add them like a single missing nuc.
image

@corneliusroemer corneliusroemer added the prio:medium Do this soon label Aug 31, 2023
@ivan-aksamentov
Member

ivan-aksamentov commented Aug 31, 2023

@corneliusroemer showing nuc markers is easy. I can add that. What colors would you like to see? We already have 3 shades of gray for deletions, Ns and unsequenced. And if not gray, then I guess some random pale colors?

However, calculating the correspondence between AAs and nucs is generally a more complex task. Plus, not all Xs come from ambiguous nucs.

We've improved our decoding recently to handle some of the ambiguities, so, hopefully, there should be fewer Xs coming from non-ACTGNs.

@corneliusroemer
Member Author

Some shade of grey would be the obvious choice I think. Doesn't matter so much what it is, as long as there's something that can be hovered to see extra information.

I don't think we need to add anything related to AA there, just showing that there might be something in nuc view is good enough. Just putting the relevant entry from the table tooltip into the marker tooltip would be great:
image

While we're at it, it would be neat to maybe add what the various ambiguity codes stand for. This is a simple lookup:

R = A or G
K = G or T
S = G or C
Y = C or T
M = A or C
W = A or T
B = not A (C, G or T)
H = not G (A, C or T)
D = not C (A, G or T)
V = not T (A, C or G)

So the tooltip could read:

Ambiguous 
11514: Y (C or T)
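
A minimal sketch of that lookup and tooltip line, in Python for illustration only (the web app itself is not written in Python). The names `AMBIGUITY_CODES`, `describe` and `tooltip_line` are hypothetical, and the position is assumed to be the 1-based coordinate shown in the example above.

```python
# IUPAC ambiguity codes from the list above, mapped to the bases they allow.
AMBIGUITY_CODES = {
    "R": "AG", "K": "GT", "S": "GC", "Y": "CT", "M": "AC", "W": "AT",
    "B": "CGT", "H": "ACT", "D": "AGT", "V": "ACG",
}


def describe(code):
    """Spell out an ambiguity code, e.g. 'Y' -> 'C or T', 'B' -> 'C, G or T'."""
    bases = AMBIGUITY_CODES[code.upper()]
    return ", ".join(bases[:-1]) + " or " + bases[-1]


def tooltip_line(position, code):
    """Format one tooltip entry such as '11514: Y (C or T)'."""
    return f"{position}: {code} ({describe(code)})"


print("Ambiguous")
print(tooltip_line(11514, "Y"))
```

Codes outside the table (for example N, which already has its own marker) raise a `KeyError` here, so a caller would need to filter those out first.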

@corneliusroemer
Member Author

Would be great to add this @ivan-aksamentov. It would have saved me some time investigating this case: yatisht/usher#354. Showing ambiguous nucs/AAs would be a big win for the sequence viewer. Color doesn't really matter, we can always tweak later.

ivan-aksamentov added a commit that referenced this issue Oct 18, 2023
Resolves: #700

Add colored markers for ambiguous nucleotides to the nucleotide sequence view.

I tried to pick colors that are pale (so that they resemble the pale grey of missing nucs) and that are a mixture of the colors of the corresponding nucs, or the inverse color in the case of "not" characters (e.g. the color of not-A is a pale inverse of the color of A).
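
One way to read that color rule, as a rough sketch only: average the RGB values of the member nucs for two-base codes, invert the excluded nuc's color for "not" codes, then blend toward white to make the result pale. The base colors below are placeholders rather than Nextclade's actual palette, and the helper names and the 0.6 paleness factor are assumptions, not what the commit does.

```python
# Placeholder base colors as RGB tuples; the real palette may differ.
BASE_COLORS = {
    "A": (0xB5, 0x44, 0x44),
    "C": (0x44, 0x77, 0xB5),
    "G": (0x77, 0xB5, 0x44),
    "T": (0xB5, 0x99, 0x44),
}

# Two-base codes map to their members; "not" codes map to the excluded base.
PAIR_CODES = {"R": "AG", "K": "GT", "S": "GC", "Y": "CT", "M": "AC", "W": "AT"}
NOT_CODES = {"B": "A", "H": "G", "D": "C", "V": "T"}


def mix(colors):
    """Channel-wise integer average of several RGB colors."""
    return tuple(sum(c[i] for c in colors) // len(colors) for i in range(3))


def invert(color):
    """Inverse of an RGB color."""
    return tuple(255 - v for v in color)


def pale(color, amount=0.6):
    """Blend a color toward white; amount=1.0 gives pure white."""
    return tuple(round(v + (255 - v) * amount) for v in color)


def to_hex(color):
    return "#{:02x}{:02x}{:02x}".format(*color)


def ambiguous_color(code):
    """Pale marker color for an ambiguity code, following the rule above."""
    if code in NOT_CODES:
        base = invert(BASE_COLORS[NOT_CODES[code]])
    else:
        base = mix([BASE_COLORS[b] for b in PAIR_CODES[code]])
    return to_hex(pale(base))


for code in "RKSYMWBHDV":
    print(code, ambiguous_color(code))
```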

I made them half-height at the top by default, just like missing nucs. (Settings UI is currently broken, so it's impossible to change at the moment, sorry)

Tooltip text is as specified in the original issue.
@ivan-aksamentov
Member

@corneliusroemer prototype is here: #1273
