Fix end position for SV insertion types #1776

Draft
wants to merge 1 commit into
base: main

nakib103
Contributor

#1432

Currently, for SV insertions, the end is calculated as:

$end = $start + SVLEN - 1
or,
$end = $END // from INFO field

But it is probably more appropriate to use:

$end = $start - 1

This is because $start and $end then mark the anchoring bases in the reference sequence between which the sequence is inserted. By using SVLEN or END, the current implementation implies that a wider stretch of the reference sequence is affected, which is not actually the case.
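To make the difference concrete, here is a minimal Python sketch of the two calculations. It is not the VEP code (which is Perl); the function names are hypothetical, the choice to prefer SVLEN over END when both are present is an assumption, and so is taking the start as POS + 1, which is inferred from the example output in this PR. The numbers come from the test variant used in this description.

```python
from typing import Optional


def start_from_vcf_pos(pos: int) -> int:
    # Assumption: the insertion start is the base after the VCF anchor base,
    # as suggested by the example output in this PR (POS 146985757 -> 146985758).
    return pos + 1


def old_insertion_end(start: int,
                      svlen: Optional[int] = None,
                      info_end: Optional[int] = None) -> int:
    # Previous behaviour as described above. Preferring SVLEN over END
    # when both are given is an assumption made for this sketch.
    if svlen is not None:
        return start + svlen - 1
    if info_end is not None:
        return info_end
    raise ValueError("need SVLEN or END to compute the old end")


def new_insertion_end(start: int) -> int:
    # Proposed behaviour: the end is the base just before the start, so the
    # pair (end, start) brackets the point where the sequence is inserted.
    return start - 1


def location(chrom: str, start: int, end: int) -> str:
    # Format as chrom:low-high, which is how the Location column reads here.
    low, high = sorted((start, end))
    return f"{chrom}:{low}-{high}"


start = start_from_vcf_pos(146985757)
print(location("1", start, old_insertion_end(start, svlen=17243)))  # 1:146985758-147003000
print(location("1", start, new_insertion_end(start)))               # 1:146985757-146985758
```

The first line spans SVLEN bases of reference sequence, while the second covers only the two anchoring bases.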

PS: this will change the consequences. For example, for the test variant

1 146985757 . G <INS> . . SVTYPE=INS;SVLEN=17243

the output was previously:

#Uploaded_variation	Location	Allele	Gene	Feature	Feature_type	Consequence	cDNA_position	CDS_position	Protein_position	Amino_acids	Codons	Existing_variation	Extra
1_146985758_<INS>	1:146985758-147003000	insertion	ENSG00000234225	ENST00000425109	Transcript	non_coding_transcript_exon_variant,intron_variant	230-?	-	-	-	-	-	IMPACT=MODIFIER;STRAND=-1;OverlapBP=1070;OverlapPC=63.39
1_146985758_<INS>	1:146985758-147003000	insertion	ENSG00000234225	ENST00000455718	Transcript	non_coding_transcript_exon_variant,intron_variant	185-?	-	-	-	-	-	IMPACT=MODIFIER;STRAND=-1;OverlapBP=719;OverlapPC=79.62
1_146985758_<INS>	1:146985758-147003000	insertion	ENSG00000268043	ENST00000617931	Transcript	coding_sequence_variant,3_prime_UTR_variant,intron_variant	-	-	-	-	-	-	IMPACT=MODIFIER;STRAND=1;OverlapBP=10445;OverlapPC=18.18
1_146985758_<INS>	1:146985758-147003000	insertion	ENSG00000268043	ENST00000698835	Transcript	coding_sequence_variant,3_prime_UTR_variant,intron_variant	-	-	-	-	-	-	IMPACT=MODIFIER;STRAND=1;OverlapBP=10441;OverlapPC=18.04
1_146985758_<INS>	1:146985758-147003000	insertion	ENSG00000268043	ENST00000698937	Transcript	coding_sequence_variant,3_prime_UTR_variant,intron_variant	-	-	-	-	-	-	IMPACT=MODIFIER;STRAND=1;OverlapBP=10441;OverlapPC=18.13

With this fix:

#Uploaded_variation	Location	Allele	Gene	Feature	Feature_type	Consequence	cDNA_position	CDS_position	Protein_position	Amino_acids	Codons	Existing_variation	Extra
1_146985758_insertion	1:146985757-146985758	insertion	ENSG00000268043	ENST00000617931	Transcript	intergenic_variant	-	-	-	-	-	-	IMPACT=MODIFIER;STRAND=1
1_146985758_insertion	1:146985757-146985758	insertion	ENSG00000268043	ENST00000698835	Transcript	intergenic_variant	-	-	-	-	-	-	IMPACT=MODIFIER;STRAND=1
1_146985758_insertion	1:146985757-146985758	insertion	ENSG00000268043	ENST00000698937	Transcript	intergenic_variant	-	-	-	-	-	-	IMPACT=MODIFIER;STRAND=1
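The change in consequences follows from how much reference sequence the variant is taken to cover. The sketch below illustrates this with a simple closed-interval overlap count. It is only an illustration: the feature coordinates are hypothetical (not those of any transcript listed above), and it does not claim to reproduce how VEP computes OverlapBP.

```python
def overlap_bp(a_start: int, a_end: int, b_start: int, b_end: int) -> int:
    # Number of bases shared by two closed intervals; 0 if they do not overlap.
    # A zero-length insertion (end = start - 1) never shares bases with anything.
    return max(0, min(a_end, b_end) - max(a_start, b_start) + 1)


# Hypothetical 1000 bp feature located downstream of the insertion point.
feature_start, feature_end = 146990000, 146990999

# Old coordinates: start .. start + SVLEN - 1
old_start, old_end = 146985758, 147003000
# New coordinates: end = start - 1
new_start, new_end = 146985758, 146985757

print(overlap_bp(old_start, old_end, feature_start, feature_end))  # 1000
print(overlap_bp(new_start, new_end, feature_start, feature_end))  # 0
```

With the old end, any feature within SVLEN bases downstream of the insertion point is reported as overlapped; with the new end, only features at the insertion point itself can be affected.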

@nakib103 nakib103 marked this pull request as draft October 22, 2024 16:05