hemizygous count is not displayed in varfish variant detail #1836

Closed
xiamaz opened this issue Jul 24, 2024 · 18 comments · Fixed by #1860
Assignees
Labels
bug Something isn't working

Comments

@xiamaz
Collaborator

xiamaz commented Jul 24, 2024

Describe the bug
For X-chromosomal variants the number of hemizygous individuals is not displayed in the variant details.

To Reproduce
Steps to reproduce the behavior:

  1. Filter for X-chromosomal variants.
  2. Click on any variant result

Expected behavior
For X-chromosomal variants the count of heterozygous, hemizygous and homozygous individuals should be shown.

@xiamaz xiamaz added the bug Something isn't working label Jul 24, 2024
@xiamaz
Collaborator Author

xiamaz commented Jul 24, 2024

Workaround

Using the gnomAD linkout, the number of hemizygous individuals can be seen directly on the official gnomAD webpage.

@stolpeo stolpeo self-assigned this Jul 25, 2024
@stolpeo
Contributor

stolpeo commented Jul 26, 2024

@xiamaz On their website, gnomAD does not list heterozygotes. Is such a column really required?

On another note, I noticed that the gnomAD website by default adds the exome and genome counts into a single number per population. One can exclude the counts of one or the other via a checkbox. I wonder whether providing a combined count as well would make our frequency information simpler.
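To make the combined-count idea concrete, here is a minimal sketch, not VarFish or annonars code. The `Counts` class and both helper functions are hypothetical names. The one point it illustrates is that the combined frequency has to be recomputed from the summed counts rather than averaged from the two frequencies.

```python
from dataclasses import dataclass


@dataclass
class Counts:
    """Allele counts for one population in one dataset (hypothetical shape)."""

    ac: int  # alternate allele count
    an: int  # total number of called alleles


def combine(exomes: Counts, genomes: Counts) -> Counts:
    """Sum exome and genome counts into a single per-population count."""
    return Counts(ac=exomes.ac + genomes.ac, an=exomes.an + genomes.an)


def allele_frequency(counts: Counts) -> float:
    """Recompute AF from the summed counts; returns 0.0 if nothing was called."""
    return counts.ac / counts.an if counts.an else 0.0
```

For example, `combine(Counts(ac=1, an=100), Counts(ac=3, an=300))` gives `Counts(ac=4, an=400)`, and the frequency is then derived from that sum.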

@stolpeo
Contributor

stolpeo commented Jul 26, 2024

annonars currently does not return this number: varfish-org/annonars#506

@stolpeo
Contributor

stolpeo commented Jul 29, 2024

@xiamaz I had a conversation with @holtgrewe, who mentioned that on the X chromosome the number of homozygotes in males equals the number of hemizygotes. Would that information be sufficient?

@xiamaz
Collaborator Author

xiamaz commented Jul 29, 2024

How would you implement that? I'm not sure I follow.

@holtgrewe
Collaborator

@xiamaz outside of the PAR, the number of XY carriers equals the number of hemizygous carriers

@xiamaz
Collaborator Author

xiamaz commented Jul 29, 2024

Outside PAR this is correct.

@stolpeo
Contributor

stolpeo commented Jul 29, 2024

Sorry, I didn't make my intention clear. I meant to ask whether that information is sufficient, so that I would not need to implement anything. I don't know how important the number is for the PAR.

@holtgrewe
Collaborator

@xiamaz inside the PAR, things are as follows:

  • the variant analysis pipelines that I know will NNN-mask the PAR on chrY: (hs37(d5)), hs38, GATK analysis sets
  • thus, inside the PAR, XY karyotype carriers can have hom ref, het, hom alt
  • hemi does not occur inside the PAR from NGS data
  • only long-read sequencing and phasing can tell you about chrX/chrY PAR
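Whether a position lies inside a PAR can be decided from coordinates alone. The sketch below is one way to do that lookup. The interval values are an assumption (commonly cited GRCh37/GRCh38 PAR1/PAR2 boundaries, 1-based inclusive) and should be checked against an authoritative source before use; `PAR_REGIONS` and `in_par` are hypothetical names, not existing VarFish code.

```python
# PAR1/PAR2 intervals as 1-based inclusive coordinates.
# ASSUMPTION: commonly cited boundaries; verify before relying on them.
PAR_REGIONS = {
    "GRCh37": {
        "X": [(60_001, 2_699_520), (154_931_044, 155_260_560)],
        "Y": [(10_001, 2_649_520), (59_034_050, 59_363_566)],
    },
    "GRCh38": {
        "X": [(10_001, 2_781_479), (155_701_383, 156_030_895)],
        "Y": [(10_001, 2_781_479), (56_887_903, 57_217_415)],
    },
}


def in_par(release: str, chrom: str, pos: int) -> bool:
    """Return True if the 1-based position lies in PAR1 or PAR2."""
    # Accept both "chrX" and "X" style names.
    name = chrom[3:] if chrom.startswith("chr") else chrom
    # Autosomes and unknown releases have no PAR intervals.
    regions = PAR_REGIONS.get(release, {}).get(name, [])
    return any(start <= pos <= end for start, end in regions)
```

With these intervals, `in_par("GRCh37", "chrX", 1_000_000)` is true, while any autosomal position is false.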

@xiamaz
Collaborator Author

xiamaz commented Jul 30, 2024

In the user interface please explicitly list the count of hemizygous cases. Whether the gene is inside or outside of PAR is easy to forget for an investigator and takes too much mental bandwidth.

@stolpeo
Contributor

stolpeo commented Jul 30, 2024

OK, then we need to complete varfish-org/annonars#506 first. Is information available anywhere on whether the variant lies within a PAR? If so, would it be valuable to display that as well?

@holtgrewe
Collaborator

holtgrewe commented Jul 31, 2024

Specification

  • we need to change the label "European (North-Western)" to "European (Non-Finnish)"
  • handle special cases of chrX and chrY, the remaining chromosomes remain as is
  • for both chrX and chrY
    • display a warning if we are inside the PAR (take coordinates from Wikipedia)
    • add column hemizygous
    • if inside the PAR, display as if on autosome (chr1...chr22), hemizygous == 0, else:
    • you will find information as below. Display the ac count of xy individuals as hemizygous, equivalently for bySex/overall
                  {
                    "population": "eas",
                    "counts": {
                      "overall": {
                        "ac": 1,
                        "an": 13860,
                        "af": 0.000072150098276324
                      },
                      "xx": {
                        "an": 9328
                      },
                      "xy": {
                        "ac": 1,
                        "an": 4532,
                        "af": 0.000220653004362247
                      }
                    },
                  },
      

Overall, our display should correspond to what gnomAD shows.
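A minimal sketch of the hemizygous rule from the specification, assuming the per-population record has the shape shown in the example. `hemizygous_count` is a hypothetical helper, not actual VarFish code; the caller is assumed to have determined `inside_par` already. A missing `ac` key (as in the `xx` entry of the example) is treated as zero.

```python
def hemizygous_count(record: dict, chrom: str, inside_par: bool) -> int:
    """Hemizygous carriers for one population record.

    Autosomes and PAR positions report 0; otherwise the allele count of
    XY individuals is shown as the hemizygous count.
    """
    name = chrom[3:] if chrom.startswith("chr") else chrom
    if name not in ("X", "Y") or inside_par:
        return 0
    # "ac" may be absent when no carrier was observed, so default to 0.
    return record["counts"].get("xy", {}).get("ac", 0)


# Record copied from the example in the specification (af fields omitted).
eas = {
    "population": "eas",
    "counts": {
        "overall": {"ac": 1, "an": 13860},
        "xx": {"an": 9328},
        "xy": {"ac": 1, "an": 4532},
    },
}
```

For the `eas` record on chrX outside the PAR, this yields one hemizygous carrier. In that record the `xx` and `xy` allele numbers also add up to the overall allele number (9328 + 4532 = 13860).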

@xiamaz
Collaborator Author

xiamaz commented Aug 1, 2024

Discussion with @stolpeo:

@xiamaz I see issues with reimplementing this logic in VarFish, as these counts are already available in the gnomAD data and can just be passed through. @tedil How much work is it to extend annonars to support hemizygous counts? If this is too much work, we should consider the alternative workaround.

@tedil

tedil commented Aug 1, 2024

About this ↑ much work to include the respective fields in the data sources -- for gnomad2 SV only, that is. Not sure what has to happen downstream from that, though.

@holtgrewe
Collaborator

@xiamaz could you please explain why you assume the "hemi" data is in gnomAD?

Prove me wrong, but

  • gnomAD has info on AC, AN, AF and nhomalt counts, overall and for male/female (XY/XX) individuals only
  • they compute the hemi counts on their website

@tedil

tedil commented Aug 1, 2024

> @xiamaz could you please explain why you assume the "hemi" data is in gnomAD?
>
> Prove me wrong, but
>
> * gnomAD has info on AC, AN, AF and nhomalt counts, overall and for male/female (XY/XX) individuals only
> * they compute the hemi counts on their website

Correct, the hemizygous counts and freqs are only specified in the gnomAD_SV datasets.

@stolpeo
Contributor

stolpeo commented Aug 1, 2024

I also assumed that the data is in gnomAD because it is shown on their website. However, here is their statement that they no longer include the hemizygote counts: https://gnomad.broadinstitute.org/news/2018-10-gnomad-v2-1/#hemizygote-counts

Hemizygote counts

For this release, there are no more hemizygote count annotations. Instead, our male counts correctly consider males as hemizygous in non-pseudo-autosomal (non-PAR) regions of the sex chromosomes and we provide a nonpar flag to identify all variants that are in the these regions.

@xiamaz
Collaborator Author

xiamaz commented Aug 1, 2024

In that case doing this conversion is the best way. 👍
