Project completed during my MSc in Bioinformatics at the University of Birmingham, United Kingdom, 2019-2020.
- QC & FILTERING
  - pass threshold: LOD ≥ 3
  - exclude variants with missing data
  - exclude multi-allelic variants
  - exclude low read depth: keep only ≥ 30 reads
  - exclude low PHRED scores: keep only ≥ 20
  - uses the vcfR R package
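
The filters above can be sketched as a single predicate. This is a minimal illustration in plain Python, not the project's vcfR code: the `Variant` record and its field names are hypothetical, and the PHRED score is assumed to be the site-level QUAL value, since the list does not say which field was used.

```python
from dataclasses import dataclass
from typing import List, Optional

# Thresholds taken from the list above.
MIN_LOD = 3.0
MIN_DEPTH = 30
MIN_PHRED = 20.0


@dataclass
class Variant:
    """Hypothetical, simplified view of one VCF record for a trio."""
    ref: str
    alt: str                      # comma-separated when multi-allelic, as in VCF
    lod: Optional[float]
    qual: Optional[float]         # assumption: PHRED-scaled site quality
    genotypes: List[str]          # one GT string per sample, e.g. "0/1" or "./."
    depths: List[Optional[int]]   # one read depth per sample


def passes_qc(v: Variant) -> bool:
    """Return True only if the variant survives every filter."""
    if v.lod is None or v.lod < MIN_LOD:
        return False              # below the LOD pass threshold
    if any("." in gt for gt in v.genotypes):
        return False              # missing genotype in at least one sample
    if "," in v.alt:
        return False              # multi-allelic site
    if any(dp is None or dp < MIN_DEPTH for dp in v.depths):
        return False              # too few reads in at least one sample
    if v.qual is None or v.qual < MIN_PHRED:
        return False              # low PHRED score
    return True


# Made-up records, one per failure mode plus one that passes.
variants = [
    Variant("A", "G", 5.2, 50.0, ["0/1", "0/0", "0/1"], [45, 38, 52]),
    Variant("C", "T,G", 6.0, 60.0, ["0/1", "0/2", "1/2"], [40, 41, 39]),
    Variant("G", "A", 4.1, 45.0, ["0/1", "./.", "0/0"], [35, None, 33]),
    Variant("T", "C", 3.5, 40.0, ["0/0", "0/1", "0/1"], [31, 12, 44]),
    Variant("A", "T", 1.0, 55.0, ["0/1", "0/1", "0/0"], [60, 61, 62]),
    Variant("G", "C", 4.0, 10.0, ["0/1", "0/1", "0/0"], [60, 61, 62]),
]
kept = [v for v in variants if passes_qc(v)]
```

Requiring every trio member to pass the depth check is a design choice made for this sketch; a per-sample filter that masks individual genotypes would be an equally plausible reading of the list.
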
- EXTRACT GENOTYPE (reference allele vs. alternative allele e.g. 0/1)
- EXTRACT DNA BASES (reference allele vs. alternative allele)
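
The two extraction steps can be illustrated as follows. This is a plain-Python sketch with hypothetical function names, assuming diploid, bi-allelic, non-missing genotypes (which is what remains after the QC step); it is not the vcfR-based code used in the project.

```python
import re
from typing import Tuple


def parse_genotype(gt: str) -> Tuple[int, int]:
    """Split a GT string such as "0/1" or "1|0" into two allele indices."""
    first, second = re.split(r"[/|]", gt)
    return int(first), int(second)


def genotype_to_bases(gt: str, ref: str, alt: str) -> Tuple[str, str]:
    """Translate allele indices into bases: 0 is the reference, 1 the alternative."""
    alleles = [ref, alt]
    i, j = parse_genotype(gt)
    return alleles[i], alleles[j]


# Example: a heterozygous call at a site with reference A and alternative G.
het_indices = parse_genotype("0/1")
het_bases = genotype_to_bases("0/1", "A", "G")
hom_alt_bases = genotype_to_bases("1|1", "A", "G")
```
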
- FIND THE CHILD (count the number of incompatible variants)
- EXTRACT DE NOVO VARIANTS FROM THE CHILD
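
One way to picture the last two steps: treat each trio member in turn as the candidate child, count the sites where its genotype cannot be explained by Mendelian inheritance from the other two, and pick the member with the fewest such sites. The sketch below is plain Python with hypothetical names and made-up genotypes. Two points are assumptions rather than statements from the list above: that the child is the sample with the lowest incompatibility count, and that a de novo variant is a site where the child carries an allele seen in neither parent.

```python
from typing import Dict, List, Tuple

Genotype = Tuple[int, int]
Site = Dict[str, Genotype]   # sample name -> genotype at one site


def is_mendelian(child: Genotype, parent_a: Genotype, parent_b: Genotype) -> bool:
    """True if the child could have inherited one allele from each parent."""
    c1, c2 = child
    return (c1 in parent_a and c2 in parent_b) or (c1 in parent_b and c2 in parent_a)


def count_incompatible(sites: List[Site], child: str) -> int:
    """Number of sites that break Mendelian inheritance if `child` is the child."""
    n = 0
    for site in sites:
        parents = [gt for name, gt in site.items() if name != child]
        if not is_mendelian(site[child], parents[0], parents[1]):
            n += 1
    return n


def find_child(sites: List[Site]) -> Tuple[str, Dict[str, int]]:
    """Assumption: the sample with the fewest incompatible sites is the child."""
    counts = {name: count_incompatible(sites, name) for name in sites[0]}
    return min(counts, key=counts.get), counts


def de_novo_sites(sites: List[Site], child: str) -> List[int]:
    """Indices of sites where the child has an allele absent from both parents."""
    hits = []
    for idx, site in enumerate(sites):
        parental = {a for name, gt in site.items() if name != child for a in gt}
        if any(allele not in parental for allele in site[child]):
            hits.append(idx)
    return hits


# Made-up genotypes for three unlabeled samples at four sites.
sites = [
    {"S1": (0, 0), "S2": (1, 1), "S3": (0, 1)},
    {"S1": (0, 1), "S2": (0, 0), "S3": (0, 0)},
    {"S1": (0, 0), "S2": (0, 0), "S3": (0, 1)},
    {"S1": (1, 1), "S2": (0, 1), "S3": (1, 1)},
]
child, counts = find_child(sites)
candidates = de_novo_sites(sites, child)
```

With real data the incompatible count for the true child is not expected to be zero, since genuine de novo variants and genotyping errors both show up as incompatible sites.
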
- FURTHER RESEARCH
  - validation of de novo variants
    - computational, e.g. calculate the probability that the mutation is present in a family trio
    - experimental, e.g. Sanger sequencing
  - clinical impact of de novo variants (UCSC Genome Browser)