Reference files from normal sample data
This provides an example of how reference files can be generated from data files when no public SNP set is available.
The code has been tested on Illumina paired-end sequencing data from crossed mice, mapped with BWA-mem. The result was compared against the Mouse Genomes Project VCF-based reference described here.
For each sample file, scan for high-confidence non-reference locations.
Please see ascatSnpPanelGeneration.pl -h for all options.
$ ascatSnpPanelGeneration.pl -ref genome.fa -hf sampleA.bam > sampleA-hets.tsv.0
(the output filename must end with a number, .N)
We recommend 10 samples as a minimum when performing this.
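With ten or more samples it may be easier to generate the per-sample commands in a loop. The following is a minimal sketch, not part of the tool: the `panel_commands` helper, the `REF` variable and the BAM names are hypothetical, and every output keeps the `.0` suffix only because that is what the examples on this page use. It prints the commands rather than running them, so they can be checked first.

```shell
#!/usr/bin/env bash
# Sketch: print one ascatSnpPanelGeneration.pl command per BAM file.
# REF and the BAM names are placeholders, not values taken from the tool.
set -euo pipefail

REF="genome.fa"

# Emit the command line for each BAM path passed as an argument.
panel_commands() {
  local bam sample
  for bam in "$@"; do
    # Derive the sample name from the file name, e.g. sampleA.bam -> sampleA
    sample="$(basename "$bam" .bam)"
    # The output name keeps the ".0" suffix used in the example on this page.
    printf 'ascatSnpPanelGeneration.pl -ref %s -hf %s > %s-hets.tsv.0\n' \
      "$REF" "$bam" "$sample"
  done
}

panel_commands sampleA.bam sampleB.bam sampleC.bam
```

Once the printed commands look right, the output of the script can be piped to `sh` to run them.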
Once a set of outputs from the previous step has been generated, the following command determines the common HET/HOM loci to be used in the final panel.
- A HET SNP is generated if it exists in >66% of samples.
- A HOM SNP is generated if it exists in >33% of samples.
- Locations with more than 2 alleles expressed across the panel are excluded.
- Locations within 500bp of another potential SNP are excluded.
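As a rough illustration of the percentage thresholds above, the sketch below works out the smallest number of samples that exceeds a given percentage of the panel. It assumes the comparison is strictly "greater than", as written in the list; how the script itself rounds is not stated here, and `min_samples` is a hypothetical helper, not part of the tool.

```shell
#!/usr/bin/env bash
# Sketch: smallest sample count strictly greater than PCT% of N samples.
# Assumes a strict ">" comparison; the tool's own rounding may differ.
min_samples() {
  local n="$1" pct="$2"
  # Integer division floors n*pct/100; adding 1 gives the first count above it.
  echo $(( n * pct / 100 + 1 ))
}

n=10
echo "HET: at least $(min_samples "$n" 66) of $n samples"
echo "HOM: at least $(min_samples "$n" 33) of $n samples"
```

Under this assumption, a panel of 10 samples needs a locus to appear in at least 7 samples to be reported as HET and in at least 4 to be reported as HOM.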
Basic usage is:
$ ascatSnpPanelMerge.pl genome.fa sampleA-hets.tsv.0 [sampleB-hets.tsv.0] > SnpPositions.tsv
Run with no options for additional information.