Mouse reference files from Mouse Genome Project VCFs
Keiran Raine edited this page Jul 22, 2016 · 2 revisions
Here we provide an example of how to generate a SNP panel for mouse using the Mouse Genomes Project VCF files.
A tool has been created to assist with this:
ascatSnpPanelFromVcfs.pl snps.vcf[.gz] > SnpPositions.tsv
This script was created to generate a shared SNP panel for species having
multiple strains which have become homozygous through in-breeding.
The resulting output is likely to only be useful in experiments where crossing
of strains has been performed to add heterozygous SNPs back into the population.
The initial use has been against the Mouse Genomes Project outputs found here:
ftp://ftp-mouse.sanger.ac.uk/REL-*-SNPs_Indels/mgp.*.merged.snps_all.*.vcf.gz
As not all VCF files are created equal, you may need to modify the script for
other sources.
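Before adapting the script to another source, it can help to check how that VCF names its chromosomes. The following is a minimal sketch, not part of the tool: `vcf_chroms` is a hypothetical helper that reads uncompressed VCF text on standard input and prints each distinct value of the first column once, in order of first appearance.

```shell
# Hypothetical helper: list the distinct chromosome names in a VCF.
# Reads uncompressed VCF text on stdin and skips header lines (those starting with '#').
vcf_chroms() {
  awk -F'\t' '!/^#/ && !seen[$1]++ { print $1 }'
}

# Toy input standing in for a real VCF (only the first two columns are shown).
printf '##fileformat=VCFv4.2\n#CHROM\tPOS\n1\t10\n1\t20\nMT\t5\n' | vcf_chroms
```

For a compressed file, pipe it through `gzip -dc snps.vcf.gz | vcf_chroms`. Names such as `chrM` instead of `MT` are one example of a difference that would require changes further down the pipeline.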
This example is for mouse GRCm38 using data from the Mouse Genomes Project.
Download an appropriate SNP dataset. In this case we want the merged data to ensure we include SNPs from multiple strains:
$ wget ftp://ftp-mouse.sanger.ac.uk/REL-1505-SNPs_Indels/mgp.v5.merged.snps_all.dbSNP142.vcf.gz
OR
$ curl -sSL ftp://ftp-mouse.sanger.ac.uk/REL-1505-SNPs_Indels/mgp.v5.merged.snps_all.dbSNP142.vcf.gz > mgp.v5.merged.snps_all.dbSNP142.vcf.gz
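FTP transfers of large files can be cut short, so it may be worth confirming that the download is a complete gzip stream before processing it. This is an optional sketch; `check_gz` is a hypothetical helper built on the standard `gzip -t` integrity test.

```shell
# Hypothetical helper: report whether a file is a complete, valid gzip stream.
check_gz() {
  if gzip -t "$1" 2>/dev/null; then
    echo "ok"
  else
    echo "corrupt or incomplete"
  fi
}

# Usage against the file downloaded above (uncomment once it exists):
# check_gz mgp.v5.merged.snps_all.dbSNP142.vcf.gz
```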
Now use the script to create the SNP positions file, dropping any output lines that begin with MT (the mitochondrial chromosome):
$ ascatSnpPanelFromVcfs.pl mgp.v5.merged.snps_all.dbSNP142.vcf.gz | grep -v '^MT' > SnpPositions.tsv
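The `grep -v '^MT'` step in the command above removes every line whose text starts with `MT`. The sketch below shows that behaviour on toy data. The two-column layout is an assumption made for illustration, not the script's documented output format, and `filter_mt` is a hypothetical wrapper name.

```shell
# Hypothetical wrapper around the filter used in the command above.
filter_mt() {
  grep -v '^MT'
}

# Toy stand-in for the script output; the real column layout may differ.
printf '1\t3000\nMT\t100\nX\t500\n' | filter_mt
```

Only the `1` and `X` lines pass through. Note that the pattern is anchored to the start of the line and matches the literal text `MT`, so a source that names the mitochondrial chromosome differently (for example `chrM`) would need a different pattern.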