Association tests with CNV data
Copy Number Variation (CNV): copy number polymorphism, i.e. a genomic region whose copy number differs between individuals in a population.
Reference: CNV-NGS example (Nord et al., 2011)
> install.packages("CNVassoc")
> library(CNVassoc)
> data(dataMLPA)
id The unique identifiers of individuals
casco Case-control status 0:control 1:case
Gene1 Intensities for Gene1
Gene2 Intensities for Gene2
PCR.Gene1 True copy number status for Gene1
PCR.Gene2 True copy number status for Gene2
quanti Simulated continuous variable.
cov Simulated continuous variable.
> str(dataMLPA)
'data.frame': 651 obs. of 8 variables:
$ id : Factor w/ 346 levels "H238","H239",..: 1 1 2 2 3 3 4 4 5 5 ...
$ casco : int 1 1 1 1 1 1 1 1 1 1 ...
$ Gene1 : num 0.51 0.45 0 0 0 0 0.23 0.26 0 0 ...
$ Gene2 : num 0.539 0.639 0.483 0.464 0 ...
$ PCR.Gene1: Factor w/ 3 levels "del","ht","wt": 3 3 1 1 1 1 2 2 1 1 ...
$ PCR.Gene2: Factor w/ 3 levels "del","ht","wt": 3 3 3 3 1 1 1 1 1 1 ...
$ quanti : num -0.61 -0.13 -0.57 -1.4 0.83 -2.07 -1.68 -1.4 1.09 0.55 ...
$ cov : num 10.83 10.69 9.63 9.87 10.25 ...
> head(dataMLPA)
id casco Gene1 Gene2 PCR.Gene1 PCR.Gene2 quanti cov
1 H238 1 0.51 0.5385080 wt wt -0.61 10.83
2 H238 1 0.45 0.6392029 wt wt -0.13 10.69
3 H239 1 0.00 0.4831572 del wt -0.57 9.63
4 H239 1 0.00 0.4640072 del wt -1.40 9.87
5 H276 1 0.00 0.0000000 del del 0.83 10.25
6 H276 1 0.00 0.0000000 del del -2.07 10.40
> dataMLPA
> dataMLPA$Gene1
> plotSignal(dataMLPA$Gene1,case.control=dataMLPA$casco)
> par(mfrow=c(1,2),mar=c(3,4,3,1))
> hist(dataMLPA$Gene1,main="gene 1 histogram",xlab="",ylab="freq")
> hist(dataMLPA$Gene2,main="gene 2 histogram",xlab="",ylab="freq")
> ?cnv
> myCNV <- cnv(x=dataMLPA$Gene2, threshold.0 = 0.01, mix.method ="mixdist" )
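The cnv() call above turns raw intensities into copy-number probabilities. As a rough illustration only, and not the CNVassoc implementation, the following base R sketch does something similar on simulated intensities: values below a cutoff are treated as zero copies, and a two-component normal mixture is fitted to the remaining values by EM. The simulated data, the way the cutoff is applied (assumed here to play the role of threshold.0), and all variable names are assumptions made for this example.

```r
set.seed(1)
# Simulated intensities (hypothetical): 0 copies give exactly 0,
# 1 copy is centred near 0.25, 2 copies near 0.50
x <- c(rep(0, 50), rnorm(100, 0.25, 0.04), rnorm(150, 0.50, 0.05))

threshold0 <- 0.01          # assumed analogue of threshold.0
zero <- x < threshold0      # signals below the cutoff: zero copies
y <- x[!zero]               # signals left for the mixture fit

# EM for a two-component normal mixture on the non-zero signal
k <- 2
mu <- as.numeric(quantile(y, c(0.25, 0.75)))  # starting means
sigma <- rep(sd(y) / 2, k)                    # starting standard deviations
w <- rep(1 / k, k)                            # starting mixing weights
for (iter in 1:200) {
  # E-step: posterior probability of each component for each observation
  dens <- sapply(1:k, function(j) w[j] * dnorm(y, mu[j], sigma[j]))
  post <- dens / rowSums(dens)
  # M-step: update weights, means and standard deviations
  nk <- colSums(post)
  w <- nk / length(y)
  mu <- colSums(post * y) / nk
  sigma <- sqrt(colSums(post * outer(y, mu, "-")^2) / nk)
}

# Posterior probability matrix over 0, 1 and 2 copies for every individual
probs <- matrix(0, length(x), 3,
                dimnames = list(NULL, c("CNV0", "CNV1", "CNV2")))
probs[zero, 1] <- 1
probs[!zero, 2:3] <- post

# Most likely copy number per individual
calls <- apply(probs, 1, which.max) - 1
print(round(mu, 3))
print(table(calls))
```

Keeping the full probability matrix, rather than only the hard calls, is what lets an association model account for uncertainty in the copy-number assignment.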
> mod <- CNVassoc(formula=casco~myCNV, data=dataMLPA, model="mul")
> mod
Call: CNVassoc(formula = casco ~ myCNV, data = dataMLPA, model = "mul")
CNVmult 1.0520923 0.3122567 -0.0970782
Number of individuals: 651
Number of estimated parameters: 3
Deviance: 876.396
> summary(mod)
CNVassoc(formula = casco ~ myCNV, data = dataMLPA, model = "mul")
Deviance: 876.396
Number of parameters: 3
Number of individuals: 651
OR lower.lim upper.lim SE stat pvalue
CNV0 1.0000
CNV1 0.4772 0.2742 0.8304 0.2827 -2.6172 0.009
CNV2 0.3169 0.1834 0.5477 0.2791 -4.1169 0.000
(Dispersion parameter for binomial family taken to be 1 )
Covariance between coefficients:
CNV0 0.0613 0.0000 0.0000
CNV1 0.0186 -0.0032
CNV2 0.0166
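The columns of the summary table are related by standard Wald arithmetic on the log odds ratio scale. The following sketch re-derives the confidence limits, the test statistic and the p-value from the OR and SE values printed above, assuming a 95% normal-approximation interval. Small differences in the last digit are expected because the inputs are already rounded.

```r
# Values copied from the summary(mod) table above
or <- c(CNV1 = 0.4772, CNV2 = 0.3169)
se <- c(CNV1 = 0.2827, CNV2 = 0.2791)

beta <- log(or)              # log odds ratio relative to CNV0
z <- beta / se               # Wald statistic
crit <- qnorm(0.975)         # about 1.96 for a 95% interval
ci <- exp(cbind(lower = beta - crit * se,
                upper = beta + crit * se))
p <- 2 * pnorm(-abs(z))      # two-sided p-value

print(round(cbind(OR = or, ci, stat = z, pvalue = p), 4))
```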
> mod2 <- CNVassoc(formula=casco~myCNV, data=dataMLPA, model="add")
> anova(mod,mod2)
--- Likelihood ratio test comparing 2 CNVassoc models:
Model 1 call: CNVassoc(formula = casco ~ myCNV, data = dataMLPA, model = "mul")
Model 2 call: CNVassoc(formula = casco ~ myCNV, data = dataMLPA, model = "add")
Chi= 0.6645798 (df= 1 ) p-value= 0.4149477
Note: the 2 models must be nested, and this function doesn't check this!
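To see why the comparison above has one degree of freedom, it can help to reproduce the same contrast with an ordinary logistic regression. The following sketch uses simulated data and assumes the copy number is known exactly, which CNVassoc does not assume; the variable names and the simulated effect sizes are hypothetical. The "mul" model is approximated by treating copy number as a factor (one effect per class), and the "add" model by treating it as a numeric trend.

```r
set.seed(42)
n <- 600
# Hypothetical copy numbers and a case-control outcome whose
# log-odds decrease linearly with copy number
cn <- sample(0:2, n, replace = TRUE, prob = c(0.15, 0.35, 0.50))
y <- rbinom(n, 1, plogis(0.8 - 0.5 * cn))

fit_mul <- glm(y ~ factor(cn), family = binomial)  # one effect per class
fit_add <- glm(y ~ cn, family = binomial)          # linear trend

# Likelihood ratio test of the additive model nested in the factor model
lrt <- anova(fit_add, fit_mul, test = "Chisq")
print(lrt)
```

The factor model estimates three parameters and the trend model two, so the test has one degree of freedom. A large p-value suggests the simpler additive model is adequate.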
> logLik(mod)
logLik df
-438.198 3.000
Test whether the CNV association is significant
LRT: likelihood ratio test
Wald: Wald test
> CNVtest(mod,type="LRT")
----CNV Likelihood Ratio Test----
Chi= 18.75453 (df= 2 ) , pvalue= 8.462633e-05
> CNVtest(mod,type=c("Wald","LRT"))
----CNV Wald test----
Chi= 17.32966 (df= 2 ) , pvalue= 0.0001725492
Warning messages:
1: In if (type.test == 1) { :
the condition has length > 1 and only the first element will be used
2: In if (x$type == 1) cat("----CNV Wald test----\n") else cat("----CNV Likelihood Ratio Test----\n") :
the condition has length > 1 and only the first element will be used
> CNVtest(mod,type="Wald")
----CNV Wald test----
Chi= 17.32966 (df= 2 ) , pvalue= 0.0001725492
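The p-values reported in this session are upper-tail chi-square probabilities, and the deviance is minus twice the log-likelihood. As a quick check, the following base R sketch recomputes them from the statistics printed above. The variable names are chosen for this example only.

```r
# Deviance from the log-likelihood reported by logLik(mod)
ll_mul <- -438.198
deviance_mul <- -2 * ll_mul

# Upper-tail chi-square probabilities for the printed statistics
p_anova <- pchisq(0.6645798, df = 1, lower.tail = FALSE)  # mul vs add
p_lrt   <- pchisq(18.75453,  df = 2, lower.tail = FALSE)  # CNVtest, LRT
p_wald  <- pchisq(17.32966,  df = 2, lower.tail = FALSE)  # CNVtest, Wald

print(deviance_mul)
print(c(anova = p_anova, LRT = p_lrt, Wald = p_wald))
```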
> plotSignal(dataMLPA$Gene2, case.control=dataMLPA$casco)
* Other CNV packages
patchwork package:
allele-specific copy number analysis
Visualizations in GWAS studies
> library(GWASTools)
> library(SNPassoc)
> data(SNPs)
> mySNP<-setupSNP(data=SNPs,colSNPs=6:40,sep="")
> myres <- WGassociation(protein, data=mySNP, model="all")
> pvals <- dominant(myres)
> qqPlot(pvals)
Error: could not find function "qqPlot"
> install.packages("qqman")
> library(qqman)
> head(gwasResults)
SNP CHR BP P
1 rs1 1 1 0.9148060
2 rs2 1 2 0.9370754
3 rs3 1 3 0.2861395
4 rs4 1 4 0.8304476
5 rs5 1 5 0.6417455
6 rs6 1 6 0.5190959
> manhattan(gwasResults)
> qq(gwasResults$P, main="Q-Q plot of P-values")
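A Q-Q plot of p-values compares the sorted observed values with those expected under a uniform null distribution, both on the -log10 scale. The following base R sketch builds such a plot without any extra package. The p-values are simulated from a uniform distribution, so they are hypothetical and the points should fall close to the diagonal.

```r
set.seed(7)
p <- runif(1000)                  # hypothetical p-values under the null
n <- length(p)

observed <- -log10(sort(p))       # smallest p-value first
expected <- -log10(ppoints(n))    # expected uniform quantiles

plot(expected, observed,
     xlab = "Expected -log10(p)", ylab = "Observed -log10(p)",
     main = "Q-Q plot of P-values", pch = 20)
abline(0, 1, col = "red")         # line of perfect agreement
```

Points that rise above the red line in the upper right corner indicate more small p-values than the null would produce.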