Skip to content

Latest commit

 

History

History
281 lines (200 loc) · 8.15 KB

160516.md

File metadata and controls

281 lines (200 loc) · 8.15 KB
-------[p.182]-----------------------------

Association tests with CNV data

Copy Number Variation(CNV) : コピー数多型、集団中でコピー数が個人間で異なるゲノム領域を指す。

*通常:2コピー(1個の細胞で父方由来母方由来の2コピー存在す)
*少ない:1コピー(疾患関連に繋がる可能性)
*多い:3コピー以上

参考情報(油谷等JST報告書より)
*コピー数多型の概念図:http://www.jst.go.jp/pr/info/info361/zu1.html
*コピー数多型地図:http://www.jst.go.jp/pr/info/info361/zu2.html

参考CNVアレイ例
*http://www.tmd.ac.jp/mri/press/press31/index.html
*http://www.tokushima-u.ac.jp/_files/00222716/CytoScanWorkflow.pdf

参考CNV-NGS例 (Nord et al., 2011)
*http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3088570/figure/F1/
*http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3088570/figure/F3/
マニュアル
*https://cran.r-project.org/web/packages/CNVassocData/CNVassocData.pdf
*http://www.creal.cat/media/upload/arxius/jr/script_MLPA.R

> install.packages("CNVassoc")
Installing package into ‘C:/Users/ekaminuma/Documents/R/win-library/3.2’
(as ‘lib’ is unspecified)
also installing the dependencies ‘CNVassocData’, ‘mixdist’, ‘mclust’

downloaded 466 KB

package ‘CNVassocData’ successfully unpacked and MD5 sums checked
package ‘mixdist’ successfully unpacked and MD5 sums checked
package ‘mclust’ successfully unpacked and MD5 sums checked
package ‘CNVassoc’ successfully unpacked and MD5 sums checked

> library(CNVassoc)
 要求されたパッケージ CNVassocData をロード中です 
 要求されたパッケージ mixdist をロード中です 
 要求されたパッケージ mclust をロード中です 
    __  ___________    __  _____________
   /  |/  / ____/ /   / / / / ___/_  __/
  / /|_/ / /   / /   / / / /\__ \ / /   
 / /  / / /___/ /___/ /_/ /___/ // /    
/_/  /_/\____/_____/\____//____//_/    version 5.2
Type 'citation("mclust")' for citing this R package in publications.
 要求されたパッケージ survival をロード中です 
> 
> data(dataMLPA)

<参考>
http://finzi.psych.upenn.edu/library/CNVassocData/html/dataMLPA.html

id 	 The unique identifiers of individuals 
casco 	 Case-control stauts 0:control 1:case 
Gene1 	 Intensities for Gene1 
Gene2 	 Intensities for Gene2 
PCR.Gene1 	 True copy number status for Gene1
PCR.Gene2 	 True copy number status for Gene2
quanti 	 Simulated continuos variable. 
cov 	 Simulated continuous variable. 



> str(dataMLPA)
'data.frame':	651 obs. of  8 variables:
 $ id       : Factor w/ 346 levels "H238","H239",..: 1 1 2 2 3 3 4 4 5 5 ...
 $ casco    : int  1 1 1 1 1 1 1 1 1 1 ...
 $ Gene1    : num  0.51 0.45 0 0 0 0 0.23 0.26 0 0 ...
 $ Gene2    : num  0.539 0.639 0.483 0.464 0 ...
 $ PCR.Gene1: Factor w/ 3 levels "del","ht","wt": 3 3 1 1 1 1 2 2 1 1 ...
 $ PCR.Gene2: Factor w/ 3 levels "del","ht","wt": 3 3 3 3 1 1 1 1 1 1 ...
 $ quanti   : num  -0.61 -0.13 -0.57 -1.4 0.83 -2.07 -1.68 -1.4 1.09 0.55 ...
 $ cov      : num  10.83 10.69 9.63 9.87 10.25 ...
> 
> head(dataMLPA)
    id casco Gene1     Gene2 PCR.Gene1 PCR.Gene2 quanti   cov
1 H238     1  0.51 0.5385080        wt        wt  -0.61 10.83
2 H238     1  0.45 0.6392029        wt        wt  -0.13 10.69
3 H239     1  0.00 0.4831572       del        wt  -0.57  9.63
4 H239     1  0.00 0.4640072       del        wt  -1.40  9.87
5 H276     1  0.00 0.0000000       del       del   0.83 10.25
6 H276     1  0.00 0.0000000       del       del  -2.07 10.40
 

> dataMLPA

:

638 V962     0  0.05 0.42722907       del        wt   1.31  9.18
639 V962     0  0.05 0.28655589       del        ht  -0.34 10.55
640 V965     0  0.00 0.47746664       del        wt   1.36  8.52
641 V965     0  0.00 0.61591204       del        wt   1.39 10.41
642 V966     0  0.00 0.53093585       del        wt  -1.45 10.04
643 V966     0  0.00 0.63121841       del        wt   1.18  9.07
644 V970     0  0.26 0.46568148        ht        wt  -1.13  8.49
645 V970     0  0.26 0.36433636        ht        ht   0.85  9.51
646 X067     1  0.22 0.00000000        ht       del   2.13 10.27
647 X067     1  0.24 0.00000000        ht       del  -1.71  9.51
648 X105     1  0.00 0.35324276       del        ht   1.48  8.68
649 X105     1  0.00 0.51050351       del        wt   0.28 10.73
650 X106     1  0.23 0.19154860        ht        ht   1.89  8.08
651 X106     1  0.23 0.22994709        ht        ht  -0.81 10.44

> 

> dataMLPA$Gene1
  [1] 0.51 0.45 0.00 0.00 0.00 0.00 0.23 0.26 0.00 0.00 0.00 0.00 0.00 0.00 0.00
 [16] 0.00 0.22 0.24 0.00 0.00 0.00 0.00 0.00 0.00 0.27 0.28 0.26 0.23 0.00 0.00
 [31] 0.00 0.00 0.00 0.00 0.00 0.00 0.22 0.27 0.00 0.00 0.00 0.00 0.00 0.00 0.00
 [46] 0.00 0.00 0.29 0.26 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.25 0.26 0.00


> plotSignal(dataMLPA$Gene1,case.control=dataMLPA$casco)





> par(mfrow=c(1,2),mar=c(3,4,3,1))
> hist(dataMLPA$Gene1,main="gene 1 histogram",xlab="",ylab="freq")
> hist(dataMLPA$Gene2,main="gene 2 histogram",xlab="",ylab="freq")


> ?cnv

> myCNV <- cnv(x=dataMLPA$Gene2, threshold.0 = 0.01, mix.method ="mixdist" )



phenotypeとassociation解析

> mod <- CNVassoc(formula=casco~myCNV, data=dataMLPA, model="mul")
> 
> mod

Call:  CNVassoc(formula = casco ~ myCNV, data = dataMLPA, model = "mul") 

Coefficients:
         CNV0        CNV1        CNV2      
CNVmult   1.0520923   0.3122567  -0.0970782

Number of individuals: 651 
Number of estimated parameters: 3 
Deviance: 876.396 
> 
> summary(mod)

Call:
CNVassoc(formula = casco ~ myCNV, data = dataMLPA, model = "mul")

Deviance: 876.396 
Number of parameters: 3 
Number of individuals: 651 

Coefficients:
          OR lower.lim upper.lim      SE    stat pvalue
CNV0  1.0000                                           
CNV1  0.4772    0.2742    0.8304  0.2827 -2.6172  0.009
CNV2  0.3169    0.1834    0.5477  0.2791 -4.1169  0.000

(Dispersion parameter for  binomial  family taken to be  1 )


Covariance between coefficients:
     CNV0    CNV1    CNV2   
CNV0  0.0613  0.0000  0.0000
CNV1          0.0186 -0.0032
CNV2                  0.0166



> mod2 <- CNVassoc(formula=casco~myCNV, data=dataMLPA, model="add")
> 
> 
> anova(mod,mod2)


--- Likelihood ratio test comparing 2 CNVassoc models:


Model 1 call:  CNVassoc(formula = casco ~ myCNV, data = dataMLPA, model = "mul") 

Model 2 call:  CNVassoc(formula = casco ~ myCNV, data = dataMLPA, model = "add") 

Chi= 0.6645798 (df= 1 )  p-value= 0.4149477 

  Note: the 2 models must be nested, and this function doesn't check this!
> 
> logLik(mod)
  logLik       df 
-438.198    3.000 


CNV associationが有意か検定
LRT: 尤度比検定
Wald: Wald検定

> CNVtest(mod,type="LRT")
----CNV Likelihood Ratio Test----
Chi= 18.75453 (df= 2 ) , pvalue= 8.462633e-05 


> CNVtest(mod,type=c("Wald","LRT"))
----CNV Wald test----
Chi= 17.32966 (df= 2 ) , pvalue= 0.0001725492 

Warning messages:
1: In if (type.test == 1) { :
  the condition has length > 1 and only the first element will be used
2: In if (x$type == 1) cat("----CNV Wald test----\n") else cat("----CNV Likelihood Ratio Test----\n") :
  the condition has length > 1 and only the first element will be used
> 
> CNVtest(mod,type="Wald")
----CNV Wald test----
Chi= 17.32966 (df= 2 ) , pvalue= 0.0001725492 

> plotSignal(dataMLPA$Gene2, caes.control=dataMLPA$casco)



*その他のCNV package

patchwork package: 
allele-specific copy number analysis

https://genomebiology.biomedcentral.com/articles/10.1186/gb-2013-14-3-r24



======[p.185]================
Visualizations in GWAS studies

> library(GWASTools)
> library(SNPassoc)

> 
> data(SNPs)
> 
> mySNP<-setupSNP(data=SNPs,colSNPs=6:40,sep="")
> 
> myres <- WGassociation(protein, data=mySNP, model="all")

> pvals <- dominant(myres)
> 
> qqPlot(pvals)
Error: could not find function "qqPlot"

----------------------

> install.packages("qqman")
Installing package into ‘C:/Users
> library(qqman)

> head(gwasResults)
  SNP CHR BP         P
1 rs1   1  1 0.9148060
2 rs2   1  2 0.9370754
3 rs3   1  3 0.2861395
4 rs4   1  4 0.8304476
5 rs5   1  5 0.6417455
6 rs6   1  6 0.5190959
> 
> 
> 
> manhattan(gwasResults)
> 


> 
> 
> qq(gwasResults$P, main="Q-Q plot of P-values")



qqplot参考
http://qiita.com/kenmatsu4/items/59605dc745707e8701e0