-
Notifications
You must be signed in to change notification settings - Fork 14
/
Copy pathindex.Rmd
212 lines (158 loc) · 7.78 KB
/
index.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
---
output: github_document
bibliography: inst/REFERENCES.bib
---
```{r, echo = FALSE, message=F}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>",
fig.path = "man/figures/README-",
dev = "png",
dpi = 500,
fig.align = "center",
knitr::opts_chunk$set(comment = NA)
)
library(ggplot2)
library(BGGM)
```
# Bayesian Gaussian Graphical Models <img src="man/figures/logo.png" align="right" alt="" width="150" />
<!-- badges: start -->
[![CRAN Version](http://www.r-pkg.org/badges/version/BGGM)](https://cran.r-project.org/package=BGGM)
[![Downloads](https://cranlogs.r-pkg.org/badges/BGGM)](https://cran.r-project.org/package=BGGM)
[![Build Status](https://travis-ci.org/donaldRwilliams/BGGM.svg?branch=master)](https://travis-ci.org/donaldRwilliams/BGGM)
<!-- badges: end -->
The `R` package **BGGM** provides tools for making Bayesian inference in
Gaussian graphical models [GGM, @williams2020bggm]. The methods are organized around
two general approaches for Bayesian inference: (1) estimation and (2) hypothesis
testing. The key distinction is that the former focuses on either the posterior or posterior
predictive distribution [@Gelman1996a; see section 5 in @rubin1984bayesianly], whereas the
latter focuses on model comparison with the Bayes factor [@Jeffreys1961; @Kass1995].
## <i class="fas fa-cog"></i> Installation
To install the latest release version (`2.1.1`) from CRAN use
```{r gh-installation, eval = FALSE}
install.packages("BGGM")
```
The current developmental version can be installed with
```{r, eval = FALSE}
if (!requireNamespace("remotes")) {
install.packages("remotes")
}
remotes::install_github("donaldRwilliams/BGGM")
```
### <i class="fas fa-skull-crossbones"></i> Dealing with Errors
There are automatic checks for [**BGGM**](https://travis-ci.org/github/donaldRwilliams/BGGM/branches). However, that only checks for Linux and **BGGM** is built on Windows.
The most common installation errors occur on OSX. An evolving guide to address these
issues is provided in the [Troubleshoot Section](https://donaldrwilliams.github.io/BGGM/articles/installation.html).
## <i class="fas fa-clipboard-list"></i> Overview
The methods in **BGGM** build upon existing algorithms that are well-known in the literature.
The central contribution of **BGGM** is to extend those approaches:
1. Bayesian estimation with the novel matrix-F prior distribution [@Mulder2018]
+ Estimation [@Williams2019]
2. Bayesian hypothesis testing with the matrix-F prior distribution [@Williams2019_bf]
+ [Exploratory hypothesis testing](https://github.com/donaldRwilliams/BGGM#Exploratory)
+ [Confirmatory hypothesis testing](https://github.com/donaldRwilliams/BGGM#confirmatory)
3. Comparing Gaussian graphical models [@Williams2019; @williams2020comparing]
+ [Partial correlation differences](https://github.com/donaldRwilliams/BGGM#partial-correlation-differences)
+ [Posterior predictive check](https://github.com/donaldRwilliams/BGGM#posterior-predictive-check)
+ [Exploratory hypothesis testing](https://github.com/donaldRwilliams/BGGM#exploratory-groups)
+ [Confirmatory hypothesis testing](https://github.com/donaldRwilliams/BGGM#confirmatory-groups)
4. Extending inference beyond the conditional (in)dependence structure [@Williams2019]
+ [Predictability](https://github.com/donaldRwilliams/BGGM#predictability)
+ [Posterior uncertainty intervals](https://github.com/donaldRwilliams/BGGM#posterior-uncertainty) for the
partial correlations
+ [Custom Network Statistics](https://github.com/donaldRwilliams/BGGM#custom-network-statistics)
The computationally intensive tasks are written in `c++` via the `R` package **Rcpp** [@eddelbuettel2011rcpp] and the `c++` library **Armadillo** [@sanderson2016armadillo]. The Bayes factors are computed with the `R` package **BFpack** [@mulder2019bfpack]. Furthermore, there are plotting functions
for each method, control variables can be included in the model (e.g., `~ gender`),
and there is support for missing values (see `bggm_missing`).
## Supported Data Types
* **Continuous**: The continuous method was described in @Williams2019. Note that
this is based on the customary [Wishart distribution](https://en.wikipedia.org/wiki/Wishart_distribution).
* **Binary**: The binary method builds directly upon @talhouk2012efficient
that, in turn, built upon the approaches of @lawrence2008bayesian and
@webb2008bayesian (to name a few).
* **Ordinal**: The ordinal methods require sampling thresholds. There are two approach
included in **BGGM**. The customary approach described in @albert1993bayesian
(the default) and the 'Cowles' algorithm described in @cowles1996accelerating.
* **Mixed**: The mixed data (a combination of discrete and continuous) method was introduced
in @hoff2007extending. This is a semi-parametric copula model
(i.e., a copula GGM) based on the ranked likelihood. Note that this can be used for
*only* ordinal data (not restricted to "mixed" data).
## <i class="fas fa-folder-open"></i> Illustrative Examples
There are several examples in the [Vignettes](https://donaldrwilliams.github.io/BGGM/articles/) section.
## <i class="fas fa-play-circle"></i> Basic Usage
It is common to have some combination of continuous and discrete (e.g., ordinal, binary, etc.) variables. **BGGM** (as of version `2.0.0`) can readily be used for these kinds of data. In this example, a model is fitted for the `gss` data in **BGGM**.
### Visualize
The data are first visualized with the **psych** package, which readily shows the data are "mixed".
```r
# dev version
library(BGGM)
library(psych)
# data
Y <- gss
# histogram for each node
psych::multi.hist(Y, density = FALSE)
```
![](man/figures/index_hist.png)
### Fit Model
A Gaussian copula graphical model is estimated as follows
```r
fit <- estimate(Y, type = "mixed")
```
`type` can be `continuous`, `binary`, `ordinal`, or `mixed`. Note that `type` is a misnomer, as the data can consist of *only* ordinal variables (for example).
### Summarize Relations
The estimated relations are summarized with
```r
summary(fit)
#> BGGM: Bayesian Gaussian Graphical Models
#> ---
#> Type: mixed
#> Analytic: FALSE
#> Formula:
#> Posterior Samples: 5000
#> Observations (n): 464
#> Nodes (p): 7
#> Relations: 21
#> ---
#> Call:
#> estimate(Y = Y, type = "mixed")
#> ---
#> Estimates:
#> Relation Post.mean Post.sd Cred.lb Cred.ub
#> INC--DEG 0.463 0.042 0.377 0.544
#> INC--CHI 0.148 0.053 0.047 0.251
#> DEG--CHI -0.133 0.058 -0.244 -0.018
#> INC--PIN 0.087 0.054 -0.019 0.196
#> DEG--PIN -0.050 0.058 -0.165 0.062
#> CHI--PIN -0.045 0.057 -0.155 0.067
#> INC--PDE 0.061 0.057 -0.050 0.175
#> DEG--PDE 0.326 0.056 0.221 0.438
#> CHI--PDE -0.043 0.062 -0.162 0.078
#> PIN--PDE 0.345 0.059 0.239 0.468
#> INC--PCH 0.052 0.052 -0.052 0.150
#> DEG--PCH -0.121 0.056 -0.228 -0.012
#> CHI--PCH 0.113 0.056 0.007 0.224
#> PIN--PCH -0.080 0.059 -0.185 0.052
#> PDE--PCH -0.200 0.058 -0.305 -0.082
#> INC--AGE 0.211 0.050 0.107 0.306
#> DEG--AGE 0.046 0.055 -0.061 0.156
#> CHI--AGE 0.522 0.039 0.442 0.594
#> PIN--AGE -0.020 0.054 -0.122 0.085
#> PDE--AGE -0.141 0.057 -0.251 -0.030
#> PCH--AGE -0.033 0.051 -0.132 0.063
#> ---
```
The summary can also be plotted
```r
plot(summary(fit))
```
![](man/figures/index_summ.png)
### Graph Selection
The graph is selected and plotted with
```r
E <- select(fit)
plot(E, node_size = 12,
edge_magnify = 5)
```
![](man/figures/netplot_index.png)
The Bayes factor testing approach is readily implemented by changing `estimate` to `explore`.
## <i class="fas fa-pen-square"></i> References