# (PART\*) Path Diagrams {-}
# Introduction to Path Diagrams
```{r include = FALSE}
source("admin/common.R")
```
Path diagrams are a graphical representation of a structural equation model. They are a useful tool for understanding the relationships between variables in a model, as well as a way to communicate the model to others.
In this chapter, we will learn how to create path diagrams using DOT notation, the plain-text graph description language used by Graphviz, and explore several R packages that can draw them: `DiagrammeR`, `OpenMx`, and `umx`.
## Constructing a Simple Path Diagram
To better understand path diagrams, let's manually construct a simple example. Consider a model where one latent variable influences two observed variables. We can describe this model in DOT notation once we have reviewed the components of a path diagram.
### Understanding the Components
- **Nodes:** Represent variables, which can be observed or latent. Nodes are usually depicted as circles (latent variables) or squares (observed variables).
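- **Edges:** Represent the causal relationships or correlations between variables. A single-headed arrow from one node to another indicates a directional relationship, whereas a two-headed arrow between two nodes indicates a covariance or correlation. A two-headed arrow that loops back to the same node indicates that variable's variance.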
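As a minimal sketch of how these components map onto DOT syntax, the string below declares nodes with a shape and edges with an arrow type. The node names `F`, `X1`, and `X2` are hypothetical placeholders, and the covariance between `X1` and `X2` is included only to illustrate a two-headed arrow, which Graphviz draws when an edge has `dir = both`.

```r
# Hypothetical three-node example; the names are placeholders,
# not variables from a fitted model.
dot_components <- '
digraph components {
  F  [shape = circle]        // latent variable: circle
  X1 [shape = box]           // observed variable: box
  X2 [shape = box]
  F  -> X1                   // single-headed arrow: directional path
  F  -> X2
  X1 -> X2 [dir = both]      // two-headed arrow: covariance
}
'

# Print the DOT text; pass it to DiagrammeR::grViz() to render it.
cat(dot_components)
# DiagrammeR::grViz(dot_components)
```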
### Step-by-Step Construction
1. **Define the Nodes**: Start by defining your nodes, which represent the variables. Here, we have one latent variable and two observed variables.
2. **Draw the Edges**: Next, draw edges to represent the relationships. In this case, the latent variable influences both observed variables.
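The two steps above can also be sketched programmatically. This is only one possible approach, assuming the illustrative node names `Latent`, `Obs1`, and `Obs2`: it builds the node and edge statements with `sprintf()` and pastes them into a single DOT string.

```r
# Step 1: define the nodes (one latent, two observed).
# The names are illustrative.
latent   <- "Latent"
observed <- c("Obs1", "Obs2")

node_lines <- c(
  sprintf("  %s [shape = circle]", latent),   # latent variable: circle
  sprintf("  %s [shape = box]", observed)     # observed variables: boxes
)

# Step 2: draw one directed edge from the latent variable
# to each observed variable.
edge_lines <- sprintf("  %s -> %s", latent, observed)

# Assemble the statements into a single DOT string.
dot_steps <- paste(
  c("digraph simple_model {", node_lines, edge_lines, "}"),
  collapse = "\n"
)
cat(dot_steps)
```

Generating the statements from vectors keeps the node list and the edge list in sync when indicators are added or removed.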
#### Example Diagram
Let's put these components together using the `DiagrammeR` package to draw the diagram:
```{r simple-path-diagram}
library(DiagrammeR)
grViz("
digraph simple_model {
node [fontname = Arial, fontsize = 10]
Latent [shape = circle, label = 'Latent Variable L']
Obs1 [shape = box, label = 'Observed Variable 1']
Obs2 [shape = box, label = 'Observed Variable 2']
Latent -> Obs1
Latent -> Obs2
}
")
```
This script uses the `DiagrammeR` package to manually create a path diagram where 'Latent Variable L' influences 'Observed Variable 1' and 'Observed Variable 2'. The arrows indicate the direction of influence from the latent to the observed variables.
### ACE Model Example
Here is a more complex example of a path diagram for a univariate ACE model using dot notation and the `DiagrammeR` package in R.
```{r}
library(DiagrammeR)
grViz('digraph "Univariate ACE Model" {
node [style=filled, fontname="Arial", fontsize=16];
/* Observed Trait */
Trait [shape=square, fillcolor="#a9fab1", height=0.5, width=0.5, label="Trait"];
/* Latent Variables */
A [shape=circle, fillcolor="#f4fd78", label="A"];
C [shape=circle, fillcolor="#f4fd78", label="C"];
E [shape=circle, fillcolor="#f4fd78", label="E"];
/* Paths from Latent Variables to Observed Trait */
A -> Trait [dir=forward];
C -> Trait [dir=forward];
E -> Trait [dir=forward];
/* Variance Paths for Latent Variables */
A -> A [dir=both, headport=n, tailport=n];
C -> C [dir=both, headport=n, tailport=n];
E -> E [dir=both, headport=n, tailport=n];
}'
)
```
Below is an explanation of the code snippet:
Latent Variables:
- `A [shape=circle, fillcolor="#f4fd78", label="A"];`: Defines the latent variable A (Additive genetic factors) with a circular shape, yellow color (#f4fd78), and the label "A".
- `C [shape=circle, fillcolor="#f4fd78", label="C"];`: Defines the latent variable C (Common/shared environmental factors) with a circular shape, yellow color, and the label "C".
- `E [shape=circle, fillcolor="#f4fd78", label="E"];`: Defines the latent variable E (Unique environmental factors) with a circular shape, yellow color, and the label "E".
Paths from Latent Variables to Observed Trait:
- `A -> Trait [dir=forward];`: Creates a forward directional path from the latent variable A to the observed trait.
- `C -> Trait [dir=forward];`: Creates a forward directional path from the latent variable C to the observed trait.
- `E -> Trait [dir=forward];`: Creates a forward directional path from the latent variable E to the observed trait.
Variance Paths for Latent Variables:
- `A -> A [dir=both, headport=n, tailport=n];`: Represents the variance of the latent variable A with a bidirectional path.
- `C -> C [dir=both, headport=n, tailport=n];`: Represents the variance of the latent variable C with a bidirectional path.
- `E -> E [dir=both, headport=n, tailport=n];`: Represents the variance of the latent variable E with a bidirectional path.
The resulting path diagram visualizes the relationships between the latent variables (A, C, E) and the observed trait in an ACE model.
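The same structure can be generated for any trait name. The sketch below is a hypothetical helper, `ace_dot()`, that is not part of `DiagrammeR` or any other package. It additionally labels each single-headed path with a coefficient name through the standard Graphviz `label` edge attribute, and it assumes the trait name is a simple identifier that needs no quoting in DOT.

```r
# Hypothetical helper: build DOT text for a univariate ACE diagram.
# Assumes `trait` is a simple identifier (no spaces or punctuation).
ace_dot <- function(trait = "Trait",
                    path_labels = c(A = "a", C = "c", E = "e")) {
  latents <- names(path_labels)
  statements <- c(
    'digraph "Univariate ACE Model" {',
    sprintf("  %s [shape = square];", trait),
    sprintf("  %s [shape = circle];", latents),
    # Single-headed paths from each latent variable to the trait,
    # labelled with the name of the path coefficient
    sprintf('  %s -> %s [dir = forward, label = "%s"];',
            latents, trait, path_labels),
    # Double-headed self-loops: the variance of each latent variable
    sprintf("  %s -> %s [dir = both, headport = n, tailport = n];",
            latents, latents),
    "}"
  )
  paste(statements, collapse = "\n")
}

cat(ace_dot("wt"))
# DiagrammeR::grViz(ace_dot("wt"))  # renders the diagram
```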
## Pre-existing Software
There are several existing software tools that can be used to create path diagrams for specific models, such as the `umx` and `OpenMx` packages in R. These tools provide a user-friendly interface for creating and visualizing path diagrams, making it easier to understand and communicate complex models.
### Creating a Path Diagram with `omxGraphviz`
```{r}
library(OpenMx)
data(demoOneFactor)
manifests <- names(demoOneFactor)
latents <- c("G1")
model1 <- mxModel("One Factor", type="RAM",
manifestVars = manifests,
latentVars = latents,
mxPath(from=latents, to=manifests),
mxPath(from=manifests, arrows=2),
mxPath(from=latents, arrows=2, free=FALSE, values=1.0),
mxData(cov(demoOneFactor), type="cov",numObs=500)
)
omxGraphviz(model1, "one-factor-generated.dot")
```
The code above creates a path diagram for a one-factor model using the `OpenMx` package. The model includes one latent variable (G1) and five manifest variables (x1 to x5). The `mxModel` function defines the model, and each `mxPath` call specifies a set of paths: the factor loadings, the residual variances of the manifest variables, and the variance of the latent variable, which is fixed at 1. The `mxData` function supplies the observed covariance matrix. Finally, the `omxGraphviz` function writes a graphical representation of the model to a dot file.
The dot file is reproduced below with added comments explaining the purpose of each statement. It can be visualized using graph visualization tools like Graphviz or packages like `DiagrammeR`.
```
digraph "One Factor" {
// Setting the style and font for all nodes
node [style=filled, fontname="Arial", fontsize=16];
/* Manifest Variables */
// Grouping all manifest variables (x1 to x5) to be on the same rank (horizontal alignment)
{ rank = max; x1; x2; x3; x4; x5 }
// Defining each manifest variable with square shape, color, and size
x1 [shape=square, fillcolor="#a9fab1", height=0.5, width=0.5];
x2 [shape=square, fillcolor="#a9fab1", height=0.5, width=0.5];
x3 [shape=square, fillcolor="#a9fab1", height=0.5, width=0.5];
x4 [shape=square, fillcolor="#a9fab1", height=0.5, width=0.5];
x5 [shape=square, fillcolor="#a9fab1", height=0.5, width=0.5];
/* Latent Variables */
// Defining the latent variable G1 with a circular shape and color
G1 [shape=circle, fillcolor="#f4fd78"];
/* Paths */
// Defining directional paths from the latent variable G1 to each manifest variable
G1 -> x1[dir=forward];
G1 -> x2[dir=forward];
G1 -> x3[dir=forward];
G1 -> x4[dir=forward];
G1 -> x5[dir=forward];
// Defining bidirectional paths for each manifest variable to represent error terms
x1 -> x1[dir=both, headport=s, tailport=s];
x2 -> x2[dir=both, headport=s, tailport=s];
x3 -> x3[dir=both, headport=s, tailport=s];
x4 -> x4[dir=both, headport=s, tailport=s];
x5 -> x5[dir=both, headport=s, tailport=s];
// Defining a bidirectional path for the latent variable G1 to represent its variance
G1 -> G1[dir=both, headport=n, tailport=n];
}
```
We can load the dot file into R and visualize the path diagram using the `DiagrammeR` package.
```{r}
library(DiagrammeR)
grViz("one-factor-generated.dot")
```
The resulting path diagram shows the relationships between the latent variable G1 and the manifest variables x1, x2, x3, x4, and x5. The directional paths from G1 to the manifest variables represent the factor loadings, while the bidirectional path on each manifest variable represents its residual (error) variance. The bidirectional path on the latent variable G1 represents its variance.
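One quick way to check that a dot file contains the paths you expect is to count its edge statements by arrow type. The function below is a hypothetical helper, not part of `OpenMx`, and it assumes the `dir=forward` and `dir=both` spelling used in the listing above. The example input is an abbreviated, hand-typed subset of that listing.

```r
# Hypothetical helper: count single- and double-headed paths in DOT text.
# Assumes edges are written with dir=forward or dir=both, as in the listing.
count_paths <- function(dot_lines) {
  edges <- grep("->", dot_lines, fixed = TRUE, value = TRUE)
  c(
    single_headed = sum(grepl("dir=forward", edges, fixed = TRUE)),
    double_headed = sum(grepl("dir=both", edges, fixed = TRUE))
  )
}

# Abbreviated subset of the edge statements shown above.
example_lines <- c(
  "G1 -> x1[dir=forward];",
  "G1 -> x2[dir=forward];",
  "x1 -> x1[dir=both, headport=s, tailport=s];",
  "x2 -> x2[dir=both, headport=s, tailport=s];",
  "G1 -> G1[dir=both, headport=n, tailport=n];"
)
count_paths(example_lines)

# With a file on disk:
# count_paths(readLines("one-factor-generated.dot"))
```

Applied to the full annotated listing, this would count five single-headed paths (the loadings) and six double-headed paths (five residual variances plus the variance of G1).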
### Creating a Path Diagram for an ACE Model Using the `umx` Package
The `umx` package provides a user-friendly interface for specifying and estimating structural equation models, and it can draw path diagrams for classical models such as the ACE model used in twin studies. Below is an example of how to create a path diagram for an ACE model using the `umx` package in R. (Many thanks to Tim Bates for providing the source code.)
```{r}
library(umx)
library(tidyverse)
# Thanks to Tim Bates for making some nice tidy Source Code
# =====================
# = Make an ACE model =
# =====================
# 1. Clean data: Add separator and scale
data(twinData)
tmp <- umx_make_twin_data_nice(data=twinData,
sep="", zygosity="zygosity",
numbering=1:2) %>%
umx_scale_wide_twin_data(varsToScale= c("wt", "ht"),
sep= "_T",
data= .)
mzData <- subset(tmp, zygosity %in%
c("MZFF", "MZMM"))
dzData <- subset(tmp, zygosity %in%
c("DZFF", "DZMM"))
# 2. Define paths: You only need the paths for one person:
paths <- c(
umxPath(v1m0 = c("a1", "c1", "e1")),
umxPath(means = c("wt")),
umxPath(c("a1", "c1", "e1"), to = "wt", values=.2)
)
m1 <- umxTwinMaker("test", paths, mzData = mzData, dzData = dzData)
plot(m1, std= TRUE, means= FALSE)
```
The resulting path diagram shows the relationships between the latent variables A, C, and E (named `a1`, `c1`, and `e1` in the code) and the manifest variable `wt`. The single-headed paths from the latent variables to `wt` are the path coefficients, drawn in standardized form because `std = TRUE`. The bidirectional path on each latent variable represents its variance, which `umxPath(v1m0 = ...)` fixes at 1 (with the mean fixed at 0). Because E already captures the unique environmental variance, including measurement error, `wt` has no separate error term.
This diagram only shows the paths for one twin, but the model is fitted to data from MZ and DZ twin pairs to estimate the genetic, shared environmental, and non-shared environmental influences on the trait.
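To make the link between path coefficients and variance components concrete, here is a small worked example. The coefficients are hypothetical values chosen for illustration, not estimates from `twinData`. It relies on the standard path-tracing result for uncorrelated latent variables with unit variance: the trait variance is the sum of the squared paths, and each squared standardized path is the proportion of variance attributed to that component.

```r
# Hypothetical unstandardized path coefficients (NOT estimates from twinData).
path_coefs <- c(a = 0.6, c = 0.3, e = 0.5)

# With uncorrelated latents of variance 1, the implied trait variance
# is the sum of the squared paths: a^2 + c^2 + e^2.
trait_variance <- sum(path_coefs^2)

# Standardize the paths by dividing by the trait's standard deviation.
std_paths <- path_coefs / sqrt(trait_variance)

# Squared standardized paths: proportion of variance per component.
proportions <- std_paths^2
round(proportions, 3)
```

By construction the three proportions sum to 1, which is a useful sanity check when reading standardized values off an ACE diagram.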