# Conditional Statements for Statistical Analysis
Within this chapter, we'll focus on different statistical use cases for selection
control. Selection control refers to a program control structure that allows for
conditional execution of code. We'll cover the following topics:
- if statements;
- if-else statements;
- if-else if-else statements;
- vectorized ifelse statements; and,
- switch statements in R.
We'll explore how to use these conditional statements for tasks like applying classifications based on thresholds and discretizing continuous values into bins.
## if Statements
We make choices based on conditions every day. For example:
- If the alarm rings, we get up
- If a plant looks dry, we water it
- If it's raining, we take an umbrella
These are **conditional decisions**: we act only when a specific condition is met.
In R, we use `if` statements to make these kinds of decisions. This structure
tells R to perform an action only when the condition is true. The basic
structure of an `if` statement is given in @lst-if-bare-bones.
:::{.columns}
:::{.column}
```{r}
#| eval: false
#| lst-label: lst-if-bare-bones
#| lst-cap: if Statement Structure
if (condition) {
  # Action if condition is true
}
```
:::
:::{.column}
```{mermaid}
%%| echo: false
%%| label: fig-if-bare-bones
%%| fig-cap: Visual flow of an if statement.
graph TD
A([Start]) --> B{Condition}
B -->|"<condition> is FALSE"| E([End])
B -->|"<condition> is TRUE"| C[Execute if block]
C --> E
style A fill:#f9f,stroke:#333,stroke-width:2px
style B fill:#bbf,stroke:#333,stroke-width:2px
style C fill:#dfd,stroke:#333,stroke-width:2px
style E fill:#f9f,stroke:#333,stroke-width:2px
```
:::
:::
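As a minimal sketch of the structure in @lst-if-bare-bones, the snippet below mirrors the "if a plant looks dry, we water it" decision. The variable names, the moisture reading, and the 30% cutoff are all hypothetical values chosen for illustration. The sketch also shows that the condition is an ordinary logical value (`TRUE` or `FALSE`) that can be stored and inspected before it is used.

```r
# Hypothetical soil-moisture reading (percent); the cutoff is an assumption
soil_moisture <- 22
dry_cutoff <- 30

# The condition is a single logical value that can be inspected on its own
is_dry <- soil_moisture < dry_cutoff

watered <- FALSE
if (is_dry) {
  # Runs only because is_dry is TRUE
  watered <- TRUE
  print("Water the plant")
}
```

Storing the condition in a named variable such as `is_dry` is optional, but it can make longer conditions easier to read and check.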
### Control Flow Diagrams
We can visualize the flow of an `if` statement using a control flow diagram.
Alongside the `if` statement structure in @lst-if-bare-bones, we also included
a control flow diagram in @fig-if-bare-bones. By following the arrows in the
diagram from the start to the end of the process we can see a step-by-step
breakdown of the process involved in executing an `if` statement.
The control flow diagram for the `if` statement consists of five steps:
1. **Start**: Begin the process
2. **Condition**: Check if the condition is true or false
3. **If true**: Execute the code in the `if` block
4. **If false**: Skip the `if` block and continue to the end
5. **End**: Finish the process
For each step, the diagram uses color-coding and shapes for clarity. We've
created a key in @fig-flow-chart-key to explain the symbols used in the flow chart.
```{mermaid}
%%| label: fig-flow-chart-key
%%| fig-cap: Symbols used in constructing the flow chart diagram for control flow.
%%| echo: false
graph LR
A([Start/End]) --- A1([Oval: Indicates the start or end of a process])
style A fill:#f9f,stroke:#333,stroke-width:2px
style A1 fill:#fff,stroke:#fff,stroke-width:0px
B{Decision} --- B1[Diamond: Represents a decision or branching point]
style B fill:#bbf,stroke:#333,stroke-width:2px
style B1 fill:#fff,stroke:#fff,stroke-width:0px
C[Process] --- C1[Rectangle: Represents a process or action step]
style C fill:#dfd,stroke:#333,stroke-width:2px
style C1 fill:#fff,stroke:#fff,stroke-width:0px
D((Connector)) --- D1[Circle: Used as a connector between flow lines]
style D fill:#fff,stroke:#333,stroke-width:2px
style D1 fill:#fff,stroke:#fff,stroke-width:0px
E[/Input/Output/] --- E1[Parallelogram: Represents input or output operations]
style E fill:#ffd,stroke:#333,stroke-width:2px
style E1 fill:#fff,stroke:#fff,stroke-width:0px
linkStyle 0 stroke-width:0px
linkStyle 1 stroke-width:0px
linkStyle 2 stroke-width:0px
linkStyle 3 stroke-width:0px
linkStyle 4 stroke-width:0px
```
We'll frequently use these diagrams to illustrate the flow of control structures.
### Example: Classifying a Value Based on a Threshold
Let's consider an example of threshold-based classification, a fundamental
concept in statistics often used for decision-making. In this scenario, we
evaluate a value against a predetermined threshold to categorize it accordingly.
Usually, we will estimate the value of a variable and compare it to a pre-set
threshold.
For instance, if we have a test score and want to determine whether it meets the
passing threshold, we can use an `if` statement. Suppose we have a score of 75
and a passing threshold of 80. We can then use the following code:
```r
value <- 75
threshold <- 80
if (value >= threshold) {
  print("Pass")
}
```
```{mermaid}
%%| label: fig-if-threshold-example
%%| fig-cap: Visual representation of the if statement with a threshold.
%%| echo: false
graph LR
A([Start]) --> B["value = 75<br>threshold = 80"]
B --> C{"value ≥ threshold"}
C -->|TRUE| D["Print 'Pass'"]
C -->|FALSE| F([End])
D --> F
style A fill:#f9f,stroke:#333,stroke-width:2px
style B fill:#dfd,stroke:#333,stroke-width:2px
style C fill:#bbf,stroke:#333,stroke-width:2px
style F fill:#f9f,stroke:#333,stroke-width:2px
```
In this example, the program checks if the score **meets or exceeds** the threshold.
If it does, it prints `"Pass"`. In this case, since 75 is less than 80, the
condition `value >= threshold` is `FALSE`. Therefore, the code inside the `if`
block (printing `"Pass"`) is not executed.
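Because the example above prints nothing, it can help to see the same check run on scores that fall on either side of the threshold. The sketch below wraps the check in a hypothetical helper, `report_pass()`, which is not part of the original example; it returns the outcome invisibly so the result can be inspected without extra printing.

```r
# Hypothetical helper wrapping the threshold check from the example above
report_pass <- function(value, threshold = 80) {
  passed <- FALSE
  if (value >= threshold) {
    print("Pass")
    passed <- TRUE
  }
  # Return the outcome invisibly so it can be checked without extra printing
  invisible(passed)
}

report_pass(75)  # condition is FALSE: nothing is printed
report_pass(80)  # boundary case: 80 >= 80 is TRUE, so "Pass" is printed
report_pass(92)  # condition is TRUE: "Pass" is printed
```

The boundary call with a score of 80 illustrates why the comparison uses `>=` rather than `>`: a score exactly equal to the threshold counts as meeting it.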
In the next section, we'll explore the `if-else` statement, which allows us to
specify an alternative action if the condition is not met.
## if-else Statements
In real life, we often have backup plans when making decisions. For example:
- if it's raining, take an umbrella; otherwise, bring sunglasses.
- if the traffic is heavy, take a different route; otherwise, stay on the same route.
- if you're hungry, eat a snack; otherwise, wait for the next meal.
These scenarios illustrate conditional decisions with both primary and secondary
actions. In programming, we use similar logic to control program flow. The
`if-else` statement in R allows us to execute one block of code if a condition
is true and another _different_ block of code if it's false. The structure of
an `if-else` statement and its control flow diagram are shown in @lst-if-else-bare-bones
and @fig-if-else-bare-bones, respectively.
:::{.columns}
:::{.column}
```{r}
#| eval: false
#| lst-label: lst-if-else-bare-bones
#| lst-cap: if-else Statement Structure
if (condition) {
  # code to execute if condition is TRUE
} else {
  # code to execute if condition is FALSE
}
```
:::
:::{.column}
```{mermaid}
%%| echo: false
%%| label: fig-if-else-bare-bones
%%| fig-cap: Visual representation of the if-else statement.
graph TD
A([Start]) --> B{Condition}
B -->|"<condition> is TRUE"| C[Execute if block]
B -->|"<condition> is FALSE"| D[Execute else block]
C --> E([End])
D --> E
style A fill:#f9f,stroke:#333,stroke-width:2px
style B fill:#bbf,stroke:#333,stroke-width:2px
style C fill:#dfd,stroke:#333,stroke-width:2px
style D fill:#dfd,stroke:#333,stroke-width:2px
style E fill:#f9f,stroke:#333,stroke-width:2px
```
:::
:::
### Example: Revisiting Classifying a Value Based on a Threshold
Let's revisit the example of classifying a value based on a threshold. In this
case, we'll use an `if-else` statement to provide outcomes for both scenarios.
Just as before, we'll print `"Pass"` if the value is greater than or equal to
the threshold. With the addition of the `else` clause, we'll print `"Fail"` if
the value is less than the threshold.
```{r}
value <- 75
threshold <- 80
if (value >= threshold) {
  print("Pass")
} else {
  print("Fail")
}
```
```{mermaid}
%%| label: fig-if-else-threshold-example
%%| fig-cap: Visual representation of the if-else statement with a threshold.
%%| echo: false
graph LR
A([Start]) --> B["`value = 75
threshold = 80`"]
B --> C{"value ≥ threshold"}
C -->|TRUE| D["Print 'Pass'"]
C -->|FALSE| E["Print 'Fail'"]
D --> F([End])
E --> F
style A fill:#f9f,stroke:#333,stroke-width:2px
style B fill:#dfd,stroke:#333,stroke-width:2px
style C fill:#bbf,stroke:#333,stroke-width:2px
style F fill:#f9f,stroke:#333,stroke-width:2px
```
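In R, an `if-else` statement is also an expression: it evaluates to the value of whichever branch runs. As a small variation on the example above, the sketch below stores the classification in a variable (the name `result` is our own choice) instead of printing inside each branch.

```r
value <- 75
threshold <- 80

# if-else evaluates to the value of whichever branch runs
result <- if (value >= threshold) {
  "Pass"
} else {
  "Fail"
}

print(result)
```

Keeping the label in a variable makes it available for later steps, such as adding it to a summary or using it in another condition.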
### Example: Significance Testing
In statistical analysis, we often use $p$-values to determine the significance of
results. A $p$-value is a measure of the strength of evidence against the null
hypothesis. If the $p$-value is less than a pre-set significance level (commonly
$\alpha = 0.05$), we reject the null hypothesis. We can use an `if-else` statement to
determine the significance of a $p$-value.
In the following example, we'll check if a $p$-value is less than the significance
level of 0.05. If it is, we'll print `"Reject the null hypothesis."`;
otherwise, we'll print `"Do not reject the null hypothesis."`.
```{r}
# Sample p-value from a statistical test
p_value <- 0.03 # Example p-value
# Significance level
alpha <- 0.05
# Check the significance of the p-value
if (p_value < alpha) {
  print("Reject the null hypothesis.")
} else {
  print("Do not reject the null hypothesis.")
}
```
```{mermaid}
%%| label: fig-if-else-p-value-example
%%| fig-cap: Visual representation of the if-else statement applied to significance testing.
%%| echo: false
graph LR
A([Start]) --> B["`*p*-value = 0.03
alpha = 0.05`"]
B --> C{"p-value < alpha"}
C -->|TRUE| D["Print 'Reject the null hypothesis.'"]
C -->|FALSE| E["Print 'Do not reject the null hypothesis.'"]
D --> F([End])
E --> F
style A fill:#f9f,stroke:#333,stroke-width:2px
style B fill:#dfd,stroke:#333,stroke-width:2px
style C fill:#bbf,stroke:#333,stroke-width:2px
style F fill:#f9f,stroke:#333,stroke-width:2px
style E fill:#dfd,stroke:#333,stroke-width:2px
```
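In practice, the $p$-value usually comes from a test rather than being typed in by hand. The sketch below applies the same `if-else` decision to a $p$-value extracted from base R's `t.test()`. The data are simulated purely for illustration: the sample size, mean, standard deviation, and seed are all assumptions, not values from the text.

```r
# Simulated data for illustration only (not from the text)
set.seed(123)
sample_values <- rnorm(30, mean = 0.5, sd = 1)

# One-sample t-test of the null hypothesis that the population mean is 0
test_result <- t.test(sample_values, mu = 0)
p_value <- test_result$p.value

# Significance level
alpha <- 0.05

# Same decision rule as in the example above, stored rather than printed
decision <- if (p_value < alpha) {
  "Reject the null hypothesis."
} else {
  "Do not reject the null hypothesis."
}

print(decision)
```

Setting `alpha` before looking at the result keeps the decision rule fixed in advance, which matches the idea of a pre-set significance level.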