-
Notifications
You must be signed in to change notification settings - Fork 0
/
ENVS2018 kioloa results markdown 2024.Rmd
1062 lines (821 loc) · 47.2 KB
/
ENVS2018 kioloa results markdown 2024.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
685
686
687
688
689
690
691
692
693
694
695
696
697
698
699
700
701
702
703
704
705
706
707
708
709
710
711
712
713
714
715
716
717
718
719
720
721
722
723
724
725
726
727
728
729
730
731
732
733
734
735
736
737
738
739
740
741
742
743
744
745
746
747
748
749
750
751
752
753
754
755
756
757
758
759
760
761
762
763
764
765
766
767
768
769
770
771
772
773
774
775
776
777
778
779
780
781
782
783
784
785
786
787
788
789
790
791
792
793
794
795
796
797
798
799
800
801
802
803
804
805
806
807
808
809
810
811
812
813
814
815
816
817
818
819
820
821
822
823
824
825
826
827
828
829
830
831
832
833
834
835
836
837
838
839
840
841
842
843
844
845
846
847
848
849
850
851
852
853
854
855
856
857
858
859
860
861
862
863
864
865
866
867
868
869
870
871
872
873
874
875
876
877
878
879
880
881
882
883
884
885
886
887
888
889
890
891
892
893
894
895
896
897
898
899
900
901
902
903
904
905
906
907
908
909
910
911
912
913
914
915
916
917
918
919
920
921
922
923
924
925
926
927
928
929
930
931
932
933
934
935
936
937
938
939
940
941
942
943
944
945
946
947
948
949
950
951
952
953
954
955
956
957
958
959
960
961
962
963
964
965
966
967
968
969
970
971
972
973
974
975
976
977
978
979
980
981
982
983
984
985
986
987
988
989
990
991
992
993
994
995
996
997
998
999
1000
---
title: "ENVS2018 Environmental Science Field School: Kioloa analyses"
author: "Shoshana Rapley, Saul Cunningham"
date: "17 September 2024"
output:
html_document:
toc: true
number_sections: true
toc_depth: 3
toc_float:
collapsed: true
theme: cerulean
highlight: pygments
code_folding: hide
editor_options:
chunk_output_type: console
knit: (function(inputFile, encoding) { rmarkdown::render(inputFile, encoding = encoding, output_file = file.path(dirname(inputFile), 'ENVS2018 Kioloa Report Results.html')) })
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(warning=FALSE, message=FALSE)
```
# Background
The purpose of the Kioloa week report is to report on the patterns of biodiversity we observed. In particular, you should examine the composition and structure of different plant communities at Kioloa, whether animal (bird) communities also differ, and consider how these patterns relate to the underlying soils and geology.
We are excited to share the whole-class survey results with you all, because there are some really interesting patterns here. Keep in mind when writing the reports that we want you to use this data summary as source material. We are not expecting that you do further, more complex analyses. You can focus on the summary data that we have provided. You may also use the figures/graphs from this markdown in your report.
Also you are not required to report on every piece of data. Given that we have more than 100 species in the list and 14 habitat measures, this would dilute the message of the report. Keep your focus on the data that show patterns of interest. Try to pick a narrative from these data that you find interesting and that you can explain within the constraints of the word count.
# Markdown
This document is called a 'markdown'. It's a way to share code, annotations and results all in one place.
If you are interested in the code you can unfold the chunks by clicking the 'Show' icon. There is also an option on the top right of the document called 'Code' that allows you to show or hide all code chunks. Notes in the code itself that start with a hash explain what the next line of code achieves.
```{r}
# Install and load required packages
pacman::p_load(dplyr, gdata, ggplot2, ggrepel, gt, gtExtras, janitor, lme4, lubridate,
rstudioapi, readxl, sjmisc, tidyr, tidyverse, vegan, viridis)
```
# Vegetation communities
We collected vegetation data using the biometric tool methodology. This included compositional elements (e.g. overstory trees) and functional elements (e.g. woody debris).
## Floral composition
Let's start by looking at a matrix of the floral composition.
Before analysing the data we made a few corrections to what we believe were data entry errors (details can be provided on request). We then organised the data in a way that allowed us to identify the some different plant communities.
Here is how to read the table: Each row is a plant species and each column is a site. In the cells we have marked o (overstorey) m (mid-storey) and g (ground layer) to show where the species occurs.
The vertical bands of colour should help you to see the six plant communities. Species that help define particular communities are highlighted with blocks of colour. The ordering from top to bottom is arranged to make the associations clearer (to arrange the blocks) so it is not perfectly arranged into overstory, mid-storey understorey. The functional group in the first column helps you see what kind of plant it is, and the “observation” in the last column explains how the species relate to the plant communities that we identified.
When you report on the analysis remember to look at where these sites occur in relation to each other and how this relates to the underlying soil, the topography and the distance from the ocean. Also examine how the species composition relates to the structural characteristics that are described by the biometric survey.
```{r}
# read in the data
comp_data <- read_excel("data/ENVS2018 kioloa flora composition 2024.xlsx", skip = 1)
# create a pretty table using the package 'gt'
comp_table <- gt(comp_data) %>%
tab_options(data_row.padding = px(1)) %>%
# add header
tab_header(title = "Floral composition",
subtitle = "ENVS2018 Student data Kioloa 2024") %>%
# replace NAs with blanks
tab_style(cell_text(style = "italic"),locations = cells_body(columns = 2)) %>%
sub_missing(missing_text = " ") %>%
# add column colours for community type (using a 6 colour viridis scale with transparency)
tab_style(style = list(cell_fill(color = "#44015433")),
locations = cells_body(columns = 3:5)) %>%
tab_style(style = list(cell_fill(color = "#41448733")),
locations = cells_body(columns = 6:8)) %>%
tab_style(style = list(cell_fill(color = "#2A788E33")),
locations = cells_body(columns = 9:12)) %>%
tab_style(style = list(cell_fill(color = "#22A88433")),
locations = cells_body(columns = 13:18)) %>%
tab_style(style = list(cell_fill(color = "#7AD15133")),
locations = cells_body(columns = 19:22)) %>%
tab_style(style = list(cell_fill(color = "#FDE72533")),
locations = cells_body(columns = 23:26)) %>%
# add row/column colours for clusters (using 6 colour viridis scale)
tab_style(style = list(cell_fill(color = "#440154FF"), cell_text(color = "white")),
locations = cells_body(columns = 3:5, rows = c(1, 44:49))) %>%
tab_style(style = list(cell_fill(color = "#414487FF"), cell_text(color = "white")),
locations = cells_body(columns = 6:8, rows = c(2:4, 26, 38:43))) %>%
tab_style(style = list(cell_fill(color = "#2A788EFF"), cell_text(color = "white")),
locations = cells_body(columns = 9:12, rows = 50:54)) %>%
tab_style(style = list(cell_fill(color = "#22A884FF"), cell_text(color = "white")),
locations = cells_body(columns = 13:18, rows = 3:14)) %>%
tab_style(style = list(cell_fill(color = "#7AD151FF"), cell_text(color = "white")),
locations = cells_body(columns = 19:22, rows = c(3:4, 15:30))) %>%
tab_style(style = list(cell_fill(color = "#FDE725FF"), cell_text(color = "black")),
locations = cells_body(columns = 23:26, rows = c(3:4, 32:36))) %>%
# add new community grouping names
tab_spanner(label = "Casuarina forest", columns = 3:5) %>%
tab_spanner(label = "Coastal Eucalypt forest", columns = 6:8) %>%
tab_spanner(label = "Grassland/paddock", columns = 9:12) %>%
tab_spanner(label = "Lower slope forest", columns = 13:18) %>%
tab_spanner(label = "Moist gully rainforest", columns = 19:22) %>%
tab_spanner(label = "Upper slope forest", columns = 23:26) %>%
# change tally colour
tab_style(style = list(cell_fill(color = "lightgrey")),
locations = cells_body(columns = 27)) %>%
# change font size
tab_style(style = cell_text(size = px(12)), locations = cells_column_labels()) %>%
tab_style(style = cell_text(size = px(10)), locations = cells_body()) %>%
tab_style(style = cell_text(size = px(11)), locations = cells_column_spanners())
comp_table
```
**CAS: Casuarina forest**
This community has a *Casuarina glauca* overstory. It is without any other canopy tree species and is missing most of the shrubs that are common elsewhere (e.g. *Acacia* species). There are also some shrub and forb species found here that were not seen in other sites.
**CEF: Coastal Eucalypt forest**
This is the only community where *Eucalyptus botryoides* is common in the overstory (though *Corymbia maculata* and *Eucalytptus pilularis* occur also). The mid-storey is characterised by *Acmena smithii* and a mix of other species including *Acacia* species. The ground layer has abundant sedges.
**P: Grassland/paddock**
This community is characterised by the absence of overstorey and mid-storey. The ground layer has grasses, sedges, rushes and some forbs. The species that were abundant here and not elsewhere were exotic forbs.
**LSF: Lower slope forest**
This community has a canopy of *Corymbia maculata*, *Eucalyptus pilularis* and in one site also *Eucalyptus tereticornis*. The mid-storey has many shrubs and small trees including *Acacia* species. *Acacia maidenii* was found in this community but not elsewhere. There are a range of mid and ground layer species that are characteristic of rainforest and upper slope forest sites that are missing from this forest type (such as *Acmena smithii*, *Livistona australis*, and the rainforest-y species).
**RF: Moist gully rainforest**
This community has an emergent overstorey of *Corymbia maculata* and *Eucalyptus pilularis* (like much of the forest in the area) but it also has a number of shorter overstorey and mid-storey species not common in the other communities. The ground layer has climbers (*Hibbertia* species, *Kennedia rubicunda*, *Cissus*, *Cynanchum elegans*) and ferns. There are also a number of species found here that also occur in upper slope forest such as *Acmena smithii*, *Livistona australis*, *Prostanthera sp.*.
**USF: Upper slope forest**
This community has an overstory of *Corymbia maculata* and *Eucalyptus pilularis* and a well-developed mid-storey that includes many shrubs and short trees that are also found in the rainforest. There are also a few species that occur here but not in the other communities.
## Biometric values
Now that we have our floristic community groupings, let's take a look at the biometric data you collected. We will highlight cells for a few key variables based on their values to make it easier to see what's going on.
```{r}
# read in data
biometric_data <- read_excel("data/ENVS2018 kioloa biometric 2024.xlsx") %>%
# drop team column
select(-1)
# make table in gt
biometric_table <- gt(biometric_data) %>%
tab_options(data_row.padding = px(1)) %>%
# change font size
tab_style(style = cell_text(size = px(12)), locations = cells_column_labels()) %>%
tab_style(style = cell_text(size = px(10)), locations = cells_body()) %>%
# add header
tab_header(title = "Biometric data",
subtitle = "ENVS2018 Student data Kioloa 2023") %>%
# format percentages to display %
fmt_percent(columns = 3:13, decimals = 1) %>%
# add data grouping labels
tab_spanner(label = "Percent cover (transects)", columns = 3:10) %>%
tab_spanner(label = "Tree cover", columns = 11:13) %>%
tab_spanner(label = "Feature counts", columns = 14:16) %>%
# add row colours for community type (using a 7 colour viridis scale with transparency)
tab_style(style = list(cell_fill(color = "#44015433")),
locations = cells_body(rows = 1:3, columns = 1:2)) %>%
tab_style(style = list(cell_fill(color = "#41448733")),
locations = cells_body(rows = 4:6, columns = 1:2)) %>%
tab_style(style = list(cell_fill(color = "#2A788E33")),
locations = cells_body(rows = 7:10, columns = 1:2)) %>%
tab_style(style = list(cell_fill(color = "#22A88433")),
locations = cells_body(rows = 11:16, columns = 1:2)) %>%
tab_style(style = list(cell_fill(color = "#7AD15133")),
locations = cells_body(rows = 17:20, columns = 1:2)) %>%
tab_style(style = list(cell_fill(color = "#FDE72533")),
locations = cells_body(rows = 21:24, columns = 1:2)) %>%
# colour columns according to values
# Grass > 40%
tab_style(cell_fill(color = "#F9E3D6"),
locations = cells_body(columns = Grass, rows = Grass > .4)) %>%
tab_style(cell_fill(color = "#F2F2F2"),
locations = cells_body(columns = Grass, rows = Grass < .4)) %>%
# Leaf litter > 30%
tab_style(cell_fill(color = "#F9E3D6"),
locations = cells_body(columns = Litter, rows = Litter > .3)) %>%
tab_style(cell_fill(color = "#F2F2F2"),
locations = cells_body(columns = Litter, rows = Litter < .3)) %>%
# Midstory > 15%
tab_style(cell_fill(color = "#F9E3D6"),
locations = cells_body(columns = Midstory, rows = Midstory > .15)) %>%
tab_style(cell_fill(color = "#F2F2F2"),
locations = cells_body(columns = Midstory, rows = Midstory < .15)) %>%
# Midstory + over > 60%
tab_style(cell_fill(color = "#F9E3D6"),
locations = cells_body(columns = Cover, rows = Cover > .6)) %>%
tab_style(cell_fill(color = "#F2F2F2"),
locations = cells_body(columns = Cover, rows = Cover < .6)) %>%
# CWD length > 100m
tab_style(cell_fill(color = "#F9E3D6"),
locations = cells_body(columns = CWD, rows = CWD > 100)) %>%
tab_style(cell_fill(color = "#F2F2F2"),
locations = cells_body(columns = CWD, rows = CWD < 100)) %>%
# add footnotes for the above
tab_footnote(footnote = "Highlighted grass cover > 40%",
locations = cells_column_labels(columns = Grass)) %>%
tab_footnote(footnote = "Highlighted leaf litter > 30%",
locations = cells_column_labels(columns = Litter)) %>%
tab_footnote(footnote = "Highlighted midstory cover > 15%",
locations = cells_column_labels(columns = Midstory)) %>%
tab_footnote(footnote = "Highlighted combined mid and overstory cover > 60%",
locations = cells_column_labels(columns = Cover)) %>%
tab_footnote(footnote = "Highlighted coarse woody debris where the total length > 100m",
locations = cells_column_labels(columns = CWD))
biometric_table
```
## Visualisation
That's given us a good overview of how the biometric features vary by community type. Let's now summarise and visualise the data as boxplots. A boxplot shows the mean (middle line of the box), quartiles (the box and whiskers) and outliers (dots) and are useful for comparing quantitative data among groups.
```{r}
biometric_data2 <- biometric_data %>%
# clean names
clean_names() %>%
# convert percentages to 0-100
mutate(across(3:13, ~ .x * 100)) %>%
# order the factors (by appearance) to match the table colour palette
mutate(community = as_factor(community))
# break into ground cover data and convert to long format for graphing
cover <- select(biometric_data2, -c(11:16)) %>%
pivot_longer(cols = 3:10, names_to = "feature", values_to = "percent") %>%
mutate(feature = as_factor(feature))
# create box plot for cover
ggplot(cover) +
geom_boxplot(aes(community, percent, fill = community)) +
scale_fill_viridis_d() +
facet_wrap(~feature) +
theme_classic() +
theme(axis.text.x=element_blank(),
axis.ticks.x=element_blank(),
axis.title.x=element_blank())
```
Good. We can see there are generally low values for cryptogam, rock and bare ground, so let's remove those to make the others easier to read.
```{r}
cover2 <- cover %>%
filter(feature %in% c("cover", "forb", "grass", "litter", "shrub"))
# create box plot for cover
ggplot(cover2) +
geom_boxplot(aes(community, percent, fill = community)) +
scale_fill_viridis_d() +
facet_wrap(~feature) +
theme_classic() +
theme(axis.text.x=element_blank(),
axis.ticks.x=element_blank(),
axis.title.x=element_blank())
```
Interesting!
Now let's look at the mid and overstory cover, plus the combined mid and overstory (called in the plot 'cover'). The combined mid and overstory metric gives us an indication of how much light will get through to the floor (which in turn affects which plants can grow there).
```{r}
# break into tree cover data and convert to long format for graphing
tree <- select(biometric_data2, -c(3:10, 14:16)) %>%
pivot_longer(cols = 3:5, names_to = "feature", values_to = "percent") %>%
mutate(feature = as_factor(feature))
# create box plot for cover
ggplot(tree) +
geom_boxplot(aes(community, percent, fill = community)) +
scale_fill_viridis_d() +
facet_wrap(~feature) +
theme_classic() +
theme(axis.text.x=element_blank(),
axis.ticks.x=element_blank(),
axis.title.x=element_blank())
```
Now let's look at the count biometric data: trees, trees with hollows and length of CWD.
```{r}
# break into cover data and convert to long format for graphing
count <- select(biometric_data2, -c(3:13)) %>%
pivot_longer(cols = 3:5, names_to = "feature", values_to = "count") %>%
mutate(feature = as_factor(feature))
# create box plot for count
ggplot(count) +
geom_boxplot(aes(community, count, fill = community)) +
scale_fill_viridis_d() +
facet_wrap(~feature) +
theme_classic() +
theme(axis.text.x=element_blank(),
axis.ticks.x=element_blank(),
axis.title.x=element_blank())
```
Great! You can use these graphs in your report.
## Summary Statistics
And we'll calculate the following summary statistics for all features: mean and standard deviation. You can use these numbers in the report.
```{r}
summary_stats <- biometric_data %>% clean_names() %>%
pivot_longer(cols = 3:16, names_to = "feature", values_to = "value") %>%
group_by(community, feature) %>%
summarise(mean = mean(value),
sd = sd(value)) %>%
pivot_wider(names_from = 2, values_from = 3:4) %>%
sjmisc::rotate_df(cn=TRUE) %>% slice(-1) %>%
rownames_to_column() %>%
rename("Stat" = 1) %>%
separate_wider_delim(Stat, "_", names = c("Stat", "Variable"),
too_many = "merge") %>%
arrange(Variable) %>%
mutate_at(c(3:8), as.numeric)
summary_stats_table <- gt(summary_stats) %>%
tab_options(data_row.padding = px(1)) %>%
# change font size
tab_style(style = cell_text(size = px(12)), locations = cells_column_labels()) %>%
tab_style(style = cell_text(size = px(10)), locations = cells_body()) %>%
# add header
tab_header(title = "Biometric data summary statistics",
subtitle = "ENVS2018 Student data Kioloa 2023") %>%
# drop trailing zeros
fmt_number(drop_trailing_zeros = TRUE) %>%
# format percentages to display %
fmt_percent(rows = c(1,3,5,9,11,15,17,19,21,23,25), decimals = 1) %>%
# add column colours
tab_style(style = cell_fill(color = "#44015433"),
locations = cells_body(columns = 3)) %>%
tab_style(style = cell_fill(color = "#41448733"),
locations = cells_body(columns = 4)) %>%
tab_style(style = cell_fill(color = "#2A788E33"),
locations = cells_body(columns = 6)) %>%
tab_style(style = cell_fill(color = "#22A88433"),
locations = cells_body(columns = 5)) %>%
tab_style(style = cell_fill(color = "#7AD15133"),
locations = cells_body(columns = 7)) %>%
tab_style(style = cell_fill(color = "#FDE72533"),
locations = cells_body(columns = 8)) %>%
# arrange columsn per other tables
cols_move(columns = 5, after = 6)
summary_stats_table
```
# Bird surveys
We conducted expert bird surveys of all the plots. The observer (Sho, Julian or Belinda) recorded all birds seen and heard in a five-minute period in the following distance categories: <25m, 25-50m, 50-100m, >100m and overhead. We also asked you to estimate how many species you could identify during the survey.
## Site visits
Let's start by visualising and summarising the data to get our heads around it. Each site was surveyed twice, so let's look at each visit for each site.
```{r}
# read in the data and clean names
birds_expert <- read_excel("data/ENVS2018 kioloa bird data 2024.xlsx", sheet = 1) %>%
clean_names() %>%
# replace NAs with zeros
replace(is.na(.), 0) %>%
# sum counts by species for inside (0-50m) and outside (50+m) the site, and total
rowwise() %>%
mutate(count_in = sum(x25 + x25_50),
count_out = sum(x50_100 + x100 + overhead),
count_total = sum(count_in + count_out)) %>%
select(!c(8:12)) %>%
mutate(present_in = ifelse(count_in>0, 1, 0),
present_out = ifelse(count_out>0, 1, 0),
present_total = ifelse(count_total>0, 1, 0))
# summarise by site for richness
birds_richness <- birds_expert %>% group_by(site, visit) %>%
mutate(visit = as.factor(visit)) %>%
summarise(richness_in = sum(present_in>0),
richness_out = sum(present_out>0),
richness_total = sum(present_total>0)) %>%
arrange(site)
# plot site visits
ggplot(birds_richness)+
geom_point(aes(factor(site), richness_total, colour = visit)) +
geom_line(aes(factor(site), richness_total)) +
theme_minimal() +
theme(axis.text.x = element_text(angle=45)) +
xlab("site")
# calculate differences
birds_alpha <- birds_richness %>% group_by(site) %>%
mutate(diff = abs(richness_total - lag(richness_total))) %>%
na.omit()
alpha_diff <- data.frame(mean = mean(birds_alpha$diff),
min = min(birds_alpha$diff),
max = max(birds_alpha$diff),
stdev = sd(birds_alpha$diff))
print(alpha_diff)
```
The difference between visits was on average 4 bird species.
How about the difference in diversity detected between each visit? I.e. the beta diversity.
```{r}
birds_beta <- data.frame()
sites <- unique(birds_expert$site)
for (i in 1:24) {
visit1 <- filter(birds_expert, site == sites[i] & visit == 1)
visit2 <- filter(birds_expert, site == sites[i] & visit == 2)
diff_a <- setdiff(visit1$common_name, visit2$common_name)
diff_b <- setdiff(visit2$common_name, visit1$common_name)
out <- data.frame(site = sites[i], beta = length(diff_a) + length(diff_b))
birds_beta <- rbind(birds_beta, out)
}
beta_diff <- data.frame(mean = mean(birds_beta$beta),
min = min(birds_beta$beta),
max = max(birds_beta$beta),
stdev = sd(birds_beta$beta))
differences <- rbind(alpha_diff, beta_diff) %>%
add_column(type = c("alpha", "beta"), .before = "mean")
print(differences)
```
Although the total difference between the visits was small, the beta diversity was greater! There was always differences in the diversity between visits (the minimum is 7) and the mean beta diversity was 14.
This is part of why we do repeat visits to site- to capture the diversity. Visits may differ by time of day (e.g. 6am or 7am) or weather.
Interestingly, this is the same as last year - where the mean difference between observers was also 4, and the mean beta diversity was also 14.
## Species richness
From here on we will use the total richness detected (i.e. total number of unique species across both visits).
Let's summarise the species richness by sites. We will just do this for birds 'in' the site (i.e. within 50m, since then we can compare these data with the vegetation). Let's colour these by floristic community.
```{r}
# filter to birds within 50m
birds_insite <- birds_expert %>%
filter(present_in>0) %>%
group_by(site) %>%
summarise(richness = length(unique(common_name)))
# format for us in the package 'vegan'
birds_vegan <- birds_expert %>%
select(c(4,6,8)) %>%
pivot_wider(names_from = common_name, values_from = count_in, values_fn = sum, values_fill = 0) %>%
column_to_rownames(var = "site")
# create table of sites and communities
communities <- data.frame(site = c(1, 2, 4,
3, 5, 7,
8, 9, 10, 12,
6, 11, 13, 14, 16, 18,
15, 17, 19, 20,
21, 22, 23, 24),
community = c(rep("CAS", 3),
rep("CEF", 3),
rep("P", 4),
rep("LSF", 6),
rep("RF", 4),
rep("USF", 4))) %>%
mutate(community = as_factor(community))
# assign communities to bird data
birds_insite2 <- left_join(communities, birds_insite) %>%
mutate(community = as_factor(community)) %>%
arrange(community)
# set the plots in the right order for plotting
birds_insite2$site <- factor(birds_insite2$site, levels = birds_insite2$site)
# plot by community
ggplot(birds_insite2) +
geom_boxplot(aes(community, richness, fill = community))+
scale_fill_viridis_d()+
theme_minimal() +
theme(axis.text.x = element_blank())
# summary statistics
birds_summary <- birds_insite2 %>% group_by(community) %>%
summarise(mean_richness = mean(richness),
stdev = sd(richness))
print(birds_summary)
# test if there is a difference in richness by community
m1 <- aov(richness ~ community,
data = birds_insite2)
summary (m1)
TukeyHSD(m1)
plot(TukeyHSD(m1), las = 2, cex.axis=0.75)
```
Interesting! The moist gully rainforest had the highest species richness, with an average of 16 species. The paddock (unsuprisingly) had the lowest.
There was a significant difference in richness between the community types. Based on the post-hoc Tukey's test (which tests for pairwise difference) we found that the differences were between rainforest and casuarina, rainforest and paddock, and lower slope forest and paddock. The Tukey test plot shows us the direction and strength of the pairwise differences; e.g. in the 'RF-P' pairing, rainforest has many more species than paddock, and since this crosses the zero, this is a significant difference.
**Please note**, you do not have to use the statistical analyses in your report unless you feel confident doing so. If not, stick to the summary data and interpretation of the tables and figures.
## Beta diversity
But as we know, richness doesn't tell us everything. What about beta diversity between the sites and communities?
```{r}
# select all birds in sites
species <- birds_expert %>%
filter(present_in==1) %>%
# add community
left_join(communities)
# for loop of birds only found in that site
sites_beta <- data.frame()
sites <- unique(birds_expert$site)
for (i in 1:24) {
site_test <- species %>% filter(site == sites[i])
site_other <- species %>% filter(!site == sites[i])
diff_site <- setdiff(site_test$common_name, site_other$common_name)
if(length(diff_site) == 0){
next
}
out <- data.frame(site = sites[i], unique = diff_site)
sites_beta <- rbind(sites_beta, out)
}
print(sites_beta)
```
Cool. Half of the sites had at least one unique bird species found no where else. The sites with two unique birds were site 9 (the paddock, with magpie lark and nankeen kestrel) and site 14 (with eastern rosella and satin bowerbird). Satin bowerbird was recorded at other sites during the bird survey but outside the 50m area. I suspect the satin bowerbird was found inside site 14 because it was foraging on fruiting native cherry *Exocarpos cupressiformis*.
Let's now look at the unique diversity across the floristic communities.
```{r}
# for loop of birds only found in that community
comm_beta <- data.frame()
comm <- as.character(unique(communities$community))
for (i in 1:6) {
comm_test <- species %>% filter(community == comm[i])
comm_other <- species %>% filter(!community == comm[i])
diff_comm <- setdiff(comm_test$common_name, comm_other$common_name)
if(length(diff_comm) == 0){
next
}
out <- data.frame(community = comm[i], unique = diff_comm)
comm_beta <- rbind(comm_beta, out)
}
# summarise unique birds
comm_beta_sum <- comm_beta %>%
mutate(community = as_factor(community)) %>%
group_by(community) %>%
summarise(unique_species = length(unique))
# plot unique birds by community
ggplot(comm_beta_sum)+
geom_point(aes(unique_species, community, colour = community), size = 5) +
scale_colour_viridis_d()+
xlim(0,6)+
theme_minimal()
# print the unique birds
print(comm_beta)
```
Very cool! There were a total of 20 bird species only found in one community type. The lower slope forest and the paddock had the greatest number of unique species (n=5).
We only detected Mistletoebird in the lower slope eucalypt forest. As the name suggests, Mistletoebirds specalise on mistletoe fruit. Mistletoes are hemiparasitic plants that grow on the branches of other plants. There are approximately 100 species of mistletoe in Australia and these often have only one or two host trees. Mistletoes are an important element of eucalypt forests and woodlands because they provide fruit (for frugivorous birds) and, more importantly, they provide pockets of cover that are good for nesting in. Of the Australian birds that construct nests in trees, roughly two-thirds of them will preferentially nest in mistletoe. Mistletoe can have a bad reputation in popular culture because of the misconception that they kill trees. In some cases, mistletoe loads become high and this places additional water stress on the host tree; but the underlying cause of this issue is habitat fragmentation. It is possible that we detected Mistletoebird in site 13 because of the presence of Mistletoe. Interestingly, this happened last year too - where mistletoebird was only detected in site 13.
Despite the apparent 'lack' of 'habitat' in the paddocks, there are a number of birds that are found in the paddock and nowhere else. Different birds have different requirements, some of which include open grassland areas and some are resilient to degraded landscapes. For example, the Australasian Pipit is a grassland insectivore that is mostly usually on the ground in open pasture or native grassland. It nests on the ground in tussocks.
We only found Topknot piegon in the rainforest, which is unsurprising since they are a rainforest species (as well as wet sclerophyll) especially found along creeks and rivers. They are largely frugivores (eating fruit and berries, but also seeds), and can eat (and disperse) hard to digest fruits. They are an important fruit disperser for rainforest plants.
These are just a couple of examples of how unique species relate to the vegetation and landscape. You could look into more of the unique species' ecology for your report.
## Diversity indices
We can also calculate indices for diversity aside from richness. For instance, Simpsons's diversity index takes into account abundance as well as richness. Why do we care about abundance? Some species are not evenly distributed in the landscape, even if they are found in more than one place. This metric helps us take that into account.
```{r}
# calculate simpson's diversity index for each site
communities <- communities %>%
mutate(site = as.character(site))
simpson <- as.data.frame(diversity(birds_vegan, index = "simpson")) %>%
rownames_to_column() %>%
rename(site = 1,
simpsons = 2) %>%
left_join(communities)
simpson2 <- diversity(birds_vegan, index = "simpson")
# plot simpsons by community
ggplot(simpson) +
geom_boxplot(aes(community, simpsons, fill = community))+
scale_fill_viridis_d()+
theme_minimal() +
theme(axis.text.x = element_blank())+
ylim(c(0.6, 1))
# test difference
simpson_aov <- aov(simpson2 ~ community, data = communities)
anova(simpson_aov)
```
Note the y-axis is limited to a minimum of 0.6 to make this easier to read.
The moist gully rainforest had the highest average Simpson's diversity index, which means it is more biodiverse in terms of both species richness and abundance. There was no significant in Simpson's diversity index among the communities.
## Bird communities
**(This section is interesting but you don't have to include it in your report)**
So far we have looked at the bird richness by the floristic communities. But what about the bird communities? That is, which species tend to be together due to shared habitat features they prefer (or avoid).
We'll use a constrained ordination to determine bird communities. An ordination examines communities across space and a contrained ordination examines communities across specific environmental variables (in our case, biometric data). The type we are doing is called 'Canonical Correspondence Analysis' (CCA).
For more information on how this works you can use the same tutorial I did to produce this: https://rpubs.com/an-bui/vegan-cheat-sheet
```{r}
# exclude paddocks because way out on their own
birds_vegan2 <- birds_vegan %>%
rownames_to_column() %>%
filter(!rowname %in% c("P1","P2","P3")) %>%
column_to_rownames()
# CCA
birdCCA <- cca(birds_vegan2 ~ Forb + Grass + Litter + Midstory + Overstory +
Cover + Trees + Hollows + CWD, data = biometric_data)
# extracting the bird community data
CCA_bird <- as.data.frame(birdCCA$CCA$v) %>%
rownames_to_column()%>%
rename(bird = rowname)
# plot
ggplot() +
geom_point(data = CCA_bird, aes(x = CCA1, y = CCA2), color = "darkblue")+
geom_text_repel(data = CCA_bird, aes(x = CCA1, y = CCA2, label = bird),
nudge_y = -0.05) +
theme_minimal()
```
Awesome! Let's read this figure. You can essentially ignore the axes here (arbitrary numbers in ordination space). What we are interested in is the grouping of birds. What this figure shows us is how birds cluster in the landscape according to the vegetation features (at least the ones we measured). And it looks pretty good! Not all the birds are labelled because the ones in the middle are so overlapping (read: likely to co-occur) that they didn't get shown.
There are plenty of groupings I would expect to see here, for example Magpie, Willie Wagtail, Jacky Winter, Pipit, Nankeen Kestrel and Magpie-lark are all grassland birds, which were detected together in the paddock/grassland sites.
Likewise, Topknot Pigeon, Eastern Yellow Robin, Rose Robin and Superb Lyrebird all prefer rainforest. The Grey Butcherbird is an oddity here, as they aren't a rainforest specalist.
It is very interesting that the two fairywren species are similarly distributed in ordination space - this indicates we found them in similar vegetation.
## Student estimates of bird richness
**(This section is interesting but you don't have to include it in your report)**
And out of interest, how well did you do at estimating the bird richness at each site? Let's compare average expert and student counts. These are total (i.e. all within 50m and >50m).
```{r}
# read in data
birds_student <- read_excel("data/ENVS2018 kioloa bird data 2024.xlsx", sheet = 2) %>%
clean_names() %>%
mutate(estimate = as.numeric(estimate)) %>%
select(3:4) %>%
rename(student = 2) %>%
pivot_longer(2, names_to = "observer", values_to = "birds")
# combine expert and student
birds_compare <- birds_richness %>%
select(c(1,5)) %>%
rename(expert = 2) %>%
pivot_longer(2, names_to = "observer", values_to = "birds") %>%
rbind(birds_student)
# plot expert and student
ggplot(birds_compare)+
geom_boxplot(aes(site, birds, fill = observer), alpha = .8,
position=position_dodge()) +
theme_minimal()+
theme(axis.text.x = element_text(angle=45))
```
In general, you underestimated the total number of species- which is to be expected! It takes a lot of practice and patience to learn bird calls. I hope you enjoyed coming out on the bird surveys with us and I appreciated your enthusiasm and quick learning.
## Acoustic data
**(This section is interesting but you don't have to include it in your report)**
You also surveyed the sites using acoustic meters. We collected data for two hours around dusk and dawn for one night. One of the benefits of acoustic meters is they can help sample diversity that otherwise would be missed in the time constraints of a five-minute count. The other benefit was to teach you how to use this kit and read sonograms! Let's see what we detected. First we'll look at the effect of time of day.
```{r}
# read in
acoustic <- read_excel("data/ENVS2018 kioloa acoustic data 2024.xlsx", sheet = 1) %>%
clean_names() %>%
mutate(site = as.character(site))
acoustic_rich <- acoustic %>% group_by(site,time) %>%
summarise(richness = length(unique(common_name))) %>%
left_join(communities) %>%
mutate(type = "acoustic")
# plot by site and dusk/dawn
ggplot(acoustic_rich)+
geom_boxplot(aes(time, richness, fill = community))+
facet_wrap(~community)+
scale_fill_viridis_d()+
theme_minimal()
# test difference dusk and dawn
tod <- aov(richness ~ time, data = acoustic_rich)
anova(tod)
```
Interestingly, some sites recorded higher richness at dusk than dawn. Note, this also includes frog diversity.
There was no difference in the richness recorded between dusk and dawn.
How did the richness detected on the acoustic meters compare with the bird surveys?
```{r}
# combine acoustic and expert bird counts
acoustic_compare <- select(birds_richness, c(1,3)) %>%
mutate(site = as.character(site)) %>%
rename(richness = 2) %>%
mutate(type = "expert") %>%
left_join(communities) %>%
full_join(select(acoustic_rich, !2))
# plot comparison
ggplot(acoustic_compare) +
geom_boxplot(aes(type, richness, fill = community))+
scale_fill_viridis_d()+
facet_wrap(~community)+
theme_minimal()
# test difference
aco <- aov(richness ~ type, data = acoustic_compare)
anova(aco)
```
Wow! We detected significantly more bird species on the acoustic meters than in the 5-minute expert counts (only considering the species within the 50m radius).
A caveat here is that we simply used 'loudness' of the acoustic data as a proxy for distance, so the two aren't directly comparable. For example, the Masked Lapwing heard on the acoustic meter might have been further away than 50m because their call carries. We could have tested this by playing a known volume tone at 50m from the meters to figure out how 'loud' something needs to be to be in the site.
You can imagine the addition of time to a bird survey as a curve of diminishing returns- you add more time, you get more species, but at a decaying rate of increase over time. So we could achieve higher richness counts by doing say, a 15-minute or even 1-hour bird survey (imagine that!), but that would not be a good use of people-time. This is where acoustic meters can come in handy.
Another important use of acoustic meters is to detect rare things that don't call often. If you were tasked with finding a Night Parrot, you would have to spend 1000s of hours in the remote spinifex desert to potentially hear a short call; which is why agencies like Parks and NGOs such as AWC instead deploy acoustic meters and then use AI to scan through in search of the target species.
How about the beta diversity between the acoustic and expert counts?
```{r}
# for loop of birds only found in that community
aco_beta <- data.frame()
species2 <- species %>% filter(present_in==1)
for (i in 1:24) {
uni_exp <- species2 %>% filter(site == sites[i])
uni_aco <- acoustic %>% filter(site == sites[i])
diff1 <- setdiff(uni_exp$common_name, uni_aco$common_name) %>%
as.data.frame() %>%
rename(unique = 1) %>%
mutate(source = "expert",
site = sites[i])
diff2 <- setdiff(uni_aco$common_name, uni_exp$common_name)%>%
as.data.frame() %>%
rename(unique = 1) %>%
mutate(source = "acoustic",
site = sites[i])
aco_beta <- rbind(aco_beta, diff1, diff2)
}
aco_beta2 <- aco_beta %>%
mutate(site = as.character(site)) %>%
group_by(site, source) %>%
summarise(beta = length(unique(unique))) %>%
left_join(communities)
# plot difference
ggplot(aco_beta2)+
geom_boxplot(aes(source, beta, fill=community))+
scale_fill_viridis_d()+
facet_wrap(~community)+
theme_minimal()
# test difference
aco_b <- aov(beta ~ source, data = aco_beta2)
anova(aco_b)
```
While we found unique species between both methods, there were significantly more unique species found on the acoustic meters that we didn't detect during the expert counts (if we only consider expert counts within the site- this is important because if we include birds outside of the sites we actually detect more birds with the expert surveys. The people can hear further than the units).
Aside from the study sites, were there any unique species found only on the expert counts or acoustic meters across the whole campus?
```{r}
b_exp <- setdiff(species2$common_name, acoustic$common_name) %>%
as.data.frame() %>%
rename(unique = 1) %>%
mutate(source = "expert")
b_aco <- setdiff(acoustic$common_name, species2$common_name)%>%
as.data.frame() %>%
rename(unique = 1) %>%
mutate(source = "acoustic")
aco_beta3 <- rbind(b_exp, b_aco)
print(aco_beta3)
```
Cool! Some of these species were detected during the bird surveys but not within the site itself (i.e. heard at a distance instead of within 50m). For example, we heard Red Wattlebird and Brown Cuckoo-dove during the expert counts but not in the sites. This may be an indication that the acoustic meters are hearing "too far". An excellent example of this is the sea-eagle, which nest just outside site 11. We heard them on the expert counts but did not count them within the sites.
What about out of the sites? Let's sum up the total richness of birds recorded across the whole campus from both methods
```{r}
# all bird species
species_expert <- data.frame(species = unique(birds_expert$common_name),
type = "expert")
species_acoustic <- data.frame(species = unique(acoustic$common_name),
type = "acoustic")
species_all <- rbind(species_expert, species_acoustic) %>%
mutate(type = ifelse(duplicated2(species, bothWays=TRUE)==TRUE,"both",type)) %>%
#remove the frog
filter(!species == "Common Eastern Froglet") %>%
distinct()
# make gt table
species_all %>%
group_by(type) %>%
gt() %>%
tab_options(data_row.padding = px(1)) %>%
# change font size
tab_style(style = cell_text(size = px(12)), locations = cells_column_labels()) %>%
tab_style(style = cell_text(size = px(10)), locations = cells_body())
species_all2 <- species_all %>%
group_by(type) %>%
summarise(count = length(species))
# plot
ggplot(species_all2,aes(1,count,fill=type))+
geom_bar(stat="identity")+
scale_fill_viridis_d()+
theme_minimal()+
theme(axis.title.x=element_blank(),
axis.text.x=element_blank(),
axis.ticks.x=element_blank())
```
We detected a total of 84 bird species (up from 76 in 2023). Of these, 54 were detected on both the acoustic meters and in the bird surveys, where 18 were only detected on the expert counts and 12 in the acoustics.
Species that were uniquely recorded on the acoustic meters (and not in the total expert counts including out of sites) were: yellow-rumped thornbill, white-eared honeyeater, bassian thrush and swamphen. These 5 species were added to the total site richness thanks to the acoustic meter! Note: brush cuckoo and pacific koel were not verified by expert ears and instead were included by AI - I retained these due to likelihood of occurrence, but I'm still not 100% if these are "real" records.
The acoustic meter was our only record of *Litoria quiritatus*, the screaming tree frog - which was only recently split from the bleating tree frog based on citizen science data submitted to the Australian Museum's Frog ID app.
Excitingly, we got our first ever record of Green Catbird during the expert counts! Green Catbird is a rainforest specialist related to bowerbirds, which have a distinct cat-like "yowling" call. Like the bowerbirds, they are frugivores and important seed dispersers.
# Other wildlife results
**(This section is interesting but you don't have to include it in your report)**
You don't have to report on this section, but we wanted to show you some interesting findings.
## Cameras
Let's have a look at how our camera results compared with previous years.
```{r}
# groups
mammals <- c("Eastern Grey Kangaroo", "Swamp Wallaby", "Red-necked Wallaby",
"Short-beaked Echidna", "Common Wombat",
"Brush-tailed Possum", "Krefft's Glider",
"Black Rat", "Bush Rat", "Rat sp", "Swamp Rat", "House Mouse",
"Long-nosed Bandicoot", "White-footed Dunnart", "Brown Antechinus",
"Cat", "Red Fox", "Pig" , "Sheep", "Cow")
# read in data
camera <- read_excel("data/ENVS2018 kioloa data 2017-2024.xlsx", sheet = 2) %>%
clean_names() %>%
distinct() %>%
#remove skinks
filter(!common_name == "Delicate Sun-skink") %>%
mutate(taxa = ifelse(common_name %in% mammals, "mammal", "bird")) %>%
filter(taxa == "mammal") %>%
mutate(common_name = factor(common_name, levels = mammals))
# plot mammals
ggplot(camera)+
geom_point(aes(year, common_name, colour = common_name), size = 4)+
scale_colour_viridis_d()+
theme_minimal()+
theme(legend.position= "none")+
theme(axis.title.y=element_blank())+
scale_y_discrete(limits=rev)+
scale_x_continuous(n.breaks = 7)+
coord_fixed(ratio = 0.6)
```
Here we can see which species were seen in which years. The occurrence of these species might relate to environmental conditions (e.g. drought). Note: the cameras were not placed in rainforest and coastal locations in 2017-2020, only upper and lower slope eucalypt forest and paddock. Also note there was a bushfire in the 2019/2020 Summer. No data was collected in 2021 (covid-19).
## Herpetofauna
Let's do the same thing with the reptiles and frogs (from tin/tile and opportunistic records).
```{r}
# read in data
herpsp <- c("Eastern Small-eyed Snake", "Red-bellied Black Snake", "Diamond Python",
"Jacky Dragon", "Blue-tongued Lizard", "Delicate Sun-skink",
"Weasel Skink", "Skink sp", "Lace Monitor",
"Whistling Tree Frog", "Jervis Bay Tree Frog",
"Peron's Tree Frog", "Blue Mountains Tree Frog",
"Eastern Dwarf Tree Frog", "Southern Green Stream Frog",
"Screaming Tree Frog", "Striped Marsh Frog", "Tyler's Toadlet",
"Common Eastern Froglet", "Haswell's Froglet")
herps <- read_excel("data/ENVS2018 kioloa data 2017-2024.xlsx", sheet = 1) %>%
clean_names() %>%
select(2:3) %>%
# remove mice
filter(!common_name == "House Mouse") %>%
distinct() %>%
mutate(common_name = factor(common_name, levels = herpsp))
# plot
ggplot(herps)+
geom_point(aes(year, common_name, colour = common_name), size = 4)+
scale_colour_viridis_d()+
theme_minimal()+
theme(legend.position= "none")+
theme(axis.title.y=element_blank())+
scale_y_discrete(limits=rev)+
scale_x_continuous(n.breaks = 7)+
coord_fixed(ratio = 0.6)
```
This is really an indication of the survey effort for herpetofauna! In 2018-2020 we only had tin/tile (AKA substrate) checks (which were done by students instead of expert observers) whereas in 2022 we started recording incidental sightings and doing the pond frog surveys.
We recorded two new species for ENVS2018 in 2024: lace monitor (*Varanus varius*) and the Blue Mountains Tree Frog (*Litoria citropa*).
I wonder how these results compare with the frog diversity in the pre-readings?
## Birds over time
Lastly, let's look at how the bird diversity has changed over the years (remember you do not need to include this in your report).
Note: the casuarina forest and rainforest sites were added in 2022.
```{r}
# read in name changes
site_names <- read_excel("data/KCC new and old gps.xlsx") %>%
select(1:2) %>%
rename(old_name = 1,
new_name = 2)
# vector of old site names
old <- unique(site_names$old_name)
# vector of new site names
new <- 1:24