---
output: github_document
---
<!-- README.md is generated from README.Rmd. Please edit that file -->
```{r, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>",
fig.path = "man/figures/README-",
out.width = "100%"
)
```
# imhen
<!-- badges: start -->
[![AppVeyor build status](https://ci.appveyor.com/api/projects/status/github/epix-project/imhen?branch=master&svg=true)](https://ci.appveyor.com/project/epix-project/imhen)
[![Travis build status](https://travis-ci.org/epix-project/imhen.svg?branch=master)](https://travis-ci.org/epix-project/imhen)
<!-- badges: end -->
| | |
| -------- | ----------------------------------------------------------------- |
| <img src="vignettes/imhen.png" alt="" style="width: 100px;"/> | This package contains meteorological data for Vietnam from the Vietnamese Institute of Meteorology, Hydrology and Environment ([IMHEN](http://vnclimate.vn/en/about/imhen/)). The data are monthly records from 67 climatic stations, covering January 1960 to December 2015. The climatic variables are minimum, maximum and average temperatures, absolute and relative humidities, rainfall and hours of sunshine. |
## Installation and loading
You can install `imhen` from [GitHub](https://github.com/epix-project/imhen):
```{r eval = FALSE}
# install.packages("devtools")
devtools::install_github("epix-project/imhen", build_vignettes = TRUE)
```
Once installed, you can load the package:
```{r}
library(imhen)
```
## Usage examples
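The package contains two data frames. The first one is `meteo`, which contains
the climatic variables `Tx`, `Ta` and `Tm` (maximum, average and minimum
temperatures), `aH` and `rH` (absolute and relative humidities), `Rf` (rainfall)
and `Sh` (hours of sunshine), plus time (`year` and `month`) and space
(`station`) information: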
```{r}
head(meteo)
```
Note that the data frame is not "complete": some combinations of `year`,
`month` and `station` are missing:
```{r}
table(with(meteo, table(year, month, station)))
```
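To read the nested `table()` call above, it may help to run it on a small made-up data frame. In this minimal sketch, `toy` and its values are hypothetical and not part of the package: the inner `table()` counts the rows for each combination of `year`, `month` and `station`, and the outer `table()` tallies how many combinations occur 0 times (missing) or once (present).

```r
# Hypothetical toy data: 2 years x 2 months x 2 stations = 8 possible
# combinations, of which only 6 are present.
toy <- data.frame(
  year    = c(2000, 2000, 2000, 2001, 2001, 2001),
  month   = c("Jan", "Feb", "Jan", "Jan", "Feb", "Feb"),
  station = c("A",   "A",   "B",   "A",   "A",   "B")
)

# Inner table: number of rows for each year/month/station combination.
counts <- with(toy, table(year, month, station))

# Outer table: number of combinations that occur 0 times (missing) or once.
tally <- table(counts)
tally
```

Here the tally should report 2 combinations with a count of 0 and 6 with a count of 1.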
The second one is `stations`, which contains the coordinates (`longitude` and
`latitude`) and the `elevation` of each station:
```{r}
head(stations)
```
### Mapping the climatic stations
We can transform the coordinates of the climatic stations into a spatial object:
```{r eval=FALSE}
library(sp)     # provides coordinates() and proj4string()
library(gadmVN)
vietnam <- gadm(level = "country")
coordinates(stations) <- ~ longitude + latitude
proj4string(stations) <- vietnam@proj4string
```
And plot the stations on the map:
```{r eval=FALSE}
plot(vietnam, col = "grey")
points(stations, col = "blue", pch = 3)
```
### Visualizing the climatic stations elevations
We can also look at the elevations of the climatic stations:
```{r}
plot(sort(stations$elevation, decreasing = TRUE), type = "o",
     xlab = "stations ranked by decreasing elevation", ylab = "elevation (m)")
```
### Exploring the climatic variables
Let's look at the temperatures:
```{r fig.height = .5 * 3.5, fig.width = 1.3 * 4.24725}
val <- c("Tm", "Ta", "Tx")
T_range <- range(meteo[, val], na.rm = TRUE)
breaks <- seq(floor(T_range[1]), ceiling(T_range[2]), 2)
par(mfrow = c(1, 3))
for (i in val)
  hist(meteo[[i]], breaks, ann = FALSE, col = "lightgrey", ylim = c(0, 10500))
```
Looks good. Let's check the consistency of the values:
```{r}
for(i in val) print(range(meteo[[i]], na.rm = TRUE))
with(meteo, any(!((Tm <= Ta) & (Ta <= Tx)), na.rm = TRUE))
```
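If the check above ever returned `TRUE`, the same logical condition could be reused to locate the offending rows. A minimal sketch on a made-up data frame (`toy_temp` and its values are hypothetical):

```r
# Hypothetical toy data: row 2 violates Tm <= Ta, row 4 has a missing Tm.
toy_temp <- data.frame(
  station = c("A", "A", "B", "B"),
  Tm = c(15.0, 16.2, 20.1, NA),
  Ta = c(19.5, 15.8, 24.3, 25.0),
  Tx = c(24.1, 25.0, 28.9, 30.2)
)

# TRUE where the ordering Tm <= Ta <= Tx is violated,
# NA where it cannot be evaluated because of a missing value.
bad <- with(toy_temp, !((Tm <= Ta) & (Ta <= Tx)))

# Same test as above: is there at least one violation?
any(bad, na.rm = TRUE)

# which() ignores the NA entries, so only actual violations are returned.
toy_temp[which(bad), ]
```

Using `which()` rather than indexing directly with `bad` avoids getting all-`NA` rows for the records where the comparison is undefined.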
Let's look at the other variables:
```{r}
val <- c("aH", "rH", "Rf", "Sh")
par(mfrow = c(2, 2))
for(i in val) hist(meteo[[i]], col = "lightgrey", ann = FALSE)
```
Looks good too.
```{r}
for(i in val) print(range(meteo[[i]], na.rm = TRUE))
```
### Visualizing the data spatio-temporally
Let's first make a `year`, `month`, `station` template for a full design of the
data:
```{r eval=FALSE}
y <- sort(unique(meteo$year))
m <- factor(levels(meteo$month), levels(meteo$month), ordered = TRUE)
s <- stations$station[order(coordinates(stations)[, "latitude"])]
s <- factor(s, s, ordered = TRUE)
template <- setNames(expand.grid(y, m, s), c("year", "month", "station"))
attr(template, "out.attrs") <- NULL # removing useless attributes
```
The full version of the data:
```{r eval=FALSE}
meteo_full <- merge(template, meteo, all.x = TRUE)
```
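The effect of merging onto a full template can be seen on a small example. In this minimal sketch, `toy_obs`, `toy_template` and `toy_full` are hypothetical names with made-up values: every combination in the template is kept, and the ones without an observation get `NA`.

```r
# Hypothetical toy observations: station "B" has no record for 2001.
toy_obs <- data.frame(
  year    = c(2000, 2001, 2000),
  station = c("A", "A", "B"),
  Ta      = c(24.1, 24.6, 18.3)
)

# Full design: every year crossed with every station.
toy_template <- expand.grid(
  year    = sort(unique(toy_obs$year)),
  station = sort(unique(toy_obs$station)),
  stringsAsFactors = FALSE
)

# all.x = TRUE keeps every row of the template; combinations that are
# absent from toy_obs are filled with NA.
toy_full <- merge(toy_template, toy_obs, all.x = TRUE)
toy_full
```

The same mechanism gives `meteo_full` one row for every combination of `year`, `month` and `station`, whether or not it was observed.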
Let's visualize it:
```{r eval=FALSE}
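# x axis: one date (the 15th of the month) per year/month combination
x <- as.Date(with(unique(meteo_full[, c("year", "month")]),
                  paste0(year, "-", as.numeric(month), "-15")))
# y axis: one index per climatic station
y <- seq_along(stations)
nb <- length(y)
col <- rev(heat.colors(12))
# draws a time-by-station heat map of the variable `var`
show_data <- function(var) {
  image(x, y, t(matrix(meteo_full[[var]], nb)), col = col,
        xlab = NA, ylab = "climatic stations")
  box(bty = "o")
}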
```
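The reshaping inside `show_data()` relies on the rows of `meteo_full` being ordered with the station varying fastest within each date. A minimal sketch of that step, using made-up numbers (`values`, `nb_stations`, `by_station` and `z` are hypothetical names):

```r
# Hypothetical values for 2 stations observed at 3 dates, stored as a long
# vector in which the station index varies fastest.
values <- c(11, 21,   # date 1: station 1, station 2
            12, 22,   # date 2
            13, 23)   # date 3
nb_stations <- 2

# matrix() fills column by column, so each column holds one date.
by_station <- matrix(values, nb_stations)

# t() gives the dates-by-stations layout that image(x, y, z) expects,
# with nrow(z) == length(x) and ncol(z) == length(y).
z <- t(by_station)
dim(z)
```

Each column of `z` is then the time series of one station, which `image()` draws as one horizontal band.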
Missing values (left blank by `image()`) for all the temperature variables:
```{r eval=FALSE}
opar <- par(mfrow = c(2, 2))
for(i in c("Tx", "Ta", "Tm")) show_data(i)
par(opar)
```
These plots clearly show that seasonality is stronger in the north than in the
south. Missing values for the absolute and relative humidities, as well as for
rainfall and hours of sunshine:
```{r eval=FALSE}
opar <- par(mfrow = c(2, 2))
for(i in c("aH", "rH", "Rf", "Sh")) show_data(i)
par(opar)
```
These plots show a strong seasonality of absolute humidity in the north of the
country, an interesting pattern of relative humidity in the center, high
rainfall in the fall in the center, and out-of-phase oscillations of the number
of hours of sunshine between the north and the south. There also seem to be
strange outliers in sunshine in the north around 2008. Let's now combine the
missing values from all the climatic variables:
```{r eval=FALSE}
library(magrittr)
library(dplyr)
meteo_full %<>% mutate(combined = is.na(Tx + Ta + Tm + aH + rH + Rf + Sh))
show_data("combined")
abline(v = as.Date("1995-01-01"))
```
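The `combined` flag above works because arithmetic propagates `NA`: the sum of the variables is `NA` as soon as one of them is missing. A minimal base R sketch on made-up values (`toy_vars` is a hypothetical data frame, reduced to three variables):

```r
# Hypothetical toy data: row 2 is missing Tx, row 3 is missing Rf.
toy_vars <- data.frame(
  Tx = c(30.1, NA,   29.4),
  Ta = c(25.2, 24.8, 24.9),
  Rf = c(120,  80,   NA)
)

# The sum is NA whenever at least one of its terms is NA.
combined <- with(toy_vars, is.na(Tx + Ta + Rf))
combined

# complete.cases() expresses the same idea without the arithmetic.
all(combined == !complete.cases(toy_vars))
```

`complete.cases()` is only suggested here as an alternative formulation. It has the advantage of also working when some columns are not numeric.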
The 6 stations with missing values in recent years (from 1995 onwards) are:
```{r eval=FALSE}
subset(meteo_full, year > 1994 & combined, station, TRUE) %>% unique
```
## Left to do
- pairwise distances
- time series (trends, seasonalities)
- time seasonal variation
- PCA?