Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

estimate densities from point patterns #27

Open
wants to merge 14 commits into
base: master
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from 2 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
132 changes: 132 additions & 0 deletions source/point-pattern-R.Rmd
Original file line number Diff line number Diff line change
@@ -0,0 +1,132 @@
---
title: Estimating densities from a point pattern
maintainer: Thierry Onkelinx
date: '2017-06-28'
output: html_document
params:
cellsize: 20
---

In this example we focus on a set of 10450 coordinates in a small area. The goal is to estimate the local density of points, expressed as the number of point per unit area. The raw coordinates are given in [WGS84 (EPSG:4326)](https://epsg.io/4326), which is a geodetic coordinate system. That is not suited for calculating distances, so we need to re-project the points into a local projected coordinate system. In this case we use [Lambert72 (EPSG:3170)](https://epsg.io/31370). Next we calculate the density. To visualise the density, we have to transform the results back in to WGS84.

The data used in this example is real data by centred to a different location for privacy reasons. The dataset is available on [GitHub](https://github.com/ThierryO/my_blog/tree/master/data/20170628).

First we must read the data into R. Plotting the raw data helps to check errors in the data.

```{r read-data, fig.cap = "Raw data"}
points <- read.delim("point-pattern/points.txt", sep = " ")
library(ggmap)
map <- get_map(
location = c(lon = mean(points$lon), lat = mean(points$lat)),
zoom = 17,
maptype = "satellite",
source = "google"
)
ggmap(map) +
geom_point(data = points, alpha = 0.1, colour = "blue", shape = 4)
```

The next step is to convert the dataset in to a `SpatialPoints` object with WGS84 project and re-project it into Lambert72. `sp::CRS()` defines the coordinate systems. `sp::coordinates()<-` is an easy way to convert a `data.frame` into a `SpatialPointsDataFrame`, but without specifying a coordinate system. Therefore we need to override the `proj4string` slot with the correct coordinate system. `sp::spTransform()` converts the spatial object from the current coordinate system to another coordinate system.

```{r reproject}
library(sp)
crs_wgs84 <- CRS("+init=epsg:4326")
crs_lambert <- CRS("+init=epsg:31370")
coordinates(points) <- ~lon + lat
points@proj4string <- crs_wgs84
points_lambert <- spTransform(points, crs_lambert)
```

Once we have the points into a projected coordinate system, we can calculate the densities. We start by defining a grid. `cellsize` is the dimension of the square grid cell in the units of the projected coordinate system. Meters in case of Lambert72. The boundaries of the grid are defined using `pretty()`, which turns a vector of numbers into a "pretty" vector with rounded numbers. The combination of the boundaries and the cell size determine the number of grid cells `n` in each dimension. `diff()` calculates the difference between to adjacent numbers of a vector. The density is calculated with `MASS::kde2d()` based on the vectors with the longitude and latitude, the number of grid cells in each dimension and the boundaries of the grid. This returns the grid as a list with elements `x` (a vector of longitude coordinates of the centroids), `y` (a vector of latitude coordinates of the centroids) and `z` (a matrix with densities). The values in `z` are densities for the 'average' point per unit area. When we multiply the value `z` with the area of the grid cell and sum all of them we get 1. So if we multiple `z` with the number of points we get the density of the points per unit area.

We use [`dplyr::mutate()`](http://dplyr.tidyverse.org/) to convert it into a `data.frame`. The last two steps convert the centroids into a set of coordinates for square polygons.

```{r density}
library(MASS)
library(dplyr)
xlim <- range(pretty(points_lambert$lon)) + c(-100, 100)
ylim <- range(pretty(points_lambert$lat)) + c(-100, 100)
n <- c(
diff(xlim),
diff(ylim)
) / params$cellsize + 1
dens <- kde2d(
x = points_lambert$lon,
y = points_lambert$lat,
n = n,
lims = c(xlim, ylim)
)
dx <- diff(dens$x[1:2])
dy <- diff(dens$y[1:2])
sum(dens$z * dx * dy)
dens <- expand.grid(
lon = dens$x,
lat = dens$y
) %>%
mutate(
density = as.vector(dens$z) * length(points_lambert),
id = seq_along(density)
) %>%
merge(
data.frame(
x = dx * (c(0, 0, 1, 1, 0) - 0.5),
y = dy * (c(0, 1, 1, 0, 0) - 0.5)
)
) %>%
mutate(
lon = lon + x,
lat = lat + y
)
```

In order to visualise the result, we have to re-project the coordinates back to WGS84. Then we can display the raster with a web based background image.

```{r ggmap, fig.cap = "Static image of density"}
coordinates(dens) <- ~lon + lat
dens@proj4string <- crs_lambert
dens_wgs <- spTransform(dens, crs_wgs84) %>%
as.data.frame()
ggmap(map) +
geom_polygon(data = dens_wgs, aes(group = id, fill = density), alpha = 0.5) +
scale_fill_gradientn(
"density\n(#/m²)",
colours = rev(rainbow(100, start = 0, end = .7)),
limits = c(0, NA)
)
```

Using `leaflet` to generate a map was a bit more laborious. Using the `data.frame dens_wgs`directly failed. So we converted the `data.frame` in a `SpatialPolygonsDataframe`, which is a combination of a `SpatialPolygons` and a `data.frame`. The `SpatialPolygons` consists of a list of `Polygons`, one for each row of the `data.frame`. A `Polygons` object consist of a list of one or more `Polygon` object. In this example a single polygon which represents the grid cell.

```{r convert-to-Spatial-Polygons}
dens_sp <- lapply(
unique(dens_wgs$id),
function(i){
filter(dens_wgs, id == i) %>%
select(lon, lat) %>%
Polygon() %>%
list() %>%
Polygons(ID = i)
}
) %>%
SpatialPolygons() %>%
SpatialPolygonsDataFrame(
data = dens_wgs %>%
distinct(id, density),
match.ID = FALSE
)
```

`leaflet` requires a predefined function with a colour pallet. We use `leaflet::colorNumeric()` to get a continuous pallet. Setting `stroke = FALSE` removes the borders of the polygon. `fillOpacity` sets the transparency of the polygons.

```{r leaflet, fig.cap = "Dynamic map of density"}
library(leaflet)
pal <- colorNumeric(
palette = rev(rainbow(100, start = 0, end = .7)),
domain = c(0, dens_sp$density)
)
leaflet(dens_sp) %>%
addTiles() %>%
addPolygons(color = ~pal(density), stroke = FALSE, fillOpacity = 0.5) %>%
addLegend(pal = pal, values = ~density)
```

Loading