Commit

badge generation fix
sigmafelix committed Nov 5, 2024
1 parent 260c7b0 commit c5e2ed3
Showing 9 changed files with 91 additions and 77 deletions.
22 changes: 15 additions & 7 deletions .github/workflows/test-coverage-local.yaml
@@ -77,13 +77,21 @@ jobs:
export COV=${{ steps.get-values.outputs.coverage }}
COLOR=$(node -p '+process.env.COV >= 95 ? `green` : `orange`')
mkdir -p badges
badgen -j coverage -s $COV% -c $COLOR > .github/badges/coverage.svg
badgen -j coverage -s $COV% -c $COLOR > ${{ github.workspace }}/badges/coverage.svg
- name: Deploy Badges
uses: stefanzweifel/git-auto-commit-action@v4
- name: Checkout artifacts branch
uses: actions/checkout@v4
with:
commit_message: "Update badges [skip ci]"
branch: main
file_pattern: .github/badges/coverage.svg

ref: artifacts
persist-credentials: false

- name: Commit and Push Badge to Artifacts Branch
run: |
git config --global user.name "github-actions[bot]"
git config --global user.email "github-actions[bot]@users.noreply.github.com"
mkdir -p actions/workflows/test-coverage-local.yaml
mv badges/coverage.svg actions/workflows/test-coverage-local.yaml/coverage.svg
git add actions/workflows/test-coverage-local.yaml/coverage.svg
git commit -m "Update coverage badge [skip ci]"
git push origin artifacts
if: success()
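
The color selection in the badge step relies on Node being present on the runner. As a minimal sketch only (the helper name `badge_color` is hypothetical and not part of this commit), the same `>= 95` threshold could be expressed in plain shell with `awk`, assuming `awk` is available:

```shell
#!/bin/sh
# Hypothetical helper (not in the workflow): map a coverage percentage
# to a badge color using the same ">= 95 -> green, else orange" rule.
badge_color() {
  # awk compares decimal values such as 94.99, which `[ -ge ]` cannot.
  # Empty or non-numeric input evaluates to 0 and falls back to orange.
  awk -v cov="$1" 'BEGIN { if (cov + 0 >= 95) print "green"; else print "orange" }'
}

# Example calls with a passing and a failing coverage value.
badge_color 95.2
badge_color 80
```

The result could then be passed to `badgen` through `-c`, exactly as the workflow does with `$COLOR`.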
2 changes: 1 addition & 1 deletion README.Rmd
@@ -18,7 +18,7 @@ knitr::opts_chunk$set(

<!-- badges: start -->
<!-- [![cov](https://docs.ropensci.org/chopin/badges/coverage.svg)](https://github.com/ropensci/chopin/actions) -->
![Coverage](https://raw.githubusercontent.com/ropensci/chopin/main/.github/badges/coverage.svg)
![Coverage](https://github.com/ropensci/chopin/actions/workflows/test-coverage-local.yaml/coverage.svg)
[![R-CMD-check](https://github.com/ropensci/chopin/actions/workflows/check-standard.yaml/badge.svg)](https://github.com/ropensci/chopin/actions/workflows/check-standard.yaml)
[![Status at rOpenSci Software Peer Review](https://badges.ropensci.org/638_status.svg)](https://github.com/ropensci/software-review/issues/638)
[![runiverse-package](https://ropensci.r-universe.dev/badges/chopin?scale=1&color=pink&style=round)](https://ropensci.r-universe.dev/chopin)
144 changes: 75 additions & 69 deletions README.md
@@ -2,9 +2,10 @@
# Computation of Spatial Data by Hierarchical and Objective Partitioning of Inputs for Parallel Processing <a href="https://docs.ropensci.org/chopin/"><img src="man/figures/logo.svg" align="right" height="210" alt="overlapping irregular grid polygons filled with orange, green, and teal" /></a>

<!-- badges: start -->

<!-- [![cov](https://docs.ropensci.org/chopin/badges/coverage.svg)](https://github.com/ropensci/chopin/actions) -->

![Coverage](https://raw.githubusercontent.com/ropensci/chopin/main/.github/badges/coverage.svg)
![Coverage](https://github.com/ropensci/chopin/actions/workflows/test-coverage-local.yaml/coverage.svg)
[![R-CMD-check](https://github.com/ropensci/chopin/actions/workflows/check-standard.yaml/badge.svg)](https://github.com/ropensci/chopin/actions/workflows/check-standard.yaml)
[![Status at rOpenSci Software Peer
Review](https://badges.ropensci.org/638_status.svg)](https://github.com/ropensci/software-review/issues/638)
@@ -55,19 +56,21 @@ multiple raster file paths into `par_multirasters()`. **Finally,** users
run `par_*()` function with the configurations set above to compute
spatial variables from input data in parallel:

- `par_grid`: parallelize over artificial grid polygons that are
generated from the maximum extent of inputs. `par_pad_grid` is used to
generate the grid polygons before running this function.
- `par_grid`: parallelize over artificial grid polygons that are
generated from the maximum extent of inputs. `par_pad_grid` is used
to generate the grid polygons before running this function.

- `par_hierarchy`: parallelize over hierarchy coded in identifier
fields (for example, census blocks in each county in the US)

- `par_hierarchy`: parallelize over hierarchy coded in identifier fields
(for example, census blocks in each county in the US)
- `par_multirasters`: parallelize over multiple raster files

- `par_multirasters`: parallelize over multiple raster files
- Each of the `par_*` functions introduced above has a `mirai` version
with the suffix `_mirai` appended to the function name: `par_grid_mirai`,
`par_hierarchy_mirai`, and `par_multirasters_mirai`. These functions will
work properly after creating daemons with `mirai::daemons`.

- Each of the `par_*` functions introduced above has a `mirai` version
with the suffix `_mirai` appended to the function name: `par_grid_mirai`,
`par_hierarchy_mirai`, and `par_multirasters_mirai`. These functions will
work properly after creating daemons with `mirai::daemons`.
<!-- end list -->

``` r
mirai::daemons(4L, dispatcher = "process")
@@ -87,14 +90,14 @@ classes for spatial data. Raster-vector overlay is done with
`exactextractr`. Three helper functions encapsulate multiple geospatial
data calculation steps over multiple CPU threads.

- `extract_at`: extract raster values with point buffers or polygons
with or without kernel weights
- `extract_at`: extract raster values with point buffers or polygons
with or without kernel weights

- `summarize_sedc`: calculate sums of [exponentially decaying
contributions](https://mserre.sph.unc.edu/BMElab_web/SEDCtutorial/index.html)
- `summarize_sedc`: calculate sums of [exponentially decaying
contributions](https://mserre.sph.unc.edu/BMElab_web/SEDCtutorial/index.html)

- `summarize_aw`: area-weighted covariates based on target and reference
polygons
- `summarize_aw`: area-weighted covariates based on target and
reference polygons

### Function selection guide

@@ -105,32 +108,33 @@ users with large vector data.

In **raster-oriented selection**, we suggest four factors to consider:

- Number of raster files: for multiple files, `par_multirasters` is
recommended. When there are multiple rasters that share the same
extent and resolution, consider stacking the rasters into multilayer
SpatRaster object by calling `terra::rast(filenames)`.
- Raster resolution: We suggest 100 meters as a threshold. Rasters with
resolution coarser than 100 meters and a few layers would be better
for the direct call of `exactextractr::exact_extract()`.
- Raster extent: Using `SpatRaster` in `exactextractr::exact_extract()`
is often minimally affected by the raster extent.
- Memory size: `max_cells_in_memory` argument value of
`exactextractr::exact_extract()`, raster resolution, and the number of
layers in `SpatRaster` are multiplicatively related to the memory
usage.
- Number of raster files: for multiple files, `par_multirasters` is
recommended. When there are multiple rasters that share the same
extent and resolution, consider stacking the rasters into multilayer
SpatRaster object by calling `terra::rast(filenames)`.
- Raster resolution: We suggest 100 meters as a threshold. Rasters
with resolution coarser than 100 meters and a few layers would be
better for the direct call of `exactextractr::exact_extract()`.
- Raster extent: Using `SpatRaster` in
`exactextractr::exact_extract()` is often minimally affected by the
raster extent.
- Memory size: `max_cells_in_memory` argument value of
`exactextractr::exact_extract()`, raster resolution, and the number
of layers in `SpatRaster` are multiplicatively related to the memory
usage.

![](man/figures/README-flowchart-raster.png)

For **vector-oriented selection**, we suggest three factors to consider:

- Number of features: When the number of features is over 100,000,
consider using `par_grid` or `par_hierarchy` to split the data into
smaller chunks.
- Hierarchical structure: If the data has a hierarchical structure,
consider using `par_hierarchy` to parallelize the operation.
- Data grouping: If the data needs to be grouped in similar sizes,
consider using `par_pad_balanced` or `par_pad_grid` with
`mode = "grid_quantile"`.
- Number of features: When the number of features is over 100,000,
consider using `par_grid` or `par_hierarchy` to split the data into
smaller chunks.
- Hierarchical structure: If the data has a hierarchical structure,
consider using `par_hierarchy` to parallelize the operation.
- Data grouping: If the data needs to be grouped in similar sizes,
consider using `par_pad_balanced` or `par_pad_grid` with `mode =
"grid_quantile"`.

![](man/figures/README-flowchart-vector.png)

@@ -174,7 +178,7 @@ library(dplyr)
#>
#> intersect, setdiff, setequal, union
library(sf)
#> Linking to GEOS 3.12.2, GDAL 3.9.3, PROJ 9.4.1; sf_use_s2() is TRUE
#> Linking to GEOS 3.12.1, GDAL 3.8.4, PROJ 9.3.1; sf_use_s2() is TRUE
library(terra)
#> terra 1.7.83
library(future)
@@ -261,7 +265,7 @@ system.time(
)
#> Input is a character. Attempt to read it with terra::rast...
#> user system elapsed
#> 5.738 0.026 5.764
#> 5.458 0.088 5.580
```

#### Generate regular grid computational regions
@@ -348,7 +352,7 @@ system.time(
#> Input is a character. Attempt to read it with terra::rast...
#> ℹ Task at CGRIDID: 4 is successfully dispatched.
#> user system elapsed
#> 0.243 0.019 6.303
#> 0.336 0.045 7.788

ncpoints_srtm <-
extract_at(
@@ -406,7 +410,7 @@ system.time(
)
#> ℹ Input is not a character.
#> user system elapsed
#> 0.063 0.001 6.246
#> 0.083 0.000 8.004

# remove mirai::daemons
mirai::daemons(0L)
@@ -435,7 +439,7 @@ path_nchrchy <- file.path(wdir, "nc_hierarchy.gpkg")
nc_data <- path_nchrchy
nc_county <- sf::st_read(nc_data, layer = "county")
#> Reading layer `county' from data source
#> `/tmp/RtmpgQJVzZ/temp_libpath1ac4c210320fb/chopin/extdata/nc_hierarchy.gpkg'
#> `/tmp/RtmpmJUPd2/temp_libpath3b3aa268543/chopin/extdata/nc_hierarchy.gpkg'
#> using driver `GPKG'
#> Simple feature collection with 100 features and 1 field
#> Geometry type: POLYGON
@@ -444,7 +448,7 @@ nc_county <- sf::st_read(nc_data, layer = "county")
#> Projected CRS: NAD83 / Conus Albers
nc_tracts <- sf::st_read(nc_data, layer = "tracts")
#> Reading layer `tracts' from data source
#> `/tmp/RtmpgQJVzZ/temp_libpath1ac4c210320fb/chopin/extdata/nc_hierarchy.gpkg'
#> `/tmp/RtmpmJUPd2/temp_libpath3b3aa268543/chopin/extdata/nc_hierarchy.gpkg'
#> using driver `GPKG'
#> Simple feature collection with 2672 features and 1 field
#> Geometry type: MULTIPOLYGON
@@ -474,7 +478,7 @@ system.time(
)
#> Input is a character. Attempt to read it with terra::rast...
#> user system elapsed
#> 0.712 0.000 0.712
#> 0.621 0.010 0.613

# hierarchical parallelization
system.time(
@@ -592,7 +596,7 @@ system.time(
#> Input is a character. Attempt to read it with terra::rast...ℹ Your input function at 37055 is dispatched.
#> Input is a character. Attempt to read it with terra::rast...ℹ Your input function at 37047 is dispatched.
#> user system elapsed
#> 0.262 0.064 5.886
#> 0.419 0.033 7.799
```

### `par_multirasters()`: parallelize over multiple rasters
@@ -619,9 +623,9 @@ terra::writeRaster(ncelev, file.path(tdir, "test5.tif"), overwrite = TRUE)
# check if the raster files were exported as expected
testfiles <- list.files(tdir, pattern = "*.tif$", full.names = TRUE)
testfiles
#> [1] "/tmp/Rtmp4XLJrC/test1.tif" "/tmp/Rtmp4XLJrC/test2.tif"
#> [3] "/tmp/Rtmp4XLJrC/test3.tif" "/tmp/Rtmp4XLJrC/test4.tif"
#> [5] "/tmp/Rtmp4XLJrC/test5.tif"
#> [1] "/tmp/RtmpiW9fAm/test1.tif" "/tmp/RtmpiW9fAm/test2.tif"
#> [3] "/tmp/RtmpiW9fAm/test3.tif" "/tmp/RtmpiW9fAm/test4.tif"
#> [5] "/tmp/RtmpiW9fAm/test5.tif"
```

``` r
@@ -638,32 +642,32 @@ system.time(
)
#> ℹ Input is not a character.
#> Input is a character. Attempt to read it with terra::rast...
#> ℹ Your input function at /tmp/Rtmp4XLJrC/test1.tif is dispatched.
#> ℹ Your input function at /tmp/RtmpiW9fAm/test1.tif is dispatched.
#>
#> Input is a character. Attempt to read it with terra::rast...
#> ℹ Your input function at /tmp/Rtmp4XLJrC/test2.tif is dispatched.
#> ℹ Your input function at /tmp/RtmpiW9fAm/test2.tif is dispatched.
#>
#> Input is a character. Attempt to read it with terra::rast...
#> ℹ Your input function at /tmp/Rtmp4XLJrC/test3.tif is dispatched.
#> ℹ Your input function at /tmp/RtmpiW9fAm/test3.tif is dispatched.
#>
#> Input is a character. Attempt to read it with terra::rast...
#> ℹ Your input function at /tmp/Rtmp4XLJrC/test4.tif is dispatched.
#> ℹ Your input function at /tmp/RtmpiW9fAm/test4.tif is dispatched.
#>
#> Input is a character. Attempt to read it with terra::rast...
#> ℹ Your input function at /tmp/Rtmp4XLJrC/test5.tif is dispatched.
#> ℹ Your input function at /tmp/RtmpiW9fAm/test5.tif is dispatched.
#> user system elapsed
#> 1.389 0.070 2.219
#> 1.288 0.090 2.901
knitr::kable(head(res))
```

| mean | base_raster |
|----------:|:--------------------------|
| 136.80203 | /tmp/Rtmp4XLJrC/test1.tif |
| 189.76170 | /tmp/Rtmp4XLJrC/test1.tif |
| 231.16968 | /tmp/Rtmp4XLJrC/test1.tif |
| 98.03845 | /tmp/Rtmp4XLJrC/test1.tif |
| 41.23463 | /tmp/Rtmp4XLJrC/test1.tif |
| 270.96933 | /tmp/Rtmp4XLJrC/test1.tif |
| mean | base\_raster |
| --------: | :------------------------ |
| 136.80203 | /tmp/RtmpiW9fAm/test1.tif |
| 189.76170 | /tmp/RtmpiW9fAm/test1.tif |
| 231.16968 | /tmp/RtmpiW9fAm/test1.tif |
| 98.03845 | /tmp/RtmpiW9fAm/test1.tif |
| 41.23463 | /tmp/RtmpiW9fAm/test1.tif |
| 270.96933 | /tmp/RtmpiW9fAm/test1.tif |

``` r

@@ -699,7 +703,7 @@ pnts <- sf::st_as_sf(pnts)
pnts$pid <- sprintf("RPID-%04d", seq(1, 5000))
rd1 <- sf::st_read(path_ncrd1)
#> Reading layer `ncroads_first' from data source
#> `/tmp/RtmpgQJVzZ/temp_libpath1ac4c210320fb/chopin/extdata/ncroads_first.gpkg'
#> `/tmp/RtmpmJUPd2/temp_libpath3b3aa268543/chopin/extdata/ncroads_first.gpkg'
#> using driver `GPKG'
#> Simple feature collection with 620 features and 4 fields
#> Geometry type: MULTILINESTRING
@@ -752,11 +756,11 @@ system.time(
restr <- terra::nearest(x = terra::vect(pntst), y = terra::vect(rd1t))
)
#> user system elapsed
#> 0.461 0.000 0.462
#> 0.396 0.000 0.397

pnt_path <- file.path(tdir, "pntst.gpkg")
sf::st_write(pntst, pnt_path)
#> Writing layer `pntst' to data source `/tmp/Rtmp4XLJrC/pntst.gpkg' using driver `GPKG'
#> Writing layer `pntst' to data source `/tmp/RtmpiW9fAm/pntst.gpkg' using driver `GPKG'
#> Writing 5000 features with 1 fields and geometry type Point.

# we use four threads that were configured above
@@ -802,11 +806,13 @@ system.time(
#> ℹ Input is a character. Trying to read with terra .
#> ℹ Task at CGRIDID: 8 is successfully dispatched.
#> user system elapsed
#> 0.070 0.000 0.392
#> 0.058 0.010 0.401
```

- We will compare the results from the single-thread and multi-thread
calculation.
- We will compare the results from the single-thread and multi-thread
calculation.

<!-- end list -->

``` r
resj <- merge(restr, resd, by = c("from_x", "from_y"))
Binary file modified man/figures/README-compare-compregions-1.png
Binary file modified man/figures/README-load-srtm-1.png
Binary file modified man/figures/README-map-all-1.png
Binary file modified man/figures/README-plot results-1.png
Binary file modified man/figures/README-plot results-2.png
Binary file modified man/figures/README-read-nc-1.png