Merge pull request nationalparkservice#118 from RobLBaker/master
add function document_missing_values
RobLBaker authored Jul 16, 2024
2 parents 268c7aa + 69bc74f commit 39ffe96
Showing 29 changed files with 725 additions and 20 deletions.
4 changes: 2 additions & 2 deletions DESCRIPTION
@@ -49,7 +49,6 @@ Imports:
     stringr,
     base,
     readr,
-    lifecycle,
     huxtable,
     crayon,
     data.table,
@@ -61,7 +60,8 @@ Imports:
     sp,
     withr,
     cli,
-    purrr
+    purrr,
+    lifecycle
 RoxygenNote: 7.3.1
 Suggests:
     knitr,
1 change: 1 addition & 0 deletions NAMESPACE
@@ -7,6 +7,7 @@ export(convert_datetime_format)
export(convert_long_to_utm)
export(convert_utm_to_ll)
export(create_datastore_script)
export(document_missing_values)
export(fix_utc_offset)
export(fuzz_location)
export(generate_ll_from_utm)
3 changes: 3 additions & 0 deletions NEWS.md
@@ -1,5 +1,8 @@
# QCkit v0.1.8 (not yet released)

2024-07-16
* Added experimental function `document_missing_values()`, which searches a file for multiple missing value codes, replaces them all with NA, and generates a new column containing the original missing value codes so that they can be properly documented in EML. This is a workaround: there is currently no good way to document multiple missing value codes in a single column via EMLassemblyline. This function is still under development; expect substantial changes and improvements, up to and including removal of the function.
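
A minimal in-memory sketch of the behavior described in this entry, written in base R against a hypothetical data frame rather than a file. The column name `depth` and the codes `"missing"` and `"no data"` are illustrative assumptions, and this is not the package's implementation:

```r
# Hypothetical example data: one column with two custom missing value
# codes and one true NA.
df <- data.frame(
  depth = c("1.2", "missing", NA, "no data", "3.4"),
  stringsAsFactors = FALSE
)

# Codes to treat as missing; NA is included so blanks are captured too.
codes <- c("missing", "no data", NA)

# TRUE wherever the cell holds one of the codes (%in% matches NA to NA).
is_missing <- df$depth %in% codes

# New column keeps the original code, or "not_missing" for real values.
df$custom_depth_MissingValues <- ifelse(is_missing, df$depth, "not_missing")

# The original column is reduced to real values and NA.
df$depth[is_missing] <- NA
```

After this runs, `depth` holds only values and NA, while `custom_depth_MissingValues` retains each original code so the reasons for missingness can still be documented in EML.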

2024-07-09
* Added function `get_user_email()`, which accesses the NPS Active Directory via a PowerShell function to return the user's email address. It probably won't work for non-NPS users or for non-Windows users.
* Updated REST API from legacy v6 to current v7.
83 changes: 83 additions & 0 deletions R/replace_blanks.R
@@ -93,3 +93,86 @@ replace_blanks <- function(directory = here::here(), missing_val_code = NA) {
}
return(invisible())
}


#' Handles multiple missing values
#'
#' @description
#' `r lifecycle::badge("experimental")`
#' `r lifecycle::badge("questioning")`
#' Given a file name (.csv only) and path, the function searches the
#' columns for any that contain multiple user-specified missing value codes.
#' For any column with multiple missing value codes, all the missing values
#' (including blanks) are replaced with `replace_value` (NA by default). A new
#' column, named `custom_<column name>_MissingValues`, is generated and
#' populated with the missing value code from the original column.
#' Values that were not missing are populated with "not_missing". The
#' newly generated column of categorical variables can be used to describe
#' the various reasons why data are absent in the original column.
#'
#' The function then writes the new dataframe to a file, overwriting the
#' original file. If it is important to keep the original file, make
#' a copy prior to running the function.
#'
#' WARNING: this function will replace any blank cells in your data with NA!
#'
#' @details Blank cells will be treated as NA.
#'
#' @param file_name String. The name of the file to inspect.
#' @param directory String. Location of file to read/write. Defaults to the project root returned by `here::here()`.
#' @param colname `r lifecycle::badge("experimental")` String. The columns to inspect. CURRENTLY ONLY WORKS WHEN LEFT AT THE DEFAULT, `NA`.
#' @param missing_val_codes List. A list of strings containing the missing value code or codes to search for.
#' @param replace_value String. The single value used to replace all missing value codes. Defaults to NA.
#'
#' @return Writes the modified dataframe back to the file. Returns invisibly.
#' @export
#'
#' @examples
#' \dontrun{
#' document_missing_values(file_name = "mydata.csv",
#'                         directory = here::here(),
#'                         colname = NA, #do not change during function development
#'                         missing_val_codes = c("missing", "blank", "no data"),
#'                         replace_value = NA)
#' }
document_missing_values <- function(file_name,
                                    directory = here::here(),
                                    colname = NA,
                                    missing_val_codes = NA,
                                    replace_value = NA) {

  file_path <- file.path(directory, file_name)
  #read in a dataframe:
  df <- readr::read_csv(file_path, show_col_types = FALSE)
  #generate list of missing values (NA is always included):
  missing_val_codes <- unique(append(missing_val_codes, NA))

  #capture the original column names so newly added columns are not inspected:
  data_names <- colnames(df)

  if (is.na(colname)) {
    for (col in data_names) {
      is_missing <- df[[col]] %in% missing_val_codes
      #if there are missing value codes other than NA in a column:
      if (sum(is_missing) > sum(is.na(df[[col]]))) {
        #generate new column of data; it is named directly so that an
        #existing column called "x" is never overwritten or renamed:
        new_name <- paste0("custom_", col, "_MissingValues")
        df[[new_name]] <- ifelse(is_missing, df[[col]], "not_missing")
        #replace old missing values with replacement value:
        df[[col]] <- ifelse(is_missing, replace_value, df[[col]])
      }
    }
  }
  #write the file back out:
  readr::write_csv(df, file_path)

  #invisible() must be called; returning the bare function prints it:
  return(invisible())

}
2 changes: 1 addition & 1 deletion docs/index.html


1 change: 1 addition & 0 deletions docs/news/index.html


2 changes: 1 addition & 1 deletion docs/pkgdown.yml
@@ -5,5 +5,5 @@ articles:
   DRR_Purpose_and_Scope: DRR_Purpose_and_Scope.html
   Starting-a-DRR: Starting-a-DRR.html
   Using-the-DRR-Template: Using-the-DRR-Template.html
-last_built: 2024-07-09T14:49Z
+last_built: 2024-07-16T15:01Z

174 changes: 174 additions & 0 deletions docs/reference/document_missing_values.html
