Update TADAModule1_AdvancedTraining.Rmd and fix MeasureQualifierCode bug #548

Merged
merged 4 commits into develop from vignette-updates
Dec 2, 2024

Conversation

cristinamullin
Collaborator

No description provided.

@cristinamullin cristinamullin changed the title Update TADAModule1_AdvancedTraining.Rmd Update TADAModule1_AdvancedTraining.Rmd and fix MeasureQualifierCode bug Nov 21, 2024
@cristinamullin
Collaborator Author

cristinamullin commented Nov 22, 2024

I am working on addressing these two unrelated errors:

── Warning ('test-URLChecker.R:4:5'): URLs are not broken ──────────────────────
file("") only supports open = "w+" and open = "w+b": using the former
Backtrace:
  1. ├─... %>% ... at test-URLChecker.R:32:3
  2. ├─base::setdiff(., c("https://www.itecmembers.org/attains/", "https://attains.epa.gov/attains-public/api/assessmentUnits?assessmentUnitIdentifier="))
  3. │ └─base::as.vector(x)
  4. ├─base::unique(.)
  5. ├─EPATADA (local) clean_url(.)
  6. │ ├─... %>% stringr::str_remove_all("[<>]") at test-URLChecker.R:9:5
  7. │ └─stringr::str_remove_all(url, "[\\\\.,\\\")]+$|[{}].*")
  8. │   └─stringr::str_replace_all(string, pattern, "")
  9. │     └─stringr:::check_lengths(string, pattern, replacement)
 10. │       └─vctrs::vec_size_common(...)
 11. ├─stringr::str_remove_all(., "[<>]")
 12. │ └─stringr::str_replace_all(string, pattern, "")
 13. │   └─stringr:::check_lengths(string, pattern, replacement)
 14. │     └─vctrs::vec_size_common(...)
 15. ├─EPATADA (local) extract_urls(.)
 16. │ ├─... %>% unlist() at test-URLChecker.R:4:5
 17. │ └─stringr::str_extract_all(text, "http[s]?://[^\\s\\)\\]]+")
 18. │   └─stringr:::check_lengths(string, pattern)
 19. │     └─vctrs::vec_size_common(...)
 20. ├─base::unlist(.)
 21. ├─base::unlist(.)
 22. └─purrr::map(files, ~readLines(.x))
 23.   └─purrr:::map_("list", .x, .f, ..., .progress = .progress)
 24.     ├─purrr:::with_indexed_errors(...)
 25.     │ └─base::withCallingHandlers(...)
 26.     ├─purrr:::call_with_cleanup(...)
 27.     └─EPATADA (local) .f(.x[[i]], ...)
 28.       └─base::readLines(.x)
 29.         └─base::file(con, "r")
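For reference, the file("") warning is reproducible in plain R whenever an empty path reaches readLines(), because file("") only supports the "w+"/"w+b" modes. A minimal sketch of the suspected trigger (my assumption, not the actual test code):

# Minimal sketch, assuming an empty entry in the vector of file paths triggers the warning.
f <- tempfile(fileext = ".R")
writeLines("# see https://github.com/USEPA/EPATADA", f)
files <- c(f, "")                            # "" stands in for a blank/missing path
lines <- purrr::map(files, ~ readLines(.x))  # warns: file("") only supports open = "w+" and open = "w+b"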

══ Failed tests ════════════════════════════════════════════════════════════════
── Error ('test-ResultFlagsIndependent.R:67:3'): No NA's in independent flag columns ──
<dplyr_error_join_incompatible_type/dplyr_error_join/dplyr_error/rlang_error/error/condition>
Error in `dplyr::full_join(., noncont.data, by = c(names(cont.data)))`: Can't join `x$TADA.ContinuousData.Flag` with `y$TADA.ContinuousData.Flag` due to incompatible types.
i `x$TADA.ContinuousData.Flag` is a <character>.
i `y$TADA.ContinuousData.Flag` is a <logical>.
Backtrace:
  1. ├─EPATADA::TADA_FlagContinuousData(testdat, clean = FALSE, flaggedonly = FALSE) at test-ResultFlagsIndependent.R:67:3
  2. │ └─cont.data %>% ... at EPATADA/R/ResultFlagsIndependent.R:305:3
  3. ├─dplyr::full_join(., noncont.data, by = c(names(cont.data)))
  4. ├─dplyr:::full_join.data.frame(., noncont.data, by = c(names(cont.data)))
  5. │ └─dplyr:::join_mutate(...)
  6. │   └─dplyr:::join_cast_common(x_key, y_key, vars, error_call = error_call)
  7. │     ├─rlang::try_fetch(...)
  8. │     │ └─base::withCallingHandlers(...)
  9. │     └─vctrs::vec_ptype2(x, y, x_arg = "", y_arg = "", call = error_call)
 10. ├─vctrs (local) `<fn>`()
 11. │ └─vctrs::vec_default_ptype2(...)
 12. │   ├─base::withRestarts(...)
 13. │   │ └─base (local) withOneRestart(expr, restarts[[1L]])
 14. │   │   └─base (local) doWithOneRestart(return(expr), restart)
 15. │   └─vctrs::stop_incompatible_type(...)
 16. │     └─vctrs:::stop_incompatible(...)
 17. │       └─vctrs:::stop_vctrs(...)
 18. │         └─rlang::abort(message, class = c(class, "vctrs_error"), ..., call = call)
 19. │           └─rlang:::signal_abort(cnd, .file)
 20. │             └─base::signalCondition(cnd)
 21. └─rlang (local) `<fn>`(`<vctrs__2>`)
 22.   └─handlers[[1L]](cnd)
 23.     └─dplyr:::rethrow_error_join_incompatible_type(cnd, vars, error_call)
 24.       └─dplyr:::stop_join(...)
 25.         └─dplyr:::stop_dplyr(...)
 26.           └─rlang::abort(...)

@cristinamullin
Collaborator Author

Note: the continuous data bug is fixed in this PR. This issue occurred when all of the data in a dataset were flagged as continuous.
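For anyone hitting the same join error, here is a simplified sketch of the failure and one possible fix (coercing the flag column before joining; the actual change in this PR may differ):

library(dplyr)

# Sketch only: when every result is flagged continuous, the non-continuous subset
# has zero rows and its all-NA TADA.ContinuousData.Flag column is logical, which
# full_join() cannot cast against the character column in cont.data.
cont.data    <- data.frame(ResultIdentifier = "R1", TADA.ContinuousData.Flag = "Continuous")
noncont.data <- data.frame(ResultIdentifier = character(0), TADA.ContinuousData.Flag = logical(0))

# dplyr::full_join(cont.data, noncont.data, by = names(cont.data))  # errors: incompatible types

# One possible fix: coerce the flag column to character before joining.
noncont.data$TADA.ContinuousData.Flag <- as.character(noncont.data$TADA.ContinuousData.Flag)
fixed <- dplyr::full_join(cont.data, noncont.data, by = names(cont.data))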

# ignore warning
# file("") only supports open = "w+" and open = "w+b": using the former
# https://github.com/USEPA/EPATADA/pull/548
suppressWarnings(
Collaborator Author

@hillarymarler do you think it is ok to suppress this warning?

(same backtrace as above, from test-URLChecker.R)
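For illustration, roughly what the wrapper does in isolation; the file list below is a placeholder, not the exact pipeline from test-URLChecker.R:

# Placeholder input: a few package files (any readable paths would do).
files <- list.files("R", pattern = "\\.R$", full.names = TRUE)

# Option in the snippet above: silence the warning around the read step.
all_lines <- suppressWarnings(unlist(purrr::map(files, ~ readLines(.x))))

# A possible alternative (assumption, not part of this PR): drop empty or
# nonexistent paths first so file("") is never opened.
files <- files[nzchar(files) & file.exists(files)]
all_lines <- unlist(purrr::map(files, ~ readLines(.x)))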

@JamesBisese
Collaborator

JamesBisese commented Nov 27, 2024

There is a bug where the .data$MeasureQualifierCode 'NA' values are created as 'logical' rather than 'character'. I am struggling with that in the Upload dataset and Upload Progress File functionality. Data imported into TADAShiny via the Run Query path works, but from either Upload path it crashes around the handling of 'MeasureQualifierCode'.

I think it is caused by the values being 'logical' when using the Upload paths, but 'character' when using the Run Query path. The strsplit() function then crashes because there is no way to split a logical.

Here is my console output showing the difference:

> TADAShiny::run_app()

Listening on http://127.0.0.1:4811
[1] "Loading user excel data file: C:\\Users\\JAMES~1.BIS\\AppData\\Local\\Temp\\RtmpekUeWE/05a406c9a22e3b45742b867f/0.xlsx"
Called from: EPATADA::TADA_AutoClean(tadat$uploaded_excel_df)
Browse[1]> typeof(.data$MeasureQualifierCode)
[1] "logical"
Browse[1]> Q
> TADAShiny::run_app()

Listening on http://127.0.0.1:4811
[1] "Downloading WQP query results. This may take some time depending upon the query size."
$statecode
[1] "US:01"

$startDate
[1] "2024-09-26"

$sampleMedia
[1] "Water" "water"

$providers
[1] "STORET"

$endDate
[1] "2024-10-26"

GET: https://www.waterqualitydata.us/data/Result/search?statecode=US%3A01&startDateLo=09-26-2024&sampleMedia=Water%3Bwater&providers=STORET&startDateHi=10-26-2024&dataProfile=resultPhysChem&mimeType=csv
NEWS: Data does not include USGS data newer than March 11, 2024. More details:                                                                                              
https://doi-usgs.github.io/dataRetrieval/articles/Status.html
GET: https://www.waterqualitydata.us/data/Station/search?statecode=US%3A01&startDateLo=09-26-2024&sampleMedia=Water%3Bwater&providers=STORET&startDateHi=10-26-2024&mimeType=csv
GET: https://www.waterqualitydata.us/data/Project/search?statecode=US%3A01&startDateLo=09-26-2024&sampleMedia=Water%3Bwater&providers=STORET&startDateHi=10-26-2024&mimeType=csv
NEWS: Data does not include USGS data newer than March 11, 2024. More details:                                                                                              
https://doi-usgs.github.io/dataRetrieval/articles/Status.html
[1] "Data successfully downloaded. Running TADA_AutoClean function."
Called from: TADA_AutoClean(TADAprofile)
Browse[1]> typeof(.data$MeasureQualifierCode)
[1] "character"
Browse[1]> 
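A minimal sketch of the crash in isolation and the obvious coercion workaround (the example values are made up):

# An all-NA column imported from Excel comes in as logical, and strsplit()
# rejects non-character input.
qualifier_from_upload <- c(NA, NA)            # typeof() is "logical"
# strsplit(qualifier_from_upload, ",")        # Error: non-character argument

# Coercing to character first avoids the crash:
strsplit(as.character(qualifier_from_upload), ",")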

@cristinamullin cristinamullin merged commit 1ee2452 into develop Dec 2, 2024
7 checks passed
@cristinamullin cristinamullin deleted the vignette-updates branch December 2, 2024 17:43
@cristinamullin
Collaborator Author

> There is a bug where the .data$MeasureQualifierCode 'NA' values are created as 'logical' rather than 'character'. [...]

Thanks Jimmy. That sounds like the same bug that I just addressed in this update. I also just made a few other updates to TADAShiny (USEPA/TADAShiny@0ce05b4). Let me know if you still run into this issue after you update the TADA package and TADA Shiny app.

@cristinamullin cristinamullin restored the vignette-updates branch December 2, 2024 19:04
@cristinamullin cristinamullin deleted the vignette-updates branch December 2, 2024 19:04