How to run LASD historical scrape #324
Conversation
```diff
- scraper$validate_extract()
+ # scraper$validate_extract()
```
Commenting out the validate here to avoid dropping columns we want to keep!
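For context, a minimal sketch of one pass of the extract pipeline with the validation step disabled. The four method names come from this diff; the per-step comments are assumptions about what each method does:

```r
# Sketch only: `scraper` is assumed to be the LASD historical scraper
# object instantiated earlier in the script.
scraper$save_raw()            # persist the raw pull to disk (assumed)
scraper$restruct_raw()        # restructure the raw data (assumed)
scraper$extract_from_raw()    # build the cleaned extract (assumed)
# scraper$validate_extract()  # intentionally skipped: validation would
#                             # drop columns outside the standard set
```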
```diff
- if(lubridate::wday(current_date) %in% c(1, 2, 4, 6)){
  cat("On date", as.character(current_date), "\n")
+ # if(lubridate::wday(current_date) %in% c(1, 2, 4, 6)){
```
Removing this to pull ALL days (regardless of day of the week).
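For reference, with lubridate's default week start (Sunday = 1), the removed filter c(1, 2, 4, 6) kept only Sundays, Mondays, Wednesdays, and Fridays. A minimal sketch of the unfiltered loop; the date bounds are placeholders, not values from the original:

```r
# Placeholder window; substitute the real scrape range.
first_date <- as.Date("2020-04-01")
last_date  <- as.Date("2020-04-07")

# Loop over a list so each element keeps its Date class (a bare `for`
# over a Date vector coerces the values to numeric).
for(current_date in as.list(seq(first_date, last_date, by = "day"))){
    cat("On date", as.character(current_date), "\n")
    # run the scraper for current_date here, with no day-of-week filter
}
```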
```diff
  scraper$save_raw()
  scraper$restruct_raw()
  scraper$extract_from_raw()
+ # scraper$validate_extract()
```
Same thing as above – don't want to validate to avoid dropping columns.
```diff
  # 3. EXTRACT TABLES
  # --------------------------------------------------------------------------

+ ex_ <- ExtractTable(x)
```
Note that we use ExtractTable here (which we don't do in the main scraper). I couldn't find a better way to pull the data from the tables with any reliability, unfortunately. Obviously ExtractTable is cheap but not free (and reimbursement is still unclear).
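A minimal sketch of how the call fits in, assuming (per the diff above) that `x` is the prepared page image and that ExtractTable() returns a list with one element per table it detects; the return shape is an assumption:

```r
ex_ <- ExtractTable(x)

# Assumed shape: coerce the first detected table to a data frame for
# the restructuring step.
raw_table <- as.data.frame(ex_[[1]])
head(raw_table)
```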
Opening this up as a draft PR since it should NOT be merged to the main branch!! I just needed a place to add comments/instructions. Normal command-line syntax to run is below:
To run redo-scrape: see the hypothetical sketch below.
To run WBM scrape: see the hypothetical sketch below.
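The commands themselves were not preserved here, so the following is purely a hypothetical illustration of the shape of the invocations; the script names are invented:

```r
# Hypothetical script names, for illustration only.
# From a shell:
#   Rscript historical_lasd_redo.R   # redo-scrape
#   Rscript historical_lasd_wbm.R    # WBM scrape
# Or from an interactive R session:
source("historical_lasd_redo.R")     # redo-scrape (hypothetical path)
source("historical_lasd_wbm.R")      # WBM scrape (hypothetical path)
```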
Steps to take after scraping but before handing off to volunteers to minimize manual cleaning/entry: