Issue 308: Create workflow to add a note to all New Bookshelf Items on a regular basis. #311
Conversation
…n a regular basis.

This creates a Workflow named `New Bookshelf Items Note` that runs on a Cron Job interval at 7:00am each day.

This utilizes approximately day-old data from LDP in order to perform SQL queries that are not allowed by the CQL engine (using a granular filter). These SQL queries allow for a more granular fetch of the desired rows to reduce the amount of unnecessary Items being processed. This stale data must then be re-fetched from the appropriate FOLIO endpoint to get a fresh copy (given that the item data is at least a day old). Then this Item data must be validated a second time.

The filtering is as follows:
- The Item must be in the 'Checked out' status.
- The Item must be under 'Evans nbs'.
- The Item must not already have a 'Check in' note of 'Place on New Bookshelf'.

The original design of this work was to conditionally check whether the filter is still valid after fetching a fresh copy from the FOLIO endpoint. The filter check is then used to conditionally determine whether the PUT update should happen or not. There have been difficulties where this would not build and there is no clear indication of why the build is failing. A decision was made to not lose any more time investigating this, and the conditional logic has been removed. The code in this commit therefore always performs the PUT update, even when it is unnecessary. Ideally, this problem should be solved in some future commit and the conditional logic should be restored. The original problematic code can be found via this commit hash 2754771.

There are additional problems regarding the `logLevel`. Errors are being thrown about `logLevel` not being defined (despite explicitly defining it in the fw configuration settings). The conditional checks on the logging are removed until such time that this problem can be identified and resolved. Once resolved, the `logLevel` checks should be restored.

The logic in this Workflow declares the Item being looped over as a variable named `item` and performs the update through a different variable called `updatedItem`. This might be unnecessary, but it feels safer to not modify the variable used during the looping process of the Workflow subprocess. The `itemId` variable is defined because problems were observed when trying to use `item` both as a JSON payload data object and as a Java Object to perform `item.id`. Using both of these in the request (`225fbb68-b311-4623-b397-856357be1c90` "Request Item") ended up being a problem. The `item` is therefore now used only as JSON data (as `updatedItem`) and the `item.id` is extracted out into the `itemId` process variable (a minimal sketch of this split follows the SQL query below).

Warnings were observed regarding the Response from the `0d236b3c-e571-4b37-8114-7570c9d496a4` "Update Item" FOLIO endpoint PUT request having an empty response. To suppress this warning, the `outputVariable` was removed and replaced with just an empty object `{}`. This had the ironic effect of presenting a new warning about the `outputVariable` being an empty object. Either way, we get a warning, so this is left as-is as an empty object.

The default `historyTimeToLive` of `90` is used because that is common. No decision has been made on what a good or bad history TTL is.

In order to keep the design simple for this first pass, some values that should otherwise not be hard-coded are hard-coded. These are:
- The Effective Location UUID of `07a4e97e-9941-4f60-ad25-577bb6672c08`.
- The message text for a new bookshelf of `Place on New Bookshelf`.

To further keep this first pass simple, there is no limit and therefore no looping performed over the total number of items. This could become considerably large, and in such cases additional `COUNT` queries, with a loop over those results from the LDP, would be needed. Furthermore, this performs the update sub-process loop in a serial manner rather than a parallel manner.

The expanded SQL query used here for selecting from LDP is:

```sql
(
  select id
    from inventory_items
    where effective_location_id = '07a4e97e-9941-4f60-ad25-577bb6672c08'
      and status__name = 'Checked out'
      and json_array_length(data->'circulationNotes') = 0
)
union
(
  select s.id
    from (
      select id, json_array_elements(data->'circulationNotes')->>'note' as note
        from inventory_items
        where effective_location_id = '07a4e97e-9941-4f60-ad25-577bb6672c08'
          and status__name = 'Checked out'
    ) s
    where 'Place on New Bookshelf' <> any (string_to_array(s.note, ';'))
);
```
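The variable split mentioned above can be illustrated with a minimal sketch (hypothetical variable names and placeholder data, not the repository's actual node script): `item` is handled as a parsed object only long enough to read its id, while the PUT payload stays plain JSON text.

```javascript
// Minimal sketch of the item/itemId/updatedItem split (hypothetical, for illustration only).
var itemJson = '{"id":"00000000-0000-0000-0000-000000000000","status":{"name":"Checked out"}}'; // placeholder data
var item = JSON.parse(itemJson);          // parsed once, used only to read fields
var itemId = item.id;                     // extracted into its own process variable for the "Update Item" request URL
var updatedItem = JSON.stringify(item);   // the JSON body sent with the PUT request
```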
Looks good.
The use of `var` in JavaScript isn't recommended though.
See:
- nbs-items-note/nodes/js/buildItems.js
  - Line 1
- nbs-items-note/nodes/js/processItem.js
  - Line 1, 5, 14

The JavaScript engine used here is a limited/out-of-date/modified one.
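For illustration, these are the block-scoped declarations that would normally be preferred; whether they work here depends on the embedded engine, which (as noted above) is limited and may only accept ES5-era syntax.

```javascript
// Preferred in modern JavaScript: block-scoped, single-assignment declarations.
const noteText = 'Place on New Bookshelf'; // cannot be reassigned
let found = false;                          // scoped to the enclosing block

// ES5-era fallback, which may be all an older embedded engine supports.
var legacyFound = false;
```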
…rd cron. The first number is not minute, it is second. This caused the cron job to run on the 7th minute of every hour rather than at 7:00am each day.

Note: The last value apparently must be a `?` and not `*` or mod-camunda will fail on:

```
java.text.ParseException: Support for specifying both a day-of-week AND a day-of-month parameter is not implemented.
```
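For reference, this cron format is Quartz-style: `second minute hour day-of-month month day-of-week`. An expression written as if the first field were the minute, e.g. something like `0 7 * * * *` (intended as 7:00am daily), is instead read as second 0 of minute 7 of every hour, which matches the behavior described above; the exact original expression is an assumption here. Assuming the daily 12:00 UTC start discussed below, the corrected trigger would be along the lines of:

```
0 0 12 * * ?
```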
The previous SQL utilized `any()`. This is apparently a mistake: `any()` matches as soon as any element satisfies the comparison. In the case where there are two notes in the array, one that matches and one that doesn't, the `<> any()` test still succeeds, so the item is selected even though it already has the note. This is not the correct behavior.

The new approach is to use the `jsonb` containment operator. The LDP uses `json` and not `jsonb` like FOLIO, and PostgreSQL does not provide the containment operator for `json`. The solution here is to cast the `circulationNotes` array into a `jsonb` structure. This likely has a performance impact, but it happens after the other filters are applied, which should keep the cost down, and `circulationNotes` will normally be a small array.

Note: The `<@` syntax was attempted first, but the Camunda FreeMarker syntax tries to interpret it. Changing the `<@` into `\l@` then causes a JSON escape problem for the `\`. Changing the `\l@` into `\\l@` should result in `\l`, but that does not get processed correctly by FreeMarker either and it errors on the slash. The solution is to instead swap the left and right hand sides and use `@>`.

This is the SQL query:

```sql
(
  select id
    from inventory_items
    where effective_location_id = '07a4e97e-9941-4f60-ad25-577bb6672c08'
      and status__name = 'Checked out'
      and json_array_length(data->'circulationNotes') = 0
)
union
(
  select s.id
    from (
      select id, to_jsonb(data->'circulationNotes') as notes
        from inventory_items
        where effective_location_id = '07a4e97e-9941-4f60-ad25-577bb6672c08'
          and status__name = 'Checked out'
    ) s
    where not s.notes @> '[{"note":"Place on New Bookshelf"}]'::jsonb
)
```
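A small illustration of the difference (hypothetical data, simplified to just the `note` key) for an item that already carries the bookshelf note alongside an unrelated one:

```sql
-- Old predicate: "<> any" is true because the value differs from at least one element,
-- so the item would wrongly be selected for yet another note.
select 'Place on New Bookshelf' <> any (array['Other note', 'Place on New Bookshelf']);   -- true

-- New predicate: jsonb containment sees the note is already present,
-- so "not ... @>" correctly excludes the item.
select not '[{"note":"Other note"},{"note":"Place on New Bookshelf"}]'::jsonb
           @> '[{"note":"Place on New Bookshelf"}]'::jsonb;                               -- false
```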
…sing multiple insertions of the note. The note is being inserted multiple times because the existence checks are missing the `[i]` index. These incorrect checks always fail, and therefore the note is always added. Add the missing index to the array access so that duplicate notes do not appear.
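A hypothetical sketch of the corrected existence check (the real code lives in nbs-items-note/nodes/js/processItem.js; names and the note shape are simplified here):

```javascript
// Without the [i] index the comparison is made against the whole array, always fails,
// and the note gets appended again on every run.
var item = { circulationNotes: [{ noteType: 'Check in', note: 'Place on New Bookshelf' }] }; // placeholder data
var exists = false;
for (var i = 0; i < item.circulationNotes.length; i++) {
  if (item.circulationNotes[i].noteType === 'Check in' &&
      item.circulationNotes[i].note === 'Place on New Bookshelf') {
    exists = true;
    break;
  }
}
if (!exists) {
  item.circulationNotes.push({ noteType: 'Check in', note: 'Place on New Bookshelf' });
}
```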
The cron start times are in UTC. The goal is approximately 7:00am Central Time, which is 12:00 UTC. At -5 (daylight time), 12:00 UTC is 7:00am Central; at -6 (standard time), it is 6:00am Central. Both cases are safely before the usual 8:00am Central Time start of the day.
resolves #308