4911 Introduced fix_rd_broken_links command #5095
Open
+403
−0
This PR introduces the Django command outlined in #4911 to fix RDs affected by the broken links issue.
The command runs in a two-step process:

1. Find the dockets that may be affected:
   - Select the dockets with `date_modified` equal to or greater than `--start-date`, which should be the date we started our last full re-index. This ensures we fix all broken links that may have been scrambled.
   - The filter runs against the `Docket` table, where `date_modified` has an index, whereas the `DocketEvent` table does not. As a result, this query returns the docket IDs that changed after this date.
2. Check whether the slug of each of those dockets has changed:
   - Run a `DocketEvent` query to retrieve the counts of PGH events for each of these docket IDs. For this, the column `pgh_obj_id` is used instead of `id` for performance reasons, as the former is an indexed column.
   - Count the `DocketEvent` entries for the `docket_id`, using the current docket `slug` as a filter. The idea behind this is that if the total number of events for a docket is equal to the total number of events filtered by the current docket `slug`, then all events that belong to the docket have the same `slug`, meaning that the `slug` hasn't changed.
   - If the `slug` has changed, the `docket_id` is scheduled for re-indexing. This includes re-indexing all associated `RECAPDocuments` in bulk so they get fixed.
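
The slug comparison described above can be sketched in plain Python. This is only an illustration of the counting idea, not the command's actual implementation: the real command runs these counts as `DocketEvent` queries, and the function name `dockets_with_changed_slug` and the sample slugs are hypothetical.

```python
from collections import Counter
from typing import Iterable, Mapping


def dockets_with_changed_slug(
    current_slugs: Mapping[int, str],
    events: Iterable[tuple[int, str]],
) -> list[int]:
    """Return the docket IDs whose event history contains another slug.

    current_slugs maps docket_id -> the docket's current slug.
    events yields (pgh_obj_id, slug) pairs, one per history event.
    (Hypothetical in-memory stand-in for the DocketEvent table.)
    """
    total = Counter()     # all events per docket
    matching = Counter()  # events whose slug equals the current slug
    for docket_id, slug in events:
        if docket_id not in current_slugs:
            continue  # event belongs to a docket outside this batch
        total[docket_id] += 1
        if slug == current_slugs[docket_id]:
            matching[docket_id] += 1
    # The two counts differ only if at least one event had a different slug.
    return sorted(d for d in current_slugs if total[d] != matching[d])


# Made-up sample data: docket 2 was renamed at some point, docket 1 was not.
current = {1: "smith-v-jones", 2: "doe-v-roe"}
history = [
    (1, "smith-v-jones"),
    (1, "smith-v-jones"),
    (2, "unknown-case-title"),
    (2, "doe-v-roe"),
]
print(dockets_with_changed_slug(current, history))  # prints [2]
```

Only docket 2 would be scheduled for re-indexing here, because one of its two events carries a slug that differs from the current one.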
The command can be executed as follows:
manage.py fix_rd_broken_links --start-date 2024-03-25 --queue celery --chunk-size 50
March 25, 2024 is the approximate date when the last full re-index for RECAP was started, according to Slack records.
This command should be run after #5086 is merged.