DPL-471 - Fix historical data for malformed root sample ids to support important samples algorithm (C=?, V=4) #502
Labels
Enhancement
New feature or request
GSU
Delivers work for the GSU unit
Heron
Heron
priority samples
RVI
RVI Project - Sequence old Heron samples for respiratory diseases
User story
There are malformed root_sample_ids found in 6 different database tables across various systems.
These generally take the format ABC123456789_YZ01. The correct root_sample_id is the code before the underscore.
Who are the primary contacts for this story
Jonnie B
Alan K
Acceptance criteria
Progress
Some of the root_sample_id data has been picked/sequenced; some of the IDs have been corrected and duplicated so the correct version of the data is in the database alongside the malformed version. For some IDs (94 of them), both of these things are true. Separate tickets have been made for different groups of data based on these characteristics.
Here is a table which shows how the data is divided up:
The 4,699 rows of data which are not duplicated and were not picked/sequenced can simply be corrected. This is covered in DPL-048-2.
The 18,138 rows of data which were duplicated but never picked/sequenced can potentially be deleted or just marked as withdrawn. This is covered in DPL-048-3.
The 1,128 rows of data which are not duplicated but were picked/sequenced need to be corrected in many places (MLWH, SequenceScape, Event Warehouse, Mongo). This is covered in DPL-048-4.
The 94 rows of data which were duplicated but also picked/sequenced can't be deleted since the data has propagated a long way. It also can't just be updated since the correct data is also there and there would be duplicates. This is a complicated issue and is covered in DPL-048-5.
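As a rough illustration of the split described above, the following Python sketch derives the corrected ID and maps a row to the ticket that covers it. The function names and the two boolean flags are hypothetical and not taken from any of the affected systems. The sketch also assumes that a valid root_sample_id never contains an underscore, so everything from the first underscore onwards can be dropped.

```python
def correct_root_sample_id(root_sample_id: str) -> str:
    """Return the code before the first underscore.

    Assumption: a well-formed root_sample_id contains no underscore,
    so an ID without one is returned unchanged.
    """
    return root_sample_id.partition("_")[0]


def ticket_for(duplicated: bool, picked_or_sequenced: bool) -> str:
    """Map a malformed row to the ticket covering its group.

    Both flags are hypothetical inputs; how they are derived from the
    databases is outside the scope of this sketch.
    """
    if duplicated and picked_or_sequenced:
        return "DPL-048-5"  # cannot be deleted or simply updated
    if duplicated:
        return "DPL-048-3"  # candidate for deletion or marking as withdrawn
    if picked_or_sequenced:
        return "DPL-048-4"  # needs correcting in every downstream system
    return "DPL-048-2"      # can simply be corrected in place


if __name__ == "__main__":
    print(correct_root_sample_id("ABC123456789_YZ01"))
    print(ticket_for(duplicated=False, picked_or_sequenced=False))
```

Keeping the ID correction separate from the grouping means the same helper could be reused by whichever ticket ends up applying the fix.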