Refactor: `kestrel_pre_result.tsv` to Match `kestrel_result.tsv` Columns #72

Currently, `kestrel_pre_result.tsv` lacks several important columns that appear in `kestrel_result.tsv`, such as `Depth_Score`, `Confidence`, `Motif_fasta`, `Frame_Score`, `Motif_left`, `Motif_right`, and `POS_fasta`. This discrepancy makes the output less informative and more difficult to merge or compare with final Kestrel results.

This issue proposes refactoring how `kestrel_pre_result.tsv` is generated so it includes the same columns as `kestrel_result.tsv`, or at least a superset that covers all relevant fields (e.g., `Depth_Score`, `Confidence`, `Motifs`). Doing so will streamline downstream analysis (such as cohort mode or rare allele filtering).

Motivation
- Consistency: Having the same columns in `kestrel_pre_result.tsv` and `kestrel_result.tsv` allows for direct comparison and merging without additional transformations or complicated parsing.
- Completeness of Information: `Depth_Score` and `Confidence` offer critical insights into variant quality and precision, which are missing in the current “pre” result.
- Downstream Integrations: Steps such as cohort mode or rare allele filtering can work from the pre-result directly.

Proposed Changes
- Add Extra Columns to `kestrel_pre_result.tsv`:
  - `Depth_Score`
  - `Confidence`
  - `Motif_fasta` (if relevant in the pre-processing step)
  - `Frame_Score`
  - `Motif_left`
  - `Motif_right`
  - `POS_fasta`
- Maintain Existing Columns: Keep the current columns (`Motifs`, `POS`, `REF`, `ALT`, `Sample`, `Variant`, and the depth columns) to avoid breaking existing logic.
- Adjust the Generation Logic: Update the step where `kestrel_pre_result.tsv` is produced (e.g., in the Kestrel postprocessing or intermediate pipeline steps) so that it includes these columns if the data is available at that stage.
- Check for Data Availability:
  - Some fields (e.g., `Frame_Score`) might be computed only in later steps. If feasible, we should compute them or carry them forward earlier.
  - If other fields (e.g., `Motif_left`, `Motif_right`) are not yet computed in the “pre” stage, consider either computing them or marking them with placeholders.
- Documentation & Testing: Update the documentation and tests to cover the new columns in `kestrel_pre_result.tsv`.

Implementation Outline
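
As a starting point, the column alignment proposed above could be sketched as follows. This is a minimal illustration using only Python's standard `csv` module, not the pipeline's actual code: the function name `align_pre_result`, the `NA` placeholder, and the sample row are assumptions made for the example.

```python
import csv
import io

# Columns the issue lists as missing from the pre-result.
EXTRA_COLUMNS = [
    "Depth_Score", "Confidence", "Motif_fasta", "Frame_Score",
    "Motif_left", "Motif_right", "POS_fasta",
]
PLACEHOLDER = "NA"  # assumption: the issue does not specify a placeholder value


def align_pre_result(pre_tsv_text, extra_columns=EXTRA_COLUMNS,
                     placeholder=PLACEHOLDER):
    """Return TSV text with the extra columns appended after the existing ones."""
    reader = csv.DictReader(io.StringIO(pre_tsv_text), delimiter="\t")
    existing = list(reader.fieldnames or [])
    # Existing columns keep their order; only absent columns are appended.
    header = existing + [c for c in extra_columns if c not in existing]
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=header, delimiter="\t",
                            restval=placeholder, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        # restval fills every column the row has no value for.
        writer.writerow(row)
    return out.getvalue()


if __name__ == "__main__":
    # Hypothetical pre-result content, for illustration only.
    sample = "Motifs\tPOS\tREF\tALT\tSample\tVariant\n1-2\t67\tG\tGG\ts1\tInsertion\n"
    print(align_pre_result(sample), end="")
```

Appending columns rather than reordering them keeps the existing columns in their current positions, in line with the "Maintain Existing Columns" point. Where a value can actually be computed at the "pre" stage, it would replace the placeholder.
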
1. Locate the step that writes `kestrel_pre_result.tsv`.
2. Carry over `Depth_Score`, `Confidence`, `Frame_Score`, etc. from the final Kestrel result stage if feasible.
3. Ensure `Motif_left`/`Motif_right` are computed if relevant in the “pre” stage.
4. Compare the `kestrel_pre_result.tsv` columns with those in `kestrel_result.tsv` to confirm consistency.

Benefits
- Code that reads `kestrel_pre_result.tsv` can handle the same columns used in the final results, improving workflow consistency.
- Having `Depth_Score` or `Confidence` in pre-results helps debug or track variant filtering earlier in the pipeline.

Conclusion
By aligning `kestrel_pre_result.tsv` with `kestrel_result.tsv`, we ensure a consistent, flexible pipeline output that’s easier to integrate, debug, and extend for advanced analyses.
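
The column comparison from the implementation outline could be automated with a small check. The sketch below is only an illustration: the helper names `read_header` and `missing_columns` are hypothetical, and it reads TSV text rather than assuming any particular file location. It checks that every final-result column also appears in the pre-result, so a pre-result that is a superset still passes.

```python
import csv
import io


def read_header(tsv_text):
    """Return the column names from the first line of TSV text."""
    return next(csv.reader(io.StringIO(tsv_text), delimiter="\t"), [])


def missing_columns(pre_tsv_text, final_tsv_text):
    """List final-result columns absent from the pre-result, in final-result order."""
    pre = set(read_header(pre_tsv_text))
    return [c for c in read_header(final_tsv_text) if c not in pre]
```

In a test, asserting that `missing_columns(pre_text, final_text)` returns an empty list would flag any column that is added to the final result but not to the pre-result.
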