A general-purpose association package for two (large) sorted genome location files.
- flexible functions (a special class of map-reduce framework)
- Python paradigms and design patterns applied
- basic usage (without installing):
python3 ga/association.py <test/1.vcf> <test/2.vcf> [associated_output] [left_outer_output] [right_outer_output]
See ga/association.py for details.
- install the package:
python3 setup.py install [--prefix PATH dir]
then call the installed command:
ga <test/1.vcf> <test/2.vcf>
[associated_output] (default: association.out) keeps all associated (overlapping) records.
[left_outer_output] keeps the left records that have no association.
[right_outer_output] keeps the right records that have no association.
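
The three outputs behave like an overlap join plus its two outer remainders. The sketch below illustrates that idea on in-memory tuples; it is not the package's implementation. The function name associate, the (chrom, start, end) record layout, and the closed-interval overlap test are all assumptions made for illustration, and both inputs are assumed to be sorted by (chrom, start) under the same chromosome order.

```python
def associate(left, right):
    """Illustrative overlap join over two lists of (chrom, start, end) tuples.

    Assumes both lists are sorted by (chrom, start) and intervals are closed.
    Returns (associated_pairs, left_outer, right_outer).
    """
    associated = []
    left_hit = [False] * len(left)
    right_hit = [False] * len(right)
    first = 0  # index of the first right record that can still overlap
    for i, (l_chrom, l_start, l_end) in enumerate(left):
        # Right records that end before this left record starts can never
        # overlap it or any later left record, so skip them for good.
        while first < len(right) and (right[first][0], right[first][2]) < (l_chrom, l_start):
            first += 1
        j = first
        # Scan right records that start no later than this left record ends.
        while j < len(right) and (right[j][0], right[j][1]) <= (l_chrom, l_end):
            r_chrom, r_start, r_end = right[j]
            if r_chrom == l_chrom and r_end >= l_start:
                associated.append((left[i], right[j]))
                left_hit[i] = True
                right_hit[j] = True
            j += 1
    left_outer = [rec for rec, hit in zip(left, left_hit) if not hit]
    right_outer = [rec for rec, hit in zip(right, right_hit) if not hit]
    return associated, left_outer, right_outer


left = [("chr1", 100, 200), ("chr1", 500, 600), ("chr2", 50, 60)]
right = [("chr1", 150, 160), ("chr1", 300, 400), ("chr2", 55, 70), ("chr2", 500, 510)]
associated, left_outer, right_outer = associate(left, right)
```

Here associated plays the role of [associated_output], while left_outer and right_outer play the roles of [left_outer_output] and [right_outer_output].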
In almost all cases, you only need to add a record-keep rule callable (e.g. new_add_compare) that decides what to do with the two associated records being compared in your application.
Then pass new_add_compare as the record_keep_rule parameter of the iterative_overlap_block function.
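
As a rough illustration of what such a rule could look like, the sketch below defines a new_add_compare that keeps a pair only when the REF and ALT alleles agree. The two-argument, boolean-returning signature, the dict-based record layout, and the apply_keep_rule helper are assumptions made for this example; apply_keep_rule is a stand-in, not iterative_overlap_block itself. Check ga/association.py for the contract the package actually expects.

```python
def new_add_compare(left_record, right_record):
    """Hypothetical keep rule: keep the pair only if REF and ALT alleles match.

    The signature and the dict keys are assumptions for illustration.
    """
    return (
        left_record["ref"] == right_record["ref"]
        and left_record["alt"] == right_record["alt"]
    )


def apply_keep_rule(candidate_pairs, record_keep_rule):
    """Stand-in for the filtering step (NOT the package's own function).

    Keeps each (left, right) pair for which record_keep_rule returns True.
    """
    return [(l, r) for l, r in candidate_pairs if record_keep_rule(l, r)]


# Two candidate pairs at overlapping positions; only the first shares alleles.
pairs = [
    ({"chrom": "chr1", "pos": 100, "ref": "A", "alt": "G"},
     {"chrom": "chr1", "pos": 100, "ref": "A", "alt": "G"}),
    ({"chrom": "chr1", "pos": 250, "ref": "C", "alt": "T"},
     {"chrom": "chr1", "pos": 250, "ref": "C", "alt": "A"}),
]
kept = apply_keep_rule(pairs, record_keep_rule=new_add_compare)
```

Keeping the rule as a plain callable means the overlap logic can stay untouched while each application swaps in its own comparison.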
For deeper customization, rewrite record_keep_rule and paired_unit_rules together so that they stay consistent with each other.