Linking LTRs and internal regions for ERV: Further postprocessing necessary? #288

Open
osthomas opened this issue Oct 1, 2024 · 0 comments

Dear all,

I am looking into ERVs in the mouse genome (GRCm39), and I am a bit confused about which postprocessing steps are needed.

There is a script available to combine ERV LTRs with internal regions, based on the names of the elements (https://mobilednajournal.biomedcentral.com/articles/10.1186/1759-8753-5-13). However, I am not sure whether this is still required, or whether ProcessRepeats already does this.

Here is one example from the .out file in which LTRs and internal regions were linked already via the ID column:

  532    6.9  8.9  4.2  1             3122749   3123277 (192031002) + ERVB4_2-LTR_MM   LTR/ERVK                     1    553     (0)      98      
  689    5.8  9.9  0.8  1             3123278   3124349 (192029930) + ERVB4_2-I_MM     LTR/ERVK                     1   1972  (6402)      98      
  162    3.5  0.0  0.0  1             3124347   3124487 (192029792) + ERVB4_2-I_MM     LTR/ERVK                  5825   5965  (2409)      98 *    
 2116    6.8  5.8  0.8  1             3124478   3126963 (192027316) + RLTR45-int       LTR/ERVK                  1318   3204  (4040)      99      
  626    2.6  2.9  2.2  1             3126958   3127550 (192026729) + RLTR45-int       LTR/ERVK                  3390   3986  (3258)      99 *    
 2455    3.3  0.6  0.2  1             3127544   3129715 (192024564) + RLTR45-int       LTR/ERVK                  5065   7244     (0)      99      
  544    6.3  8.7  4.5  1             3129716   3130247 (192024032) + ERVB4_2-LTR_MM   LTR/ERVK                     1    553     (0)      99      

In this particular case, joining by name would not even catch it: ID 99 links the RLTR45-int fragments to an ERVB4_2-LTR_MM LTR, and those names do not match.
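To make the ID-based linking in the example concrete, here is a minimal Python sketch that groups .out rows by their ID column and lists which element names end up together. The helper names (`parse_out_line`, `group_by_id`) are hypothetical and not part of RepeatMasker. The sketch assumes the usual whitespace-delimited .out layout with the ID in the 15th column, and it keys groups by (sequence, ID) as a precaution.

```python
from collections import defaultdict

# Rows copied from the example above (RepeatMasker .out, whitespace-delimited).
OUT_ROWS = """\
  532    6.9  8.9  4.2  1   3122749   3123277 (192031002) + ERVB4_2-LTR_MM   LTR/ERVK      1    553     (0)      98
  689    5.8  9.9  0.8  1   3123278   3124349 (192029930) + ERVB4_2-I_MM     LTR/ERVK      1   1972  (6402)      98
  162    3.5  0.0  0.0  1   3124347   3124487 (192029792) + ERVB4_2-I_MM     LTR/ERVK   5825   5965  (2409)      98 *
 2116    6.8  5.8  0.8  1   3124478   3126963 (192027316) + RLTR45-int       LTR/ERVK   1318   3204  (4040)      99
  626    2.6  2.9  2.2  1   3126958   3127550 (192026729) + RLTR45-int       LTR/ERVK   3390   3986  (3258)      99 *
 2455    3.3  0.6  0.2  1   3127544   3129715 (192024564) + RLTR45-int       LTR/ERVK   5065   7244     (0)      99
  544    6.3  8.7  4.5  1   3129716   3130247 (192024032) + ERVB4_2-LTR_MM   LTR/ERVK      1    553     (0)      99
"""


def parse_out_line(line):
    """Parse one data row of a .out file; return None for headers and blanks."""
    fields = line.split()
    # Data rows have at least 15 fields and start with an integer score.
    if len(fields) < 15 or not fields[0].isdigit():
        return None
    return {
        "chrom": fields[4],
        "start": int(fields[5]),
        "end": int(fields[6]),
        "strand": fields[8],
        "name": fields[9],
        "class_family": fields[10],
        "id": fields[14],
    }


def group_by_id(lines):
    """Collect parsed rows under their (sequence, ID) key."""
    groups = defaultdict(list)
    for line in lines:
        rec = parse_out_line(line)
        if rec is not None:
            groups[(rec["chrom"], rec["id"])].append(rec)
    return dict(groups)


groups = group_by_id(OUT_ROWS.splitlines())
for (chrom, rep_id), recs in groups.items():
    names = sorted({r["name"] for r in recs})
    span = (min(r["start"] for r in recs), max(r["end"] for r in recs))
    print(chrom, rep_id, span, names)
```

For these rows, ID 99 groups RLTR45-int with ERVB4_2-LTR_MM, which a purely name-based join would not pair.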

ProcessRepeats seems to link LTRs and internal regions to some extent. Are there cases that it might miss and that would benefit from further parsing?
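One way to look for such cases is to list neighbouring LTR-class annotations that sit on the same sequence and strand but carry different IDs. The Python sketch below does only that. The helper `adjacent_unlinked` and the `max_gap` threshold of 100 bp are assumptions made for illustration, not anything ProcessRepeats defines, and whether a flagged pair really belongs to one element remains a judgment call.

```python
# Three rows from the example above, around the boundary between ID 98 and ID 99.
ROWS = """\
  162    3.5  0.0  0.0  1   3124347   3124487 (192029792) + ERVB4_2-I_MM     LTR/ERVK   5825   5965  (2409)      98 *
 2116    6.8  5.8  0.8  1   3124478   3126963 (192027316) + RLTR45-int       LTR/ERVK   1318   3204  (4040)      99
  626    2.6  2.9  2.2  1   3126958   3127550 (192026729) + RLTR45-int       LTR/ERVK   3390   3986  (3258)      99 *
"""


def parse_out_line(line):
    """Parse one data row of a .out file; return None for headers and blanks."""
    f = line.split()
    if len(f) < 15 or not f[0].isdigit():
        return None
    return {"chrom": f[4], "start": int(f[5]), "end": int(f[6]),
            "strand": f[8], "name": f[9], "class_family": f[10], "id": f[14]}


def adjacent_unlinked(records, max_gap=100):
    """Return neighbouring LTR-class pairs with different IDs.

    max_gap is a hypothetical threshold (bp between the two annotations);
    overlapping annotations give a negative gap and are always included.
    """
    ltr = [r for r in records if r["class_family"].startswith("LTR")]
    ltr.sort(key=lambda r: (r["chrom"], r["start"]))
    pairs = []
    for prev, cur in zip(ltr, ltr[1:]):
        same_context = (prev["chrom"] == cur["chrom"]
                        and prev["strand"] == cur["strand"])
        gap = cur["start"] - prev["end"] - 1
        if same_context and prev["id"] != cur["id"] and gap <= max_gap:
            pairs.append((prev, cur))
    return pairs


records = [r for r in map(parse_out_line, ROWS.splitlines()) if r is not None]
candidates = adjacent_unlinked(records)
for prev, cur in candidates:
    print(prev["name"], prev["id"], "->", cur["name"], cur["id"])
```

On the rows shown, this flags the overlapping ERVB4_2-I_MM (ID 98) and RLTR45-int (ID 99) annotations as a candidate pair for manual review.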
