cafetutorial_clade_and_size_filter.py error #64

NINGCHINA · 2019-07-17T08:35:37Z

Hi! While checking cafetutorial_clade_and_size_filter.py at https://iu.app.box.com/v/cafetutorial-files/folder/22161186238?page=1, I found that the script writes out gene families with only one gene copy in every species, which I think are not suitable for gene family analysis. The problem seems to be at lines 104 and 105, though maybe I just do not fully understand the code.
The code looks like this:
elif line_n not in lines_to_separate_set and len(lines_to_keep_set) == 0:
    output_file.write(line)
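
To make the question concrete, here is a minimal sketch of what the quoted condition does when taken on its own. Only `line_n`, `lines_to_separate_set`, and `lines_to_keep_set` come from the snippet. The function name, the in-memory list of lines, the 0-based numbering, and the example rows are hypothetical, and whatever branch precedes the `elif` in the real script is not reproduced.

```python
def lines_written_by_branch(lines, lines_to_separate_set, lines_to_keep_set):
    """Collect the lines that the quoted condition alone would write out."""
    written = []
    for line_n, line in enumerate(lines):
        # Same test as the quoted elif; the script's earlier branch is omitted.
        if line_n not in lines_to_separate_set and len(lines_to_keep_set) == 0:
            written.append(line)
    return written


# Hypothetical tab-delimited rows: header, a single-copy family, a large family.
rows = [
    "Desc\tFamily ID\tspA\tspB\n",
    "(null)\tfam1\t1\t1\n",
    "(null)\tfam2\t150\t3\n",
]

# Empty keep set: every line not marked for separation is written,
# including the single-copy family on line 1.
no_keep = lines_written_by_branch(rows, {2}, set())

# Non-empty keep set: this branch writes nothing at all.
with_keep = lines_written_by_branch(rows, {2}, {1})
```

Under these assumptions the branch only fires when `lines_to_keep_set` is empty, and in that case it passes through everything that was not separated, single-copy families included.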


gwct · 2019-07-17T14:31:01Z

I'm not sure about that code snippet, but I think using single-copy groups (one gene in all species) in these analyses is necessary. These groups are still informative when estimating rates of gene gain and loss, since they tell us something about the amount of change over time (in this case, likely no change). So they are useful for estimating lambda. For ancestral reconstructions, it's likely that all ancestral states will be inferred as 1, so they can just be ignored (unless they are your family of interest). The same goes for groups that contain 2 copies in all species, or 3 copies in all species, etc. Does that make sense?
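
As a rough illustration of the groups described above (the same copy number in every species), the following sketch flags such families in a tab-delimited count table. It assumes a layout with a description column, a family ID column, and then one count column per species. The function names and the example rows are hypothetical and are not taken from the tutorial script.

```python
def parse_counts(line):
    """Split one data row into its family ID and per-species gene counts."""
    fields = line.rstrip("\n").split("\t")
    # Assumed layout: description, family ID, then one count per species.
    return fields[1], [int(value) for value in fields[2:]]


def has_constant_copy_number(counts):
    """True when every species has the same number of copies (1, 2, 3, ...)."""
    return len(set(counts)) == 1


# Hypothetical rows for three species.
example_rows = [
    "(null)\tfam1\t1\t1\t1\n",
    "(null)\tfam2\t2\t2\t2\n",
    "(null)\tfam3\t1\t4\t0\n",
]

constant_families = [
    family_id
    for family_id, counts in map(parse_counts, example_rows)
    if has_constant_copy_number(counts)
]
```

Families flagged this way would still go into the rate estimation and could simply be skipped when reading the ancestral reconstructions.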

-Gregg


NINGCHINA · 2019-07-17T15:36:55Z

Thanks, gwct! As I understand it, gene families can be divided into 3 groups:
1. large_families: more than 100 gene copies in one or more species.
2. filtered_families: more than 1 gene copy in more than 2 species in any clade, or in all species.
3. Gene families with less than 1 gene copy in all species.

However, lines 104 and 105 still look odd to me. After checking the tutorial results in "large_filtered_cafe_input.txt", "filtered_cafe_input.txt" and "unfiltered_cafe_input.txt", I think the code may be wrong.
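
For reference, the size criterion in group 1 could be sketched as below. This follows the wording used in this comment ("more than 100 gene copies in one or more species"). Whether the actual script compares with `>` or `>=`, and what threshold it uses, is not confirmed in this thread, so `max_copies`, the function name, and the example counts are all assumptions.

```python
def split_by_size(families, max_copies=100):
    """Separate families where any species exceeds max_copies gene copies."""
    large, rest = {}, {}
    for family_id, counts in families.items():
        # Assumption: strictly "more than" the threshold, per the comment above.
        if any(count > max_copies for count in counts):
            large[family_id] = counts
        else:
            rest[family_id] = counts
    return large, rest


# Hypothetical counts for three species.
example_families = {
    "fam1": [1, 1, 1],
    "fam2": [150, 3, 2],
    "fam3": [100, 5, 0],
}

large_families, remaining_families = split_by_size(example_families)
```

The clade criterion in group 2 is left out of this sketch because its exact definition is the open question here.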

gwct · 2019-07-17T22:40:10Z

I'm not sure about category 2. I think that should be families with *1 or more* gene copies in *more than one clade*. But I'm not sure about it in the context of this script. I think someone who helped write this script will have to weigh in.

-Gregg


NINGCHINA · 2019-07-18T01:01:49Z

@gwct thank you very much!
