cafetutorial_clade_and_size_filter.py error #64
Comments
I'm not sure about that code snippet, but I think using single-copy groups
(one gene in all species) in these analyses is necessary. These groups are
still informative when estimating rates of gene gain and loss since they do
tell us something about the amount of change over time (in this case likely
no change). So for estimating lambda these are useful, and then for
ancestral reconstructions it's likely all ancestral states will be inferred
as 1, so they can just be ignored (unless they are your family of
interest). Likewise for groups that contain 2 copies in all species, or 3
copies in all species, etc. Does that make sense?
…-Gregg
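As a rough illustration of the point about groups with the same copy number in every species, here is a minimal sketch that flags such families so they can be set aside after ancestral reconstruction while still being kept for estimating lambda. The function name `is_constant_family`, the species names, and the counts are all made up for the example; this is not code from the tutorial script.

```python
def is_constant_family(counts):
    """Return True if every species has the same copy number (hypothetical helper)."""
    return len(set(counts.values())) == 1


# Made-up example data: family name -> {species: gene copy number}
families = {
    "fam_single": {"human": 1, "mouse": 1, "dog": 1},
    "fam_double": {"human": 2, "mouse": 2, "dog": 2},
    "fam_varied": {"human": 3, "mouse": 1, "dog": 0},
}

# Constant families stay in the input for rate estimation; they are only
# listed here so they can be skipped when inspecting reconstructions.
constant = sorted(name for name, counts in families.items() if is_constant_family(counts))
variable = sorted(name for name in families if name not in constant)

print("constant:", constant)
print("variable:", variable)
```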
Thanks, gwct! Gene families can be divided into 3 groups. 1: large_families, where more than 100 gene copies are found in one or more species. 2: filtered_families, where more than 1 gene copy is found in more than 2 species in any clade, or in all species. 3: gene families with less than 1 gene copy in all species. However, lines 104 and 105 still look odd to me. When I then checked the tutorial results "large_filtered_cafe_input.txt", "filtered_cafe_input.txt" and "unfiltered_cafe_input.txt", it looked like the code may be wrong.
I'm not sure about category 2. I think that should be families with *1 or
more* gene copies in *more than one clade*. But I'm not sure about it in
the context of this script. I think someone who helped write this script
will have to weigh in.
…-Gregg
@gwct Thank you very much!
Hi! When I checked cafetutorial_clade_and_size_filter.py at https://iu.app.box.com/v/cafetutorial-files/folder/22161186238?page=1 , I found that the script writes out gene families with only one gene copy in every species, which I think are not suitable for gene family analysis. The problem seems to be at lines 104 and 105, though maybe I just do not fully understand the code.
The code looks like:
    elif line_n not in lines_to_separate_set and len(lines_to_keep_set) == 0:
        output_file.write(line)
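To see what the quoted condition does on its own, here is a small sketch that evaluates only that test. The preceding `if` branch and the way the two sets are built are not shown in this thread, so the wrapper `branch_writes` and the line numbers below are assumptions made for illustration.

```python
def branch_writes(line_n, lines_to_separate_set, lines_to_keep_set):
    """Evaluate only the quoted elif condition (hypothetical wrapper)."""
    return line_n not in lines_to_separate_set and len(lines_to_keep_set) == 0


# Made-up line numbers: line 3 is marked for separation.
separate = {3}

# With an empty keep set, every line that is not separated passes the test.
written_empty_keep = [n for n in range(1, 5) if branch_writes(n, separate, set())]

# With a non-empty keep set, this branch never fires.
written_with_keep = [n for n in range(1, 5) if branch_writes(n, separate, {2})]

print(written_empty_keep)
print(written_with_keep)
```

The condition itself never looks at copy numbers, so if single-copy families do reach this branch with an empty `lines_to_keep_set`, it would write them out.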