FDR calculation does not work when I lower the mass range #42

Open
LiaSerrano opened this issue Sep 26, 2022 · 7 comments

Comments

@LiaSerrano

Hello,

I noticed a pattern in when the FDR calculation works and when it does not. When I drop the lower MS2 m/z limit from 300 to 150, I get the error shown below and the resulting FDR outputs are blank. I replicated this with three different pairs of raw files where the only difference was the lower mass limit. Is there an obvious reason for this?

Thanks!
Lia

    data = group_nodes_with_same_edge(data)
  File "C:\Users\lrserrano\Anaconda3\envs\csod\lib\site-packages\csodiaq\idpicker.py", line 23, in group_nodes_with_same_edge
    if first: l1, l2 = map(list,zip(*data))
ValueError: not enough values to unpack (expected 2, got 0)

@jessegmeyerlab
Member

jessegmeyerlab commented Sep 26, 2022

Thanks Lia,

It looks like this error keeps coming up in different scenarios. Based on the traceback, it appears to be a problem in protein inference. Without investigating, I suspect it comes up in cases where there are no significant proteins to group. For example, when you lower the m/z range, maybe you get more decoy hits, and they now land in the top 100 proteins, so no proteins pass the 1% FDR threshold. Does that seem possible? How many protein hits did you have before expanding the fragment range?

@CCranney would you have time to help us investigate this error? I think there is also a second issue open with this same error
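
To illustrate the hypothesis above, here is a minimal sketch of how an empty post-FDR protein list would produce exactly this ValueError. It is not the csodiaq implementation: `filter_by_fdr` and `split_pairs` are hypothetical helpers, the tuple layout is an assumption, and the FDR rule is a simplified decoy/target ratio. Only the `map(list, zip(*data))` unpacking pattern comes from the traceback.

```python
def filter_by_fdr(hits, threshold=0.01):
    """Walk hits from best to worst score and stop once decoys/targets exceeds threshold.

    hits: list of (name, score, is_decoy) tuples (assumed layout, for illustration only).
    Returns the surviving target hits as (name, score) pairs.
    """
    ranked = sorted(hits, key=lambda h: h[1], reverse=True)
    kept = []
    decoys = 0
    for name, score, is_decoy in ranked:
        if is_decoy:
            decoys += 1
        targets = len(kept) + (0 if is_decoy else 1)
        if decoys / max(targets, 1) > threshold:
            break
        if not is_decoy:
            kept.append((name, score))
    return kept


def split_pairs(pairs):
    """Unzip (a, b) pairs into two lists; return two empty lists for empty input."""
    if not pairs:
        return [], []
    first, second = map(list, zip(*pairs))
    return first, second


# A decoy that outranks every target leaves nothing under the threshold.
survivors = filter_by_fdr([("DECOY_1", 0.99, True), ("P1", 0.90, False)])

try:
    # Same unpacking pattern as idpicker.py line 23 in the traceback.
    l1, l2 = map(list, zip(*survivors))
except ValueError as err:
    print("unguarded unpack failed:", err)

# The guarded version degrades to two empty lists instead of raising.
print(split_pairs(survivors))
```

If this is what is happening, a guard of this kind would prevent the crash, but the protein-level output would still be empty because nothing passed the filter.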

@LiaSerrano
Author

Hi @jgmeyerucsd

I just checked on that. Yes, the results with the lower m/z range have ~1K more decoys.

@LiaSerrano
Author

How can this be explained when there is only 1 decoy hit in the unfiltered output?

@jessegmeyerlab
Member

jessegmeyerlab commented Oct 17, 2022 via email

@LiaSerrano
Author

No, it is not.

@LiaSerrano
Author

Is there a way to see the most likely protein IDs from idPicker without the protein FDR filter applied, using just the 1% peptide FDR list?

@jessegmeyerlab
Member

jessegmeyerlab commented Oct 17, 2022

Since I don't think we require a FASTA input, I believe the way this works is that it looks back at your spectral library to get the protein assignment. Maybe the format of the protein names in your spectral library file is different from the names used in our example human.tsv TraML, and that is confusing the protein grouping code? Worst case, you could do this manually with a script in R or Python by loading the spectral library and looking up the proteins for each peptide hit.

@CCranney wrote the code for this and has since left the lab to start his MS degree. We are having trouble understanding his implementation because there are not many comments. If he does not have time to look at this, unfortunately it will likely be a few months before we have hired more people to help.
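
For the manual lookup suggested above, a minimal Python sketch might look like the following. The column names `PeptideSequence` and `ProteinName` are assumptions about the spectral library TSV and should be changed to match the actual file; the helper functions are hypothetical and not part of csodiaq. Note that this only lists every protein each peptide maps to; it does not do the parsimonious protein grouping that idPicker performs.

```python
import csv
import io
from collections import defaultdict


def build_peptide_to_proteins(library_file,
                              peptide_col="PeptideSequence",  # assumed column name
                              protein_col="ProteinName"):     # assumed column name
    """Read a tab-separated spectral library and map each peptide to its proteins."""
    mapping = defaultdict(set)
    for row in csv.DictReader(library_file, delimiter="\t"):
        mapping[row[peptide_col]].add(row[protein_col])
    return mapping


def lookup_proteins(peptides, mapping):
    """Return {peptide: sorted protein names}; unknown peptides map to an empty list."""
    return {pep: sorted(mapping.get(pep, ())) for pep in peptides}


# Toy library standing in for the real file; with a real library, pass
# open("your_library.tsv", newline="") instead of the StringIO object.
library_tsv = (
    "PeptideSequence\tProteinName\n"
    "PEPTIDEK\tP1\n"
    "PEPTIDEK\tP2\n"
    "SAMPLER\tP3\n"
)
mapping = build_peptide_to_proteins(io.StringIO(library_tsv))

# Peptides that passed the 1% peptide FDR filter would go here.
result = lookup_proteins(["PEPTIDEK", "UNKNOWNK"], mapping)
print(result)
```

Peptides that map to an empty list are a quick way to spot naming mismatches between the peptide output and the library.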
