No MCC values in the filter_param plot #32

Open
ashishjain1988 opened this issue Feb 12, 2024 · 9 comments

Comments

@ashishjain1988

Hi,

I am trying to use the filter_params function to select the optimal A.min value for filtering. We are interested in contacts on chromosome 4. When I check the plot, it seems to be missing the MCC values for some A values (approximately 2 to 8). Is there a reason why the package would not be able to calculate the MCC values? Here is the plot that I got for chromosome 4.

[Image: screenshot of the filter_params plot for chromosome 4]

@mdozmorov
Contributor

mdozmorov commented Feb 13, 2024

Hi @ashishjain1988, it is hard to tell why some MCC values are missing. I wouldn't be concerned about it. It is more important to find acceptable true and false positive rate cutoffs. I'd be conservative and pick 10, but 7-8 is also OK. We already discussed that HiCcompare is robust to the choice of A #29 (comment) because small differences are unlikely to be detected as statistically significant. I'll keep an eye on missing MCC values and debug when I have an example.

@ashishjain1988
Author

Hi @mdozmorov, thank you for your response. This data is more deeply sequenced than the previous one. One thing I want to ask about is the TPR and FPR. Based on this plot, it seems like the false positive rate is much higher than the true positive rate at A.min = 10. Is that still a good threshold? Also, the default threshold of 2 is not giving us any significant contacts.

@mdozmorov
Contributor

I overlooked that the curves are inverted; this is indeed confusing. Here's the explanation from my student, @hamy12398:

A plot like theirs can happen because it depends on the number of changes they set (in my example, I set numChanges to 30). MCC is a fraction whose denominator is built from the product of pairwise sums of TP, TN, FP, and FN, so if that denominator happens to be 0, MCC is undefined.
image
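
To make the undefined case concrete, here is a minimal sketch of the standard TPR, FPR, and MCC formulas. It is written in Python for brevity and is not HiCcompare's actual R code; the function name `rates_and_mcc` and the counts are hypothetical, chosen only for illustration. It shows that when one of the sums in the denominator is zero (for example, no positive calls at all, so TP + FP = 0), MCC comes out as NaN. If that happens inside filter_params, it could leave a gap in the MCC curve.

```python
import math


def rates_and_mcc(tp, tn, fp, fn):
    """Return (TPR, FPR, MCC) from confusion-matrix counts.

    Any quantity whose denominator is zero is returned as NaN.
    """
    def safe_div(num, den):
        return num / den if den != 0 else math.nan

    tpr = safe_div(tp, tp + fn)  # true positive rate
    fpr = safe_div(fp, fp + tn)  # false positive rate
    # MCC denominator: square root of the product of the four pairwise sums.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = safe_div(tp * tn - fp * fn, denom)
    return tpr, fpr, mcc


# Hypothetical counts, not taken from the data in this issue.
print(rates_and_mcc(tp=20, tn=900, fp=30, fn=10))

# No positive calls at all: TP + FP = 0, so the MCC denominator is 0
# and MCC is NaN, while TPR and FPR are still defined (both 0.0).
print(rates_and_mcc(tp=0, tn=950, fp=0, fn=30))
```

Note that TPR and FPR stay defined in the second case, which would be consistent with a plot where those two curves are drawn but MCC is missing.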

What are the parameters you used for filter_params()? Can you try with numChanges = 30?

@ashishjain1988
Author

I was actually carrying out the analysis at 25 kbp resolution and, as mentioned in the manual, I proportionally increased numChanges to 2500 (filter_params(hic.list[[i]], numChanges = 2500)). Is that too much for 25 kbp resolution? I will try numChanges = 30 too. Thanks!

@ashishjain1988
Author

Below is the plot I got using the filter_params function for chromosome 4. The resolution I used is 25 kbp and numChanges = 30. It seems like all the results are FPR.
image

@mdozmorov
Contributor

It is hard to tell without seeing the data. Have you tried visualizing the individual matrices? It may be that the data is very sparse at 25k resolution.

@ashishjain1988
Author

This is what the contact data looks like for individual samples. The scale is log2.
image
image

@mdozmorov
Contributor

The data looks good. I still cannot say why your A plot looks strange. Try debugging the actual function. Again, the A threshold is not that critical; I would explore the MD plot, call differential interactions, and visualize them.

@ashishjain1988
Author

ashishjain1988 commented Feb 20, 2024

Thanks! I will look into that.
