finjen changed the title from "Issue with scmap-cluster function output" to "Issue with scmap-cluster function output/ Variable inconsistent assignment" on Feb 25, 2021
Hi, not sure that the steps in your script are stochastic... Also they do not include any scmap functions.
Regarding the scmap-cluster function: if I remember correctly, it is stochastic and will therefore give you different results on each run. One way to get a stable result is to run it several times and take the most frequent assignment for each cell.
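The majority-vote idea can be sketched in base R. This is only an illustration: the `runs` matrix and its labels are made up, and `majority_vote` is a hypothetical helper, not part of scmap. In practice each column would hold the labels from one run (in the scmap vignette these are stored in the `scmap_cluster_labs` element of the `scmapCluster()` result).

```r
# Hypothetical results: one column per scmap run, one row per query cell.
runs <- cbind(
  run1 = c("alpha", "beta", "beta", "unassigned"),
  run2 = c("alpha", "beta", "delta", "unassigned"),
  run3 = c("alpha", "delta", "delta", "beta")
)
rownames(runs) <- paste0("cell", 1:4)

# Most frequent label for one cell; ties go to the label that sorts first.
majority_vote <- function(labels) {
  freq <- table(labels)
  names(freq)[which.max(freq)]
}

# Consensus label per cell across all runs.
consensus <- apply(runs, 1, majority_vote)

# Fraction of runs that agree with the consensus label for each cell.
support <- rowMeans(runs == consensus)

print(data.frame(consensus, support))
```

Cells with low support are the ones whose assignment flips between runs, so they may be worth inspecting separately rather than trusting the consensus label.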
Agree with what Vlad was saying, although if I understand it correctly, the issue is that it most often converges to the "incorrect" solution and the question is how to make it converge to the "correct" solution more frequently. Changing the parameters (not just the number of features) could help. Another option is to use scmap-cell instead; although it is slower, it could potentially yield better results.
Dear Dr. Kiselev, Dr. Hemberg,
I have encountered an issue when running the scmap-cluster pipeline and would like to ask for your input. Independent runs give variable and inconsistent assignments. The results fall into two main sets. One seems fair based on prior knowledge about the composition of the query dataset. The other comprises assignments that look like an imperfect match rather than a complete mess. It is still possible that the assignment I consider problematic is actually correct: when I run scmap from scratch repeatedly, the "wrong" mapping occurs much more frequently than the one I feel would be correct. Either way, the variability in the mapping results troubles me.
I have possibly narrowed down the steps resulting in variable outcomes to the reference normalization (see script below). I tried increasing the number of selected features, thinking that this might mitigate the impact of the previous steps, but that didn't seem to be the case.
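library(SingleCellExperiment)  # provides counts(), normcounts(), logcounts(), rowData()
# Draw one random size factor per cell with rnorm()
sf2 <- 2^rnorm(ncol(sce2))
sf2 <- sf2 / mean(sf2)  # rescale so the size factors average to 1
# Divide each cell (column) by its size factor
normcounts(sce2) <- t(t(counts(sce2)) / sf2)
counts(sce2) <- normcounts(sce2)  # note: overwrites the raw counts with the normalized values
logcounts(sce2) <- log2(normcounts(sce2) + 1)
rowData(sce2)$feature_symbol <- rownames(sce2)  # gene names stored as feature symbols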
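One property of the script above that may be relevant: `rnorm()` draws new random values on every call, so the size factors, and with them the normalized reference, differ between runs unless a seed is set. The sketch below illustrates this in base R on a toy matrix. `toy_counts` and `normalize_reference` are hypothetical stand-ins for `counts(sce2)` and the normalization steps, and the sketch only shows that the normalization is not reproducible without a seed, not that this accounts for the different mappings.

```r
# Toy stand-in for counts(sce2): 3 genes x 4 cells (hypothetical values).
toy_counts <- matrix(
  c(10, 0, 5,
     4, 2, 8,
     0, 6, 1,
     7, 3, 9),
  nrow = 3,
  dimnames = list(paste0("gene", 1:3), paste0("cell", 1:4))
)

# Same normalization steps as in the script, with an optional seed.
normalize_reference <- function(counts, seed = NULL) {
  if (!is.null(seed)) set.seed(seed)
  sf <- 2^rnorm(ncol(counts))   # random size factor per cell
  sf <- sf / mean(sf)           # rescale to mean 1
  log2(t(t(counts) / sf) + 1)   # log-normalized expression
}

unseeded_a <- normalize_reference(toy_counts)
unseeded_b <- normalize_reference(toy_counts)
seeded_a <- normalize_reference(toy_counts, seed = 1)
seeded_b <- normalize_reference(toy_counts, seed = 1)

print(identical(unseeded_a, unseeded_b))  # differs between calls without a seed
print(identical(seeded_a, seeded_b))      # reproducible with a fixed seed
```

Calling `set.seed()` once before the normalization in the original script would have the same effect as the `seed` argument here.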
I would be happy to receive some suggestions from your side.