Issue with scmap-cluster function output/ Variable inconsistent assignment #27

Open
finjen opened this issue Feb 25, 2021 · 2 comments

@finjen

finjen commented Feb 25, 2021

Dear Dr. Kiselev, Dr. Hemberg,

I have been encountering an issue when running the scmap-cluster pipeline and would like to ask you for some input on this matter. It seems that independent runs result in variable and inconsistent assignments. There seem to be two main sets of results, one of which seems fair based on prior knowledge about the composition of the query dataset. The second set of results comprises assignments that seem to reflect an imperfect match rather than a complete mess. While it might still be to some degree possible that the assignment that I think is problematic is actually correct, the issue remains of the variability in the mapping results. I should stress that the "wrong" mapping occurs much more frequently than the one I feel would be correct, if I run scmap from scratch repeatedly, suggesting it may be right after all. Still, the variability troubles me.

I have possibly narrowed down the source of the variable outcomes to the reference normalization (see script below). I tried increasing the number of selected features, thinking that this might mitigate the impact of the previous steps, but that didn't seem to be the case.

sf2 <- 2^rnorm(ncol(sce2))
sf2 <- sf2/mean(sf2)
normcounts(sce2) <- t(t(counts(sce2))/sf2)

counts(sce2) <- normcounts(sce2)
logcounts(sce2) <- log2(normcounts(sce2)+1)
rowData(sce2)$feature_symbol<-rownames(sce2)
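
For reference, the first line of the script draws the size factors with `rnorm()`, which returns different random values on every call unless a seed is set. The following is a minimal base-R sketch of that behavior, assuming a made-up matrix `toy_counts` in place of `counts(sce2)`; the helper name `normalize_with_random_sf` is hypothetical and only wraps the same size-factor steps.

```r
# Hypothetical stand-in for counts(sce2): 5 genes x 4 cells.
toy_counts <- matrix(1:20, nrow = 5, ncol = 4)

# Same size-factor steps as in the script above, wrapped in a function.
normalize_with_random_sf <- function(counts_mat) {
  sf <- 2^rnorm(ncol(counts_mat))  # random draw: differs on every call
  sf <- sf / mean(sf)
  t(t(counts_mat) / sf)            # divide each column (cell) by its size factor
}

# Without a fixed seed, two calls give different normalized matrices.
run_a <- normalize_with_random_sf(toy_counts)
run_b <- normalize_with_random_sf(toy_counts)
differs_unseeded <- !isTRUE(all.equal(run_a, run_b))

# Setting the same seed before each call makes the result reproducible.
set.seed(1)
run_c <- normalize_with_random_sf(toy_counts)
set.seed(1)
run_d <- normalize_with_random_sf(toy_counts)
same_seeded <- identical(run_c, run_d)
```

If this step is indeed what varies between runs, calling `set.seed()` once before the normalization should be enough to make the input to scmap identical each time.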

I would be happy to receive some suggestions from your side.

@finjen finjen changed the title Issue with scmap-cluster function output Issue with scmap-cluster function output/ Variable inconsistent assignment Feb 25, 2021
@wikiselev
Member

Hi, not sure that the steps in your script are stochastic... Also they do not include any scmap functions.

Regarding the scmap-cluster function, if I remember correctly, it is stochastic and will therefore give you different results on each run. One way to get a stable result is to run it multiple times and, for each cell, take the most frequent assignment.
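
The majority-vote idea could be sketched roughly as follows in base R. The label matrix `assignments` is made-up illustration data (rows are cells, columns are independent runs), and `most_frequent` is a hypothetical helper; in practice each column would hold the per-cell labels obtained from one scmap run.

```r
# Hypothetical labels from three runs for four cells
# (rows = cells, columns = runs).
assignments <- cbind(
  run1 = c("T cell", "B cell", "unassigned", "NK cell"),
  run2 = c("T cell", "T cell", "B cell", "NK cell"),
  run3 = c("T cell", "B cell", "B cell", "NK cell")
)
rownames(assignments) <- paste0("cell", 1:4)

# Most frequent label in a vector. On a tie, the label that comes
# first in table()'s sorted order is returned.
most_frequent <- function(labels) {
  label_counts <- table(labels)
  names(label_counts)[which.max(label_counts)]
}

# Consensus label per cell across all runs.
consensus <- apply(assignments, 1, most_frequent)

# Fraction of runs that agree with the consensus label, per cell.
support <- rowMeans(assignments == consensus)
```

The `support` vector gives a rough idea of how stable each cell's assignment is; cells with low support may be worth inspecting separately rather than trusting the consensus label.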

Hope this helps!

@mhemberg

Agree with what Vlad was saying, although if I understand it correctly, the issue is that it most often converges to the "incorrect" solution, and the question is how to make it converge to the "correct" solution more frequently. Changing the parameters (not just the number of features) could help, and another option is to use scmap-cell instead. Although it is slower, it could potentially yield better results.
