Cellbender on multiplexed chemistries #373

Open
LinearParadox opened this issue Jul 11, 2024 · 1 comment
LinearParadox commented Jul 11, 2024

I have a question about running CellBender on multiplexed chemistries such as 10x Flex and CITE-seq. With 10x Flex, a pool of probes barcodes several samples; the probe pools come in 4- and 16-sample versions. Should CellBender be run on the demultiplexed outputs (that is, CellBender on sample1-Pool1, CellBender on sample2-Pool1, etc.), or on the file containing all the probes in the pool (that is, CellBender on Pool1.h5ad), with the samples separated afterwards? Intuitively, I would expect the second option to perform better, because the entire probe pool is loaded onto the machine together and is essentially one run. However, I get some weird performance when I do this:

On the one hand, I get a run that looks like it went OK training-wise but seems to call too many cells:

image
image

On the other hand, I get runs that seem to have gone pretty poorly, with steep drops in the training curve. I'm wondering whether this is due to parameters that need to be adjusted to account for a larger sample, or whether we should run it on demultiplexed samples only.

JThomasWatson commented Nov 15, 2024

Hi. I'm not a member of the CellBender team, but I recently had the same issue with scFRP data. I do think it wants to operate on a per-sample basis. This plot is from running CellBender on one of our pools all at once.
Untitled
After switching to running CellBender on one sample at a time, the plots look much more as expected. The dips in the training curve you describe were also present in our pooled run, but went away after switching to single-sample runs.
Untitled-1
I suspect the distributions of endogenous and exogenous counts get blurry when using a pool's aggregated data due to between-sample variance.
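
For anyone wanting to try the per-sample approach, here is a minimal sketch that builds one `cellbender remove-background` command per demultiplexed sample. It assumes you already have one raw (unfiltered) count matrix per sample. The sample names, file paths, and the `build_cellbender_commands` helper are hypothetical placeholders, not part of CellBender. Only `--input` and `--output` are set, so any other options would have to be added for your data.

```python
import shlex
from pathlib import Path


def build_cellbender_commands(raw_matrices, out_dir):
    """Return one `cellbender remove-background` command per sample.

    raw_matrices: dict mapping a sample name to that sample's raw
                  (unfiltered) count matrix file. Names and paths are
                  hypothetical placeholders.
    out_dir:      directory where the per-sample outputs are written.
    """
    out_dir = Path(out_dir)
    commands = []
    # Sort by sample name so the command order is reproducible.
    for sample, raw_path in sorted(raw_matrices.items()):
        output = out_dir / f"{sample}_cellbender.h5"
        commands.append([
            "cellbender", "remove-background",
            "--input", str(raw_path),
            "--output", str(output),
        ])
    return commands


if __name__ == "__main__":
    # Hypothetical layout: one raw matrix per demultiplexed sample.
    samples = {
        "sample1-Pool1": "demux/sample1-Pool1_raw.h5",
        "sample2-Pool1": "demux/sample2-Pool1_raw.h5",
    }
    for cmd in build_cellbender_commands(samples, "cellbender_out"):
        # Print rather than execute, so the commands can be reviewed first.
        print(shlex.join(cmd))
```

The script only prints the commands, so they can be checked (and extended with further options) before anything is run.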
