KeyError: 'base' when running `tl.rank_genes_groups` #2239

naity2 · 2022-04-18T17:29:40Z

I have checked that this issue has not already been reported.
I have confirmed this bug exists on the latest version of scanpy.
(optional) I have confirmed this bug exists on the master branch of scanpy.

Minimal code sample (that we can copy&paste without having any data)

import scanpy as sc

# adata: an existing AnnData object with an "origin" column in adata.obs
sc.tl.rank_genes_groups(adata, "origin", method="wilcoxon")

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
Input In [18], in <cell line: 1>()
----> 1 sc.tl.rank_genes_groups(adata, "origin", method="wilcoxon")
      2 sc.pl.rank_genes_groups(adata, n_genes=25, sharey=False)

File ~/app/miniconda3/envs/bio/lib/python3.9/site-packages/scanpy/tools/_rank_genes_groups.py:590, in rank_genes_groups(adata, groupby, use_raw, groups, reference, n_genes, rankby_abs, pts, key_added, copy, method, corr_method, tie_correct, layer, **kwds)
    580 adata.uns[key_added] = {}
    581 adata.uns[key_added]['params'] = dict(
    582     groupby=groupby,
    583     reference=reference,
   (...)
    587     corr_method=corr_method,
    588 )
--> 590 test_obj = _RankGenes(adata, groups_order, groupby, reference, use_raw, layer, pts)
    592 if check_nonnegative_integers(test_obj.X) and method != 'logreg':
    593     logg.warning(
    594         "It seems you use rank_genes_groups on the raw count data. "
    595         "Please logarithmize your data before calling rank_genes_groups."
    596     )

File ~/app/miniconda3/envs/bio/lib/python3.9/site-packages/scanpy/tools/_rank_genes_groups.py:93, in _RankGenes.__init__(self, adata, groups, groupby, reference, use_raw, layer, comp_pts)
     82 def __init__(
     83     self,
     84     adata,
   (...)
     90     comp_pts=False,
     91 ):
---> 93     if 'log1p' in adata.uns_keys() and adata.uns['log1p']['base'] is not None:
     94         self.expm1_func = lambda x: np.expm1(x * np.log(adata.uns['log1p']['base']))
     95     else:

KeyError: 'base'
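
The failing condition on line 93 of the traceback can be reproduced without scanpy. The sketch below uses a plain dict as a stand-in for `adata.uns` (an assumption for illustration; the real object is an AnnData mapping) and contrasts the subscript lookup from the traceback with a `dict.get` lookup. The function names are hypothetical, and the `.get` variant is only one possible way to make the check tolerant; it is not claimed to be the change made upstream.

```python
# Plain-dict stand-in for adata.uns where the 'log1p' entry exists
# but has no 'base' key.
uns = {"log1p": {}}

def has_base_strict(uns):
    # Same shape as the condition shown in the traceback (line 93).
    return "log1p" in uns and uns["log1p"]["base"] is not None

def has_base_tolerant(uns):
    # Hypothetical tolerant variant: a missing key is treated like None.
    return "log1p" in uns and uns["log1p"].get("base") is not None

try:
    has_base_strict(uns)
    strict_error = None
except KeyError as err:
    # The subscript lookup raises because 'base' is absent.
    strict_error = err

tolerant_result = has_base_tolerant(uns)
```

The strict check only fails when `'log1p'` is present and `'base'` is missing; an object that was never log-transformed short-circuits on the first condition and raises nothing.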

Versions

anndata 0.8.0
scanpy 1.9.1

Levenshtein NA
OpenSSL 21.0.0
PIL 9.1.0
adjustText NA
airr 1.3.1
appdirs 1.4.4
asttokens NA
attr 21.4.0
backcall 0.2.0
beta_ufunc NA
binom_ufunc NA
bioservices 1.8.4
boto3 1.21.42
botocore 1.24.42
brotli NA
bs4 4.11.1
cattr NA
certifi 2021.10.08
cffi 1.15.0
charset_normalizer 2.0.4
colorama 0.4.4
colorlog NA
cryptography 36.0.0
cycler 0.10.0
cython_runtime NA
dateutil 2.8.2
debugpy 1.6.0
decorator 5.1.1
defusedxml 0.7.1
easydev 0.12.0
entrypoints 0.4
executing 0.8.3
gseapy 0.10.8
h5py 3.6.0
hypergeom_ufunc NA
idna 3.3
igraph 0.9.10
ipykernel 6.13.0
ipython_genutils 0.2.0
ipywidgets 7.7.0
jedi 0.18.1
jmespath 1.0.0
joblib 1.1.0
jupyter_server 1.16.0
kiwisolver 1.4.2
leidenalg 0.8.9
llvmlite 0.38.0
lxml 4.8.0
matplotlib 3.5.1
matplotlib_inline NA
mpl_toolkits NA
natsort 8.1.0
nbinom_ufunc NA
networkx 2.8
numba 0.55.1
numpy 1.21.6
packaging 21.3
pandas 1.4.2
parasail 1.2.4
parso 0.8.3
pexpect 4.8.0
pickleshare 0.7.5
pkg_resources NA
prompt_toolkit 3.0.29
psutil 5.9.0
ptyprocess 0.7.0
pure_eval 0.2.2
pycparser 2.21
pydev_ipython NA
pydevconsole NA
pydevd 2.8.0
pydevd_file_utils NA
pydevd_plugins NA
pydevd_tracing NA
pyexpat NA
pygments 2.11.2
pylab NA
pynndescent 0.5.6
pyparsing 3.0.8
pytoml NA
pytz 2022.1
requests 2.27.1
requests_cache 0.9.2
scipy 1.8.0
scirpy 0.10.1
seaborn 0.11.2
session_info 1.0.0
setuptools_scm NA
six 1.16.0
sklearn 1.0.2
socks 1.7.1
soupsieve 2.3.2.post1
stack_data 0.2.0
statsmodels 0.13.2
texttable 1.6.4
threadpoolctl 3.1.0
tornado 6.1
tqdm 4.62.3
tracerlib NA
traitlets 5.1.1
typing_extensions NA
umap 0.5.3
url_normalize 1.4.3
urllib3 1.26.7
wcwidt

auesro · 2022-04-20T16:17:29Z

Same error here...any ideas?

-----
anndata     0.8.0
scanpy      1.8.2
sinfo       0.3.1
-----
PIL                         9.0.1
PyQt5                       NA
anndata                     0.8.0
anndata2ri                  0.0.0
atomicwrites                1.4.0
autoreload                  NA
backcall                    0.2.0
backports                   NA
beta_ufunc                  NA
binom_ufunc                 NA
bs4                         4.10.0
cached_property             1.5.2
cffi                        1.15.0
chardet                     4.0.0
cloudpickle                 2.0.0
colorama                    0.4.4
cycler                      0.10.0
cython_runtime              NA
cytoolz                     0.11.2
dask                        2022.02.0
dateutil                    2.8.2
debugpy                     1.5.1
decorator                   5.1.1
defusedxml                  0.7.1
dunamai                     1.10.0
entrypoints                 0.4
fsspec                      2022.02.0
get_version                 3.5.4
h5py                        3.6.0
igraph                      0.9.9
ipykernel                   6.9.1
jedi                        0.18.1
jinja2                      3.0.3
joblib                      1.1.0
kiwisolver                  1.3.2
leidenalg                   0.8.9
llvmlite                    0.38.0
louvain                     0.7.1
markupsafe                  2.1.0
matplotlib                  3.5.1
matplotlib_inline           NA
mpl_toolkits                NA
natsort                     8.1.0
nbinom_ufunc                NA
numba                       0.55.1
numexpr                     2.8.0
numpy                       1.21.5
packaging                   21.3
pandas                      1.3.5
parso                       0.8.3
pexpect                     4.8.0
pickleshare                 0.7.5
pkg_resources               NA
prompt_toolkit              3.0.27
psutil                      5.9.0
ptyprocess                  0.7.0
pydev_ipython               NA
pydevconsole                NA
pydevd                      2.6.0
pydevd_concurrency_analyser NA
pydevd_file_utils           NA
pydevd_plugins              NA
pydevd_tracing              NA
pygments                    2.11.2
pyparsing                   3.0.7
pytz                        2021.3
pytz_deprecation_shim       NA
rpy2                        3.4.2
scanpy                      1.8.2
scipy                       1.7.3
seaborn                     0.11.2
setuptools                  59.8.0
sinfo                       0.3.1
sip                         NA
six                         1.16.0
sklearn                     1.0.2
soupsieve                   2.3.1
sphinxcontrib               NA
spyder                      5.2.2
spyder_kernels              2.2.1
spydercustomize             NA
statsmodels                 0.13.2
storemagic                  NA
tables                      3.7.0
texttable                   1.6.4
threadpoolctl               3.1.0
tlz                         0.11.2
toolz                       0.11.2
tornado                     6.1
traitlets                   5.1.1
typing_extensions           NA
tzlocal                     NA
wcwidth                     0.2.5
wurlitzer                   3.0.2
yaml                        6.0
zipp                        NA
zmq                         22.3.0
-----
IPython             7.32.0
jupyter_client      7.1.2
jupyter_core        4.9.2
-----
Python 3.7.12 | packaged by conda-forge | (default, Oct 26 2021, 06:08:53) [GCC 9.4.0]
Linux-5.4.0-109-generic-x86_64-with-debian-bullseye-sid
16 logical CPU cores, x86_64
-----
Session information updated at 2022-04-20 18:16

Koncopd · 2022-04-20T16:24:44Z

And what is in adata.uns['log1p']?

auesro · 2022-04-20T16:27:29Z

It's empty; there is no base key.

I just found issue #2181 which mentions the same issue and a workaround. In case @naity2 or someone else comes looking.

naity2 · 2022-04-20T16:54:10Z

Thank you @auesro!

For now, I run the line adata.uns['log1p']["base"] = None every time after reading an h5ad file.
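
A minimal sketch of that workaround wrapped in a helper, so it can be called right after loading. The function name `ensure_log1p_base` is hypothetical and not part of scanpy or anndata; it only relies on the object exposing a dict-like `uns` attribute, which is why a `SimpleNamespace` stands in for a real AnnData object here.

```python
from types import SimpleNamespace

def ensure_log1p_base(adata):
    """Restore a missing 'base' entry in adata.uns['log1p'] (hypothetical helper)."""
    log1p_info = adata.uns.get("log1p")
    # Only touch objects that have a 'log1p' entry without a 'base' key;
    # an existing base (including an explicit None) is left unchanged.
    if log1p_info is not None and "base" not in log1p_info:
        log1p_info["base"] = None
    return adata

# Stand-in for an object loaded from disk with an empty 'log1p' entry.
loaded = SimpleNamespace(uns={"log1p": {}})
ensure_log1p_base(loaded)
```

With a real dataset the call would follow the read step, for example `adata = ensure_log1p_base(sc.read_h5ad(path))`, where `path` is a placeholder. Setting the value to None makes the check shown in the traceback take its else branch; if the data were log-transformed with an explicit base, that base would be the value to restore instead.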

brianpenghe · 2022-07-14T15:02:16Z

adata.uns['log1p']["base"] = None

Thank you. I also hit this error when calculating highly variable genes with sc.pp.highly_variable_genes(Adult, batch_key='batch').

giorgiatosoni · 2022-10-15T08:53:24Z

This is still an issue for me as well. Any news?

jasonleongbio · 2022-11-18T10:06:47Z

Trying out the tutorials these days and it seems this issue still persists.

Here is what I got from running the tutorial pbmc3k.ipynb:
Before writing the AnnData object to a .h5ad file (after the PCA step; before computing the neighborhood graph)

Inside adata.uns:

OverloadedDict, wrapping:
	OrderedDict([('log1p', {'base': None}), ('hvg', {'flavor': 'seurat'}), ('pca', {'params': {'zero_center': True, 'use_highly_variable': True}, 'variance': array([ (not showing the numbers for simplicity here) ],
      dtype=float32), 'variance_ratio': array([ (not showing the numbers for simplicity here) ],
      dtype=float32)})])
With overloaded keys:
	['neighbors'].

After loading the matrix from the .h5ad file:

Inside adata.uns, the log1p key became an empty dictionary:

OverloadedDict, wrapping:
	{'hvg': {'flavor': 'seurat'}, 'log1p': {}, 'pca': {'params': {'use_highly_variable': True, 'zero_center': True}, 'variance': array([ (not showing the numbers for simplicity here) ],
      dtype=float32), 'variance_ratio': array([ (not showing the numbers for simplicity here) ],
      dtype=float32)}}
With overloaded keys:
	['neighbors'].

m21camby · 2022-12-05T22:52:55Z

Although adata.uns['log1p']["base"] = None seems to work for tl.rank_genes_groups, the results look weird in my analysis. When I checked the log fold change values, they didn't make any sense; some of them are close to 100. Has anyone else seen this, or am I doing something wrong?

lubianat · 2023-01-23T17:57:31Z

Just to let you know that the same issue happened here when running the tutorial with my data.

KoichiHashikawa · 2023-02-03T04:58:22Z

Same here. adata.uns['log1p']["base"] = None eliminated the error, but the FC seems weird.
I compared the FC results with Seurat FindMarkers results, which used the same FC calculation. For most genes, Scanpy gave a much higher FC (some reach 30 or more), which I have never seen before.
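
One way values near 30 can arise is sketched below. It assumes, for illustration only, that the log fold change is computed as log2 of a ratio of back-transformed group means with a tiny pseudocount (`eps = 1e-9` here); neither the formula nor the pseudocount is confirmed in this thread, and `back_transform` and `log2_fold_change` are hypothetical names. The back-transform mirrors the `expm1_func` lines visible in the traceback: with a base it rescales by `log(base)` before `expm1`, and with `base=None` it is assumed to fall back to a plain `expm1`.

```python
import math

def back_transform(x, base=None):
    # Mirrors the expm1_func in the traceback when a base is given;
    # the base=None fallback to plain expm1 is an assumption.
    if base is not None:
        return math.expm1(x * math.log(base))
    return math.expm1(x)

def log2_fold_change(mean_group, mean_rest, base=None, eps=1e-9):
    # Assumed form of the fold change; eps is an illustrative pseudocount.
    num = back_transform(mean_group, base) + eps
    den = back_transform(mean_rest, base) + eps
    return math.log2(num / den)

# A gene with mean log1p expression of log1p(1.0) in the group and 0 elsewhere.
lfc_absent_in_rest = log2_fold_change(math.log1p(1.0), 0.0)

# The same comparison when the gene is expressed in both groups.
lfc_expressed_in_both = log2_fold_change(math.log1p(4.0), math.log1p(1.0))
```

Under these assumptions, a gene that is undetected in the comparison group lands just under 30 purely because of the pseudocount, which is in the range reported above, while a gene expressed in both groups gives an ordinary value. Whether this explains the discrepancy with Seurat in any particular dataset would need to be checked against the actual data.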

KoichiHashikawa · 2023-02-16T18:12:36Z

@LuckyMD, several of the comments above need attention from the Scanpy team.
Thanks!

zhenxingjian · 2023-05-08T22:11:50Z

Same error here.

flying-sheep · 2023-06-07T14:40:46Z

Duplicate of scverse/anndata#673

I have a fix waiting in scverse/anndata#999

adkinsrs mentioned this issue May 18, 2022

Group labeling headers show up before click on clustering step, single-cell wb IGS/gEAR#307

Closed

LuckyMD mentioned this issue May 19, 2022

Key Error "base" in section "marker genes & annotation" theislab/single-cell-tutorial#97

Closed

DriesSchaumont mentioned this issue Jul 15, 2022

Fix checking of log1p transformation base value when that value is None. #2294

Closed

maximilianh mentioned this issue Oct 12, 2022

Saving/reading .h5ad file results in altered adata.uns['log1p'] #2181

Closed

kohleman mentioned this issue Dec 15, 2022

KeyError: 'base' when using bc.tl.dge.get_de() or bc.st.additional_labeling() bedapub/besca#271

Open

flying-sheep closed this as not planned Jun 7, 2023

flying-sheep linked a pull request Jul 7, 2023 that will close this issue

Fix getting log1p base #2546

Merged

flying-sheep closed this as completed in #2546 Jul 7, 2023
