read harmonypy data can not find deg #2440

qwelongh · 2023-03-09T02:45:41Z

I have checked that this issue has not already been reported.
I have confirmed this bug exists on the latest version of scanpy.
(optional) I have confirmed this bug exists on the master branch of scanpy.

Note: Please read this guide detailing how to provide the necessary information for us to reproduce your bug.

Minimal code sample (that we can copy&paste without having any data)

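import harmonypy as hm
import scanpy as sc

# a_data is an existing AnnData object with obsm['X_pca'] and obs columns 'donor' and 'condition'
h_data = hm.run_harmony(a_data.obsm['X_pca'], a_data.obs, ['donor'])
# transpose Z_corr before storing it in obsm
har = h_data.Z_corr.T
a_data.obsm['X_harmony'] = har.copy()
sc.pp.neighbors(a_data, n_neighbors=30, n_pcs=50, use_rep='X_harmony')
res = 1
sc.tl.leiden(a_data, resolution=res, key_added='leiden_res_harmony%.2f' % res)
# round trip through disk; the error appears only after re-reading the file
a_data.write('test.h5ad')
a_data = sc.read('test.h5ad')
sc.tl.rank_genes_groups(a_data, 'condition', method='wilcoxon')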

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
<ipython-input-6-8d70f4e1a0fa> in <module>
----> 1 sc.tl.rank_genes_groups(a_data, 'condition', method='wilcoxon')

/opt/conda/lib/python3.8/site-packages/scanpy/tools/_rank_genes_groups.py in rank_genes_groups(adata, groupby, use_raw, groups, reference, n_genes, rankby_abs, pts, key_added, copy, method, corr_method, tie_correct, layer, **kwds)
    588     )
    589 
--> 590     test_obj = _RankGenes(adata, groups_order, groupby, reference, use_raw, layer, pts)
    591 
    592     if check_nonnegative_integers(test_obj.X) and method != 'logreg':

/opt/conda/lib/python3.8/site-packages/scanpy/tools/_rank_genes_groups.py in __init__(self, adata, groups, groupby, reference, use_raw, layer, comp_pts)
     91     ):
     92 
---> 93         if 'log1p' in adata.uns_keys() and adata.uns['log1p']['base'] is not None:
     94             self.expm1_func = lambda x: np.expm1(x * np.log(adata.uns['log1p']['base']))
     95         else:

KeyError: 'base'
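
The failing check can be reproduced without scanpy. This is a minimal sketch, assuming that after the write/read round trip `a_data.uns['log1p']` still exists but no longer has a `'base'` entry. That assumption is consistent with the traceback, but nothing in this thread confirms it. The plain dict `uns` and the function `has_log1p_base` are stand-ins for illustration, not scanpy code.

```python
# Stand-in for adata.uns after reloading the file.
# ASSUMPTION: 'log1p' is present but has lost its 'base' key.
uns = {"log1p": {}}


def has_log1p_base(uns):
    # Same shape as the condition on line 93 of _rank_genes_groups.py
    # shown in the traceback: membership test, then direct indexing.
    return "log1p" in uns and uns["log1p"]["base"] is not None


try:
    has_log1p_base(uns)
    raised = None
except KeyError as err:
    # Indexing a dict with a missing key raises KeyError with that key.
    raised = err.args[0]

print(raised)
```

Under that assumption the first half of the condition is true, so Python goes on to evaluate `uns["log1p"]["base"]`, and that lookup is what fails.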

Versions

[Paste the output of scanpy.logging.print_versions() leaving a blank line after the details tag]

CCranney · 2023-07-26T19:27:56Z

I have also encountered this error, but specifically in the scanpy tutorial outlined here. That should make reproducibility easier.

The error occurs under the Finding Marker Genes heading, specifically the following line:

sc.tl.rank_genes_groups(adata, 'leiden', groups=['0'], reference='1', method='wilcoxon')
sc.pl.rank_genes_groups(adata, groups=['0'], n_genes=20)

With an error output of the following:

ranking genes
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
Cell In[49], line 1
----> 1 sc.tl.rank_genes_groups(adata, 'leiden', groups=['0'], reference='1', method='wilcoxon')
      2 sc.pl.rank_genes_groups(adata, groups=['0'], n_genes=20)

File ~/Desktop/data/env/lib/python3.11/site-packages/scanpy/tools/_rank_genes_groups.py:590, in rank_genes_groups(adata, groupby, use_raw, groups, reference, n_genes, rankby_abs, pts, key_added, copy, method, corr_method, tie_correct, layer, **kwds)
    580 adata.uns[key_added] = {}
    581 adata.uns[key_added]['params'] = dict(
    582     groupby=groupby,
    583     reference=reference,
   (...)
    587     corr_method=corr_method,
    588 )
--> 590 test_obj = _RankGenes(adata, groups_order, groupby, reference, use_raw, layer, pts)
    592 if check_nonnegative_integers(test_obj.X) and method != 'logreg':
    593     logg.warning(
    594         "It seems you use rank_genes_groups on the raw count data. "
    595         "Please logarithmize your data before calling rank_genes_groups."
    596     )

File ~/Desktop/data/env/lib/python3.11/site-packages/scanpy/tools/_rank_genes_groups.py:93, in _RankGenes.__init__(self, adata, groups, groupby, reference, use_raw, layer, comp_pts)
     82 def __init__(
     83     self,
     84     adata,
   (...)
     90     comp_pts=False,
     91 ):
---> 93     if 'log1p' in adata.uns_keys() and adata.uns['log1p']['base'] is not None:
     94         self.expm1_func = lambda x: np.expm1(x * np.log(adata.uns['log1p']['base']))
     95     else:

KeyError: 'base'
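
One possible workaround, sketched on the assumption that the KeyError comes from a `'log1p'` entry in `adata.uns` that lacks a `'base'` key: add the key back with the value `None` before calling `sc.tl.rank_genes_groups`, so that the check on line 93 falls through to its `else` branch. `ensure_log1p_base` is a hypothetical helper, not part of scanpy, and this is not the maintainers' fix.

```python
def ensure_log1p_base(uns, default=None):
    """Hypothetical helper: give uns['log1p'] a 'base' key if it lacks one."""
    log1p = uns.get("log1p")
    # Only touch an existing 'log1p' entry; never create one from scratch.
    if log1p is not None and "base" not in log1p:
        log1p["base"] = default
    return uns


# Plain dicts standing in for adata.uns (ASSUMPTION about its contents).
broken = ensure_log1p_base({"log1p": {}})
intact = ensure_log1p_base({"log1p": {"base": 2}})
print(broken, intact)
```

In a session this would be called as `ensure_log1p_base(adata.uns)` right after `sc.read(...)` and before `sc.tl.rank_genes_groups(...)`. An existing `'base'` value is left untouched.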

I've pasted the output of scanpy.logging.print_versions() below as requested. It confirms the scanpy version in use (1.9.3, the latest).

It may not be important, but I also had to install leidenalg manually in the middle of the tutorial. That's the only deviation I made from the original tutorial.

-----
anndata     0.9.2
scanpy      1.9.3
-----
PIL                 10.0.0
appnope             0.1.3
asttokens           NA
backcall            0.2.0
comm                0.1.3
cycler              0.10.0
cython_runtime      NA
dateutil            2.8.2
debugpy             1.6.7
decorator           5.1.1
executing           1.2.0
h5py                3.9.0
igraph              0.10.6
ipykernel           6.25.0
jedi                0.18.2
joblib              1.3.1
kiwisolver          1.4.4
leidenalg           0.10.1
llvmlite            0.40.1
matplotlib          3.7.2
mpl_toolkits        NA
natsort             8.4.0
numba               0.57.1
numpy               1.24.4
packaging           23.1
pandas              2.0.3
parso               0.8.3
pexpect             4.8.0
pickleshare         0.7.5
pkg_resources       NA
platformdirs        3.9.1
prompt_toolkit      3.0.39
psutil              5.9.5
ptyprocess          0.7.0
pure_eval           0.2.2
pydev_ipython       NA
pydevconsole        NA
pydevd              2.9.5
pydevd_file_utils   NA
pydevd_plugins      NA
pydevd_tracing      NA
pygments            2.15.1
pyparsing           3.0.9
pytz                2023.3
scipy               1.11.1
session_info        1.0.0
sitecustomize       NA
six                 1.16.0
sklearn             1.3.0
stack_data          0.6.2
texttable           1.6.7
threadpoolctl       3.2.0
tornado             6.3.2
traitlets           5.9.0
wcwidth             0.2.6
zmq                 25.1.0
-----
IPython             8.14.0
jupyter_client      8.3.0
jupyter_core        5.3.1
-----
Python 3.11.4 (main, Jun 20 2023, 17:23:00) [Clang 14.0.3 (clang-1403.0.22.14.1)]
macOS-13.5-arm64-i386-64bit
-----
Session information updated at 2023-07-26 10:47

flying-sheep · 2023-07-27T12:10:33Z

duplicate of #2181

flying-sheep closed this as completed Jul 27, 2023
