anndata write or anndata read seem to be losing `uns["log1p"]["base"]` when value for key `base` is None #865

pcm32 · 2022-11-30T13:37:28Z

I have noticed that in Scanpy, when adata.uns["log1p"]["base"] is set to None and the object is written to disk and then read back, base is no longer a key in adata.uns["log1p"]. This affects a number of downstream Scanpy methods whenever the object is written to disk mid-analysis and read back, since parts of Scanpy appear to check:

if 'log1p' in adata.uns_keys() and adata.uns['log1p']['base'] is not None:

in various places. Maybe the underlying problem is that sc.pp.log1p(adata) does not record base as math.e in uns. I also wonder whether the write/read process might be pruning other keys that have None values?
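For illustration, a check that tolerates the missing key could look like the sketch below. This is not Scanpy's code: the helper names log1p_base and expm1_with_base are hypothetical, and uns is assumed to behave like a plain dict (as a stand-in for adata.uns).

```python
import math


def log1p_base(uns):
    """Return the base stored under uns['log1p'], or None if either key is absent."""
    # .get() avoids the KeyError that uns['log1p']['base'] raises when 'base' was dropped
    return uns.get("log1p", {}).get("base")


def expm1_with_base(x, base=None):
    """Invert log1p for a given base; base=None means the natural logarithm."""
    if base is None:
        return math.expm1(x)
    return math.expm1(x * math.log(base))


# Stand-ins for adata.uns before and after a write/read round trip
before = {"log1p": {"base": None}}
after = {"log1p": {}}
print(log1p_base(before), log1p_base(after))
```

With this kind of lookup, a missing base and a base of None are treated the same way, which is the behaviour the check above seems to intend.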


brainfo · 2022-12-19T14:10:37Z

I noticed this problem too, and agree that it should be addressed, so that no extra explicit step is needed to avoid errors when log1p is detected but has no base in it.

github-actions · 2023-06-14T02:27:14Z

This issue has been automatically marked as stale because it has not had recent activity.
Please add a comment if you want to keep the issue open. Thank you for your contributions!

fbnrst · 2023-06-14T08:34:12Z

I'm still affected by this issue. Without a setting for base, rank_genes_groups will not work.
Steps to reproduce:

import scanpy as sc

adata = sc.datasets.blobs()
sc.pp.log1p(adata)
adata.uns['log1p']

Output:

{'base': None}

Now, writing adata to file and reading from file:

adata.write('adata.h5ad')
adata = sc.read('adata.h5ad')
adata.uns['log1p']

Output:

{}

So, the entry for base is gone. Now,

sc.tl.rank_genes_groups(adata, 'blobs')

throws this error:

WARNING: Default of the method has been changed to 't-test' from 't-test_overestim_var'

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
Cell In[16], line 1
----> 1 sc.tl.rank_genes_groups(adata, 'blobs')

File /opt/conda/lib/python3.9/site-packages/scanpy/tools/_rank_genes_groups.py:590, in rank_genes_groups(adata, groupby, use_raw, groups, reference, n_genes, rankby_abs, pts, key_added, copy, method, corr_method, tie_correct, layer, **kwds)
    580 adata.uns[key_added] = {}
    581 adata.uns[key_added]['params'] = dict(
    582     groupby=groupby,
    583     reference=reference,
   (...)
    587     corr_method=corr_method,
    588 )
--> 590 test_obj = _RankGenes(adata, groups_order, groupby, reference, use_raw, layer, pts)
    592 if check_nonnegative_integers(test_obj.X) and method != 'logreg':
    593     logg.warning(
    594         "It seems you use rank_genes_groups on the raw count data. "
    595         "Please logarithmize your data before calling rank_genes_groups."
    596     )

File /opt/conda/lib/python3.9/site-packages/scanpy/tools/_rank_genes_groups.py:93, in _RankGenes.__init__(self, adata, groups, groupby, reference, use_raw, layer, comp_pts)
     82 def __init__(
     83     self,
     84     adata,
   (...)
     90     comp_pts=False,
     91 ):
---> 93     if 'log1p' in adata.uns_keys() and adata.uns['log1p']['base'] is not None:
     94         self.expm1_func = lambda x: np.expm1(x * np.log(adata.uns['log1p']['base']))
     95     else:

KeyError: 'base'
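Until this is resolved, one possible workaround is to put the key back after reading the file. The sketch below is an assumption-laden illustration, not part of anndata or Scanpy: restore_log1p_base is a hypothetical helper, and it assumes adata.uns can be modified like an ordinary dict.

```python
def restore_log1p_base(uns, default=None):
    """Re-add uns['log1p']['base'] if the write/read round trip dropped it."""
    log1p = uns.get("log1p")
    # Only touch the entry when 'log1p' exists but has lost its 'base' key
    if log1p is not None and "base" not in log1p:
        log1p["base"] = default
    return uns


# Stand-in for adata.uns after reading the file back
uns = {"log1p": {}}
restore_log1p_base(uns)
print(uns)  # {'log1p': {'base': None}}
```

In the reproduction above, this would correspond to calling restore_log1p_base(adata.uns) after sc.read('adata.h5ad') and before sc.tl.rank_genes_groups(adata, 'blobs').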

Output of sc.logging.print_versions():

-----
anndata     0.9.1
scanpy      1.9.3
-----
PIL                 9.2.0
anyio               NA
arrow               1.2.3
asciitree           NA
asttokens           NA
astunparse          1.6.3
attr                23.1.0
babel               2.12.1
backcall            0.2.0
bottleneck          1.3.7
brotli              NA
certifi             2023.05.07
cffi                1.15.1
chardet             5.1.0
charset_normalizer  2.1.1
cloudpickle         2.2.1
colorama            0.4.6
comm                0.1.3
cycler              0.10.0
cython_runtime      NA
cytoolz             0.12.0
dask                2023.4.1
dateutil            2.8.2
debugpy             1.6.7
decorator           5.1.1
defusedxml          0.7.1
dill                0.3.6
entrypoints         0.4
executing           1.2.0
fasteners           0.17.3
fastjsonschema      NA
fqdn                NA
gmpy2               2.1.2
google              NA
h5py                3.8.0
idna                3.4
igraph              0.10.4
importlib_resources NA
ipykernel           6.23.0
ipython_genutils    0.2.0
isoduration         NA
jedi                0.18.2
jinja2              3.0.3
joblib              1.2.0
json5               NA
jsonpointer         2.0
jsonschema          4.17.3
jupyter_events      0.6.3
jupyter_server      2.5.0
jupyterlab_server   2.22.1
kiwisolver          1.4.4
leidenalg           0.9.1
llvmlite            0.39.1
louvain             0.8.0
lz4                 4.3.2
markupsafe          2.1.2
matplotlib          3.7.1
mpl_toolkits        NA
mpmath              1.3.0
msgpack             1.0.5
natsort             8.3.1
nbformat            5.8.0
numba               0.56.4
numcodecs           0.11.0
numexpr             2.7.3
numpy               1.23.5
opt_einsum          v3.3.0
packaging           23.1
pandas              1.5.0
parso               0.8.3
pexpect             4.8.0
pickleshare         0.7.5
pkg_resources       NA
platformdirs        3.5.0
plotly              5.14.1
prometheus_client   NA
prompt_toolkit      3.0.38
psutil              5.9.5
ptyprocess          0.7.0
pure_eval           0.2.2
pvectorc            NA
pyarrow             10.0.1
pydev_ipython       NA
pydevconsole        NA
pydevd              2.9.5
pydevd_file_utils   NA
pydevd_plugins      NA
pydevd_tracing      NA
pygments            2.15.1
pyparsing           3.0.9
pyrsistent          NA
pythonjsonlogger    NA
pytz                2023.3
requests            2.29.0
rfc3339_validator   0.1.4
rfc3986_validator   0.1.1
scipy               1.10.1
send2trash          NA
session_info        1.0.0
setuptools          67.7.2
six                 1.16.0
sklearn             1.2.2
sniffio             1.3.0
socks               1.7.1
sparse              0.14.0
sphinxcontrib       NA
stack_data          0.6.2
sympy               1.11.1
tblib               1.7.0
texttable           1.6.7
threadpoolctl       3.1.0
tlz                 0.12.0
toolz               0.12.0
torch               2.0.0
tornado             6.3
tqdm                4.65.0
traitlets           5.9.0
typing_extensions   NA
unicodedata2        NA
uri_template        NA
urllib3             1.26.15
wcwidth             0.2.6
webcolors           1.13
websocket           1.5.1
yaml                5.4.1
zarr                2.14.2
zipp                NA
zmq                 25.0.2
zoneinfo            NA
zope                NA
-----
IPython             8.13.2
jupyter_client      8.2.0
jupyter_core        5.3.0
jupyterlab          3.6.3
notebook            6.5.4
-----
Python 3.9.15 | packaged by conda-forge | (main, Nov 22 2022, 08:45:29) [GCC 10.4.0]
Linux-5.15.0-71-generic-x86_64-with-glibc2.31
-----
Session information updated at 2023-06-14 10:31

flying-sheep · 2023-06-15T10:15:56Z

Duplicate of #673

PR exists: #999

alam-shahul · 2023-06-26T22:29:01Z

Wait, so why won't this be fixed?

flying-sheep · 2023-06-27T11:16:11Z

That’s not what that means. Hover your mouse over “not planned” and you’ll see:

Since this issue report is a duplicate, issue #865 (this thread) will receive no further attention, but the issue this duplicates (#673) will.

github-actions bot added the stale label Jun 14, 2023

github-actions bot removed the stale label Jun 15, 2023

flying-sheep closed this as not planned (won't fix, can't repro, duplicate, stale) Jun 15, 2023
