anndata write or anndata read seem to be losing uns["log1p"]["base"] when value for key base is None #865

Closed
pcm32 opened this issue Nov 30, 2022 · 6 comments

Comments

@pcm32

pcm32 commented Nov 30, 2022

I have noticed in Scanpy that after setting adata.uns["log1p"]["base"] = None, writing the object to disk, and reading it back, base is no longer a key in adata.uns["log1p"]. This affects a number of downstream Scanpy methods whenever the object is written to disk mid-analysis and read back, since parts of Scanpy check:

if 'log1p' in adata.uns_keys() and adata.uns['log1p']['base'] is not None:

in various places. Maybe the underlying problem is that sc.pp.log1p(adata) does not record base as math.e in uns. I wonder whether the write/read process might also be pruning other keys that have None values?
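To get a feel for which other entries might be affected, one could list every key path in uns whose value is None before writing. This is only a sketch: it assumes that None values are what get dropped (as reported above), it uses a plain nested dict as a stand-in for adata.uns, and find_none_entries is a hypothetical helper name, not part of anndata or Scanpy.

```python
from collections.abc import Mapping


def find_none_entries(uns, prefix=()):
    """Return the key paths (as tuples) whose value is None in a nested mapping."""
    paths = []
    for key, value in uns.items():
        path = prefix + (key,)
        if value is None:
            paths.append(path)
        elif isinstance(value, Mapping):
            # Recurse into nested dict-like entries such as uns["log1p"].
            paths.extend(find_none_entries(value, path))
    return paths


# Plain dict standing in for adata.uns (assumption for illustration).
uns = {"log1p": {"base": None}, "pca": {"params": {"zero_center": True}}}
print(find_none_entries(uns))  # [('log1p', 'base')]
```

On a real object this would be called as find_none_entries(adata.uns), assuming uns behaves like a nested mapping.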

@brainfo

brainfo commented Dec 19, 2022

I have noticed this problem too, and I agree it should be fixed, so that users do not need an extra explicit step to avoid errors when log1p is found in uns but has no base key in it.

@github-actions

This issue has been automatically marked as stale because it has not had recent activity.
Please add a comment if you want to keep the issue open. Thank you for your contributions!

@github-actions github-actions bot added the stale label Jun 14, 2023
@fbnrst
Contributor

fbnrst commented Jun 14, 2023

I'm still affected by this issue. Without a setting for base, rank_genes_groups will not work.
Steps to reproduce:

import scanpy as sc

adata = sc.datasets.blobs()
sc.pp.log1p(adata)
adata.uns['log1p']

Output:

{'base': None}

Now, writing adata to file and reading from file:

adata.write('adata.h5ad')
adata = sc.read('adata.h5ad')
adata.uns['log1p']

Output:

{}

So, the entry for base is gone. Now,

sc.tl.rank_genes_groups(adata, 'blobs')

throws this error:

WARNING: Default of the method has been changed to 't-test' from 't-test_overestim_var'

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
Cell In[16], line 1
----> 1 sc.tl.rank_genes_groups(adata, 'blobs')

File /opt/conda/lib/python3.9/site-packages/scanpy/tools/_rank_genes_groups.py:590, in rank_genes_groups(adata, groupby, use_raw, groups, reference, n_genes, rankby_abs, pts, key_added, copy, method, corr_method, tie_correct, layer, **kwds)
    580 adata.uns[key_added] = {}
    581 adata.uns[key_added]['params'] = dict(
    582     groupby=groupby,
    583     reference=reference,
   (...)
    587     corr_method=corr_method,
    588 )
--> 590 test_obj = _RankGenes(adata, groups_order, groupby, reference, use_raw, layer, pts)
    592 if check_nonnegative_integers(test_obj.X) and method != 'logreg':
    593     logg.warning(
    594         "It seems you use rank_genes_groups on the raw count data. "
    595         "Please logarithmize your data before calling rank_genes_groups."
    596     )

File /opt/conda/lib/python3.9/site-packages/scanpy/tools/_rank_genes_groups.py:93, in _RankGenes.__init__(self, adata, groups, groupby, reference, use_raw, layer, comp_pts)
     82 def __init__(
     83     self,
     84     adata,
   (...)
     90     comp_pts=False,
     91 ):
---> 93     if 'log1p' in adata.uns_keys() and adata.uns['log1p']['base'] is not None:
     94         self.expm1_func = lambda x: np.expm1(x * np.log(adata.uns['log1p']['base']))
     95     else:

KeyError: 'base'
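Until this is fixed upstream, a possible workaround is to put the missing key back after reading, or to look the value up defensively. The sketch below assumes that adata.uns and adata.uns['log1p'] behave like ordinary dicts; restore_log1p_base and get_log1p_base are hypothetical helper names, not Scanpy or anndata functions, and plain dicts stand in for uns here.

```python
def restore_log1p_base(uns):
    """Re-add 'base' (as None) if 'log1p' is present but lost its 'base' key."""
    if "log1p" in uns:
        # setdefault leaves an existing base (e.g. 2 or 10) untouched.
        uns["log1p"].setdefault("base", None)
    return uns


def get_log1p_base(uns):
    """Look up the log base without raising KeyError when keys are missing."""
    return uns.get("log1p", {}).get("base")


# Plain dict mimicking adata.uns after the write/read round trip shown above.
uns_after_read = {"log1p": {}}
restore_log1p_base(uns_after_read)
print(uns_after_read)  # {'log1p': {'base': None}}
```

With a real object, calling restore_log1p_base(adata.uns) right after sc.read(...) should make the check on line 93 of the traceback find the key again, under the assumptions stated above.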

Output of sc.logging.print_versions():

-----
anndata     0.9.1
scanpy      1.9.3
-----
PIL                 9.2.0
anyio               NA
arrow               1.2.3
asciitree           NA
asttokens           NA
astunparse          1.6.3
attr                23.1.0
babel               2.12.1
backcall            0.2.0
bottleneck          1.3.7
brotli              NA
certifi             2023.05.07
cffi                1.15.1
chardet             5.1.0
charset_normalizer  2.1.1
cloudpickle         2.2.1
colorama            0.4.6
comm                0.1.3
cycler              0.10.0
cython_runtime      NA
cytoolz             0.12.0
dask                2023.4.1
dateutil            2.8.2
debugpy             1.6.7
decorator           5.1.1
defusedxml          0.7.1
dill                0.3.6
entrypoints         0.4
executing           1.2.0
fasteners           0.17.3
fastjsonschema      NA
fqdn                NA
gmpy2               2.1.2
google              NA
h5py                3.8.0
idna                3.4
igraph              0.10.4
importlib_resources NA
ipykernel           6.23.0
ipython_genutils    0.2.0
isoduration         NA
jedi                0.18.2
jinja2              3.0.3
joblib              1.2.0
json5               NA
jsonpointer         2.0
jsonschema          4.17.3
jupyter_events      0.6.3
jupyter_server      2.5.0
jupyterlab_server   2.22.1
kiwisolver          1.4.4
leidenalg           0.9.1
llvmlite            0.39.1
louvain             0.8.0
lz4                 4.3.2
markupsafe          2.1.2
matplotlib          3.7.1
mpl_toolkits        NA
mpmath              1.3.0
msgpack             1.0.5
natsort             8.3.1
nbformat            5.8.0
numba               0.56.4
numcodecs           0.11.0
numexpr             2.7.3
numpy               1.23.5
opt_einsum          v3.3.0
packaging           23.1
pandas              1.5.0
parso               0.8.3
pexpect             4.8.0
pickleshare         0.7.5
pkg_resources       NA
platformdirs        3.5.0
plotly              5.14.1
prometheus_client   NA
prompt_toolkit      3.0.38
psutil              5.9.5
ptyprocess          0.7.0
pure_eval           0.2.2
pvectorc            NA
pyarrow             10.0.1
pydev_ipython       NA
pydevconsole        NA
pydevd              2.9.5
pydevd_file_utils   NA
pydevd_plugins      NA
pydevd_tracing      NA
pygments            2.15.1
pyparsing           3.0.9
pyrsistent          NA
pythonjsonlogger    NA
pytz                2023.3
requests            2.29.0
rfc3339_validator   0.1.4
rfc3986_validator   0.1.1
scipy               1.10.1
send2trash          NA
session_info        1.0.0
setuptools          67.7.2
six                 1.16.0
sklearn             1.2.2
sniffio             1.3.0
socks               1.7.1
sparse              0.14.0
sphinxcontrib       NA
stack_data          0.6.2
sympy               1.11.1
tblib               1.7.0
texttable           1.6.7
threadpoolctl       3.1.0
tlz                 0.12.0
toolz               0.12.0
torch               2.0.0
tornado             6.3
tqdm                4.65.0
traitlets           5.9.0
typing_extensions   NA
unicodedata2        NA
uri_template        NA
urllib3             1.26.15
wcwidth             0.2.6
webcolors           1.13
websocket           1.5.1
yaml                5.4.1
zarr                2.14.2
zipp                NA
zmq                 25.0.2
zoneinfo            NA
zope                NA
-----
IPython             8.13.2
jupyter_client      8.2.0
jupyter_core        5.3.0
jupyterlab          3.6.3
notebook            6.5.4
-----
Python 3.9.15 | packaged by conda-forge | (main, Nov 22 2022, 08:45:29) [GCC 10.4.0]
Linux-5.15.0-71-generic-x86_64-with-glibc2.31
-----
Session information updated at 2023-06-14 10:31

@github-actions github-actions bot removed the stale label Jun 15, 2023
@flying-sheep
Member

Duplicate of #673

PR exists: #999

@flying-sheep flying-sheep closed this as not planned (Won't fix, can't repro, duplicate, stale) Jun 15, 2023
@alam-shahul

Wait, so why won't this be fixed?

@flying-sheep
Member

flying-sheep commented Jun 27, 2023

That’s not what that means. Hover your mouse over “not planned” and you’ll see:

“Won't fix, can't repro, duplicate, stale”

Since this issue report is a duplicate, issue #865 (this thread) will receive no further attention, but the issue this duplicates (#673) will.
