Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix garbage collection #1119

Merged
merged 40 commits into from
Jul 22, 2024
Merged

Fix garbage collection #1119

merged 40 commits into from
Jul 22, 2024

Conversation

ivirshup
Copy link
Member

@ivirshup ivirshup commented Sep 4, 2023

Still needs some tests.

Demo

Set up
import anndata as ad
import numpy as np

ANNDATA_FILENAME = 'test.h5ad'

X = np.ones((10_000, 10_000))
ad.AnnData(
    X,
    layers={"X_again": X}
).write_h5ad(ANNDATA_FILENAME, compression="lzf")
del X
Benchmarking script
import anndata as ad
import numpy as np
import os
import tracemalloc

ANNDATA_FILENAME = 'test.h5ad'
RUNS = 10

def display_top(snapshot, key_type='lineno'):
    snapshot = snapshot.filter_traces((
        tracemalloc.Filter(False, "<frozen importlib._bootstrap>"),
        tracemalloc.Filter(False, "<unknown>"),
    ))
    top_stats = snapshot.statistics(key_type)
    total = sum(stat.size for stat in top_stats)
    return total

def trace_function(func, arg, n):
    total = np.zeros(n)
    data = func(arg).copy()
    tracemalloc.start()
    for i in range(n):
        data = func(arg).copy()
        snapshot = tracemalloc.take_snapshot()
        total[i] = display_top(snapshot)
    tracemalloc.stop()
    return total

def func(pth):
    data = ad.read_h5ad(pth)
    return data[::2].copy()

total = trace_function(func, ANNDATA_FILENAME, RUNS)
total

Memory usage on main:

array([3.20191228e+09, 4.00357741e+09, 4.80520541e+09, 5.60682786e+09,
       4.00358264e+09, 4.80520492e+09, 5.60682752e+09, 6.40845017e+09,
       7.21007283e+09, 8.01169533e+09])

Memory usage on this branch:

array([2.40190353e+09, 2.40195236e+09, 2.40196104e+09, 2.40196308e+09,
       2.40196497e+09, 2.40196565e+09, 2.40196656e+09, 2.40196712e+09,
       2.40196768e+09, 2.40196820e+09])

@codecov
Copy link

codecov bot commented Sep 4, 2023

Codecov Report

Attention: Patch coverage is 96.17834% with 6 lines in your changes missing coverage. Please review.

Project coverage is 83.92%. Comparing base (b2bdd7f) to head (e138fc9).
Report is 75 commits behind head on main.

Files with missing lines Patch % Lines
src/anndata/_core/aligned_mapping.py 96.22% 4 Missing ⚠️
src/anndata/_core/raw.py 93.75% 1 Missing ⚠️
...anndata/experimental/multi_files/_anncollection.py 0.00% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1119      +/-   ##
==========================================
- Coverage   85.95%   83.92%   -2.04%     
==========================================
  Files          36       36              
  Lines        5747     5823      +76     
==========================================
- Hits         4940     4887      -53     
- Misses        807      936     +129     
Flag Coverage Δ
gpu-tests ?

Flags with carried forward coverage won't be shown. Click here to find out more.

Files with missing lines Coverage Δ
src/anndata/_core/anndata.py 83.63% <100.00%> (-1.21%) ⬇️
src/anndata/_core/file_backing.py 91.26% <100.00%> (+1.04%) ⬆️
src/anndata/_io/write.py 76.74% <ø> (ø)
src/anndata/tests/helpers.py 81.73% <100.00%> (-4.50%) ⬇️
src/anndata/_core/raw.py 76.97% <93.75%> (-6.60%) ⬇️
...anndata/experimental/multi_files/_anncollection.py 70.68% <0.00%> (ø)
src/anndata/_core/aligned_mapping.py 93.36% <96.22%> (+1.46%) ⬆️

... and 9 files with indirect coverage changes

Copy link

scverse-benchmark bot commented Jun 3, 2024

Benchmark changes

Change Before [b2bdd7f] After [e138fc9] Ratio Benchmark (Parameter)
- 1.31359e+07 1.70902e+06 0.13 anndata.GarbargeCollectionSuite.track_peakmem_garbage_collection
+ 1.20863 1.51659 1.25 readwrite.H5ADReadSuite.track_read_full_memratio('pbmc3k')

Comparison: https://github.com/scverse/anndata/compare/b2bdd7f926d54c9ae7a1b56a4e97d37e6e4d1dad..e138fc99fa4d5ab7e9351a45b905e13a4fc2415f
Last changed: Mon, 22 Jul 2024 12:07:58 +0000

More details: https://github.com/scverse/anndata/pull/1119/checks?check_run_id=27745934862

@ilan-gold
Copy link
Contributor

Two big outstanding problems:

  1. The following sort of thing no longer works
AnnData.layers
  1. i/o memory has spiked according to benchmarking

@ilan-gold
Copy link
Contributor

Looking at this further: https://github.com/scverse/anndata/pull/1119/checks?check_run_id=25779040354 there is actually no improvement over main. So I am not sure what's up with that.

class GarbargeCollectionSuite:
runs = 10

# https://github.com/pythonprofilers/memory_profiler/issues/402 and other backend does not pick this up
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok @ivirshup @flying-sheep independent of this issue, were you aware that https://pypi.org/project/memory-profiler/ is a line-by-line profiler? I think this is probably not very good for us, no? When I read get_peakmem (i.e., the function in utils we wrote based on memory_profiler), I would think that it would really track the literal peak memory, and not the peak over individual line operations.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ilan-gold
Copy link
Contributor

@flying-sheep I would be ready to go on this

@flying-sheep flying-sheep modified the milestones: 0.10.8, 0.10.9 Jun 18, 2024
@flying-sheep
Copy link
Member

I would still like the things in here addressed: #1119 (review)

I fixed everything that I didn’t talk about there, but

  1. I’m still unclear on why the changes help / what they’re supposed to do
  2. The double indirection with its “only pass by reference“ structure seems fragile and possibly too complex for me. If someone can answer 1., the complexity argument could go away, but it’s still fragile.

@ilan-gold
Copy link
Contributor

TODO: try using a weakref within the axis arrays

@flying-sheep
Copy link
Member

Hm, maybe this is just a complex problem and can’t be simplified much more without making other parts more complex.

The issue here is the complexity of the paths that data takes:

  1. when initializing a new AnnData object, AlignedMappingProperty.__set__ creates the AnnData._{name} attribute as a simple dict. __get__ then initializes the ephemeral AlignedMapping & *Actual¹ objects with it.

here my argument applies: shouldn’t there be a source of truth where that dict is the canonical data container and only one reference to it is stored?

  1. when copying an AnnData object, we currently .copy() the ephemeral AlignedMapping & *Actual object, temporarily making it the data holder, and only then we assign the attribute on the new AnnData object in AlignedMappingProperty.__set__, which works, but extends the responsibility for the AxisArrays beyond “ephemeral API wrapper around dict stored in AnnData object.

¹Classes like Layers, … that have a _data attribute they share with an AnnData object.

@flying-sheep flying-sheep merged commit d51f84c into scverse:main Jul 22, 2024
15 of 16 checks passed
flying-sheep pushed a commit that referenced this pull request Jul 22, 2024
@scverse scverse deleted a comment from lumberbot-app bot Jul 22, 2024
flying-sheep added a commit that referenced this pull request Jul 22, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Anndata not properly garbage collected
3 participants