Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enforce context usage of Container within migrator #6741

Draft
wants to merge 9 commits into
base: main
Choose a base branch
from

Conversation

agoscinski
Copy link
Contributor

@agoscinski agoscinski commented Jan 31, 2025

As mentioned in issue #6739, this is to enforce the correct usage of the container so all acquired resources are correctly released.

TODO Need to test with py313 if open file descriptors are closed works

This PR goes together with aiidateam/disk-objectstore#179

@agoscinski agoscinski force-pushed the fix-container-handling branch from 701e658 to f3f41c1 Compare February 2, 2025 08:23
@agoscinski agoscinski force-pushed the fix-container-handling branch from 2a9ce37 to eb6e1b4 Compare February 2, 2025 08:50
@agoscinski agoscinski force-pushed the fix-container-handling branch from 9b82abd to d4f15b1 Compare February 2, 2025 09:16
@unkcpz
Copy link
Member

unkcpz commented Feb 2, 2025

Interesting, it seems using context manager to close the dos container is a footgun design, in principle the resource is only needed when doing real file system operations.
I don't have such problem in the rust implementation, the container there is simply a link to the folder ;) https://github.com/unkcpz/rsdos

@agoscinski agoscinski force-pushed the fix-container-handling branch from 3d2de49 to 978951e Compare February 2, 2025 09:36
Copy link

codecov bot commented Feb 2, 2025

Codecov Report

Attention: Patch coverage is 72.97297% with 10 lines in your changes missing coverage. Please review.

Project coverage is 21.23%. Comparing base (208d6a9) to head (17ef54f).
Report is 1 commits behind head on main.

Files with missing lines Patch % Lines
src/aiida/storage/psql_dos/migrator.py 75.00% 5 Missing ⚠️
src/aiida/storage/sqlite_dos/backend.py 64.29% 5 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff             @@
##             main    #6741       +/-   ##
===========================================
- Coverage   78.09%   21.23%   -56.86%     
===========================================
  Files         564      564               
  Lines       42544    42558       +14     
===========================================
- Hits        33219     9031    -24188     
- Misses       9325    33527    +24202     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@agoscinski
Copy link
Contributor Author

agoscinski commented Feb 2, 2025

Interesting, it seems using context manager to close the dos container is a footgun design, in principle the resource is only needed when doing real file system operations.

I don't fully understand. My impression is that current code creating engines and sessions by the various initialization of Container is bad design. This results in connections being never explicitly closed. Somehow until Python 3.13 the garbage collection cleaned that up but in the Python 3.13 PR #6600 I had to deal with 5000 file descriptor that are unclosed in the tests resulting in the ssh connection to fail because it cannot open another one. One could keep reusing one session by using one single instance of container, but I wanted to first understand if this would work. Also the it is not clear to me how the Container is supposed to be used. Again public function that just open sessions without keeping track of them makes the usage unclear to me.

@unkcpz
Copy link
Member

unkcpz commented Feb 2, 2025

My impression is that current code creating engines and sessions by the various initialization of Container is bad design. This results in connections being never explicitly closed.

If you look at the __exit__ of disk-objectstore container, it close the DB session but the open part is not in the __enter__. The initialization is however fine I think since it just finds the folder and create some files inside. The problem is for the loose storage, there is no need to have the DB session. The DB session is for packs files only if I understand the design correctly but the container has to be used with the SQLite session open all the way.

Again public function that just open sessions without keeping track of them makes the usage unclear to me.

Yes, for the moment, I think just make sure the resource is managed correctly. I'll see if I can abstract the disk-objectstore layer so it can easily attach with other design, even just for test purpose. Anyway, pin me to review this one when it is ready.

@agoscinski
Copy link
Contributor Author

If you look at the exit of disk-objectstore container, it close the DB session but the open part is not in the enter. The initialization is however fine I think since it just finds the folder and create some files inside. The problem is for the loose storage, there is no need to have the DB session. The DB session is for packs files only if I understand the design correctly but the container has to be used with the SQLite session open all the way.

We could put the context into the functions of the Container where the resources are acquired which seems to be only at diskobject_store.Container.init_container. Then we put also the diskobject_store.database.get_session into that function diskobject_store.Container.init_container. Then we only need a context when handling the db resources. And we remove the usage of the db as you mention

I'll see if I can abstract the disk-objectstore layer so it can easily attach with other design, even just for test purpose.

in a future PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants