
Review db connections in API and background tasks #649

Closed
tcompa opened this issue Apr 27, 2023 · 5 comments · Fixed by #651
Labels
High Priority Current Priorities & Blocking Issues

Comments

@tcompa
Collaborator

tcompa commented Apr 27, 2023

Ref #647

  1. Why does the apply-workflow endpoint use both a sync and an async session? Can we simplify it? Do we need an additional close after refresh? If we keep two sessions, both must be closed (and the sync one is currently not closed!).
  2. In submit_workflow (which runs in the background), the db_sync.commit statement before the long workflow execution does not fully close the session; this is visible from the presence of a non-trivial identity map. Adding an explicit db_sync.close seems to work, and the session is then re-opened upon merge. Shall we move towards context managers? That is probably the best way to go.

Point 1 is a bug. Point 2 is less clear, but it certainly requires some update.
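As a rough illustration of the context-manager direction mentioned in point 2, here is a minimal sketch (not fractal-server code; the `Task` model and engine setup are simplified stand-ins) showing that a `with Session(...)` block guarantees `close()` and empties the identity map that a bare `commit()` leaves populated:

```python
# Hypothetical sketch, assuming plain SQLAlchemy 1.4+ with an in-memory
# sqlite engine; the Task model below is a simplified stand-in, not the
# fractal_server one.
from sqlalchemy import Column, Integer, String, create_engine
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()


class Task(Base):
    __tablename__ = "task"
    id = Column(Integer, primary_key=True)
    name = Column(String)


engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

with Session(engine) as session:
    session.add(Task(name="mytask"))
    session.commit()
    task = session.get(Task, 1)   # loads the row into the identity map
    assert session.identity_map   # commit() alone did not clear it
# exiting the with-block calls session.close() automatically
assert not session.identity_map
```

The same guarantee holds if an exception is raised inside the block, which is exactly the failure mode that leaves sessions dangling when `close()` is called manually.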

@tcompa tcompa added the High Priority Current Priorities & Blocking Issues label Apr 27, 2023
@tcompa
Collaborator Author

tcompa commented Apr 28, 2023

For the record: the following test passes, which (unexpectedly) shows that new_db_sync.commit() alone is not enough to fully clear a DB connection.

async def test_sync_db_close(db_sync, db):
    """
    This test is used as an example to show that the commit() method is not
    enough to fully close a db_sync connection.
    """

    from devtools import debug  # `debug` is not a builtin; it comes from devtools
    from fractal_server.app.models.task import Task
    from fractal_server.app.db import DB

    db.add(
        Task(
            name="mytask",
            input_type="image",
            output_type="zarr",
            command="cmd",
            source="/source",
        )
    )
    await db.commit()
    await db.close()

    new_db_sync = next(DB.get_sync_db())
    task = new_db_sync.get(Task, 1)
    debug(task)
    task.name = "new_name"
    new_db_sync.merge(task)
    new_db_sync.commit()
    debug(new_db_sync.identity_map)
    assert new_db_sync.identity_map
    new_db_sync.close()
    debug(new_db_sync.identity_map)
    assert not new_db_sync.identity_map

@tcompa
Collaborator Author

tcompa commented Apr 28, 2023

Also, for the record:

Context manager use is optional; otherwise, the returned Session object may be closed explicitly via the Session.close() method. Using a try:/finally: block is optional, however will ensure that the close takes place even if there are database errors:
(https://docs.sqlalchemy.org/en/14/orm/session_api.html#sqlalchemy.orm.sessionmaker)
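The try/finally pattern from the quoted SQLAlchemy docs looks roughly like this (a minimal sketch with an assumed in-memory sqlite engine, not fractal-server code):

```python
# Hypothetical sketch of the try/finally close pattern quoted above,
# using an in-memory sqlite engine for illustration.
from sqlalchemy import create_engine, text
from sqlalchemy.orm import sessionmaker

engine = create_engine("sqlite://")
SessionLocal = sessionmaker(bind=engine)

session = SessionLocal()
try:
    result = session.execute(text("SELECT 1")).scalar()
    session.commit()
finally:
    session.close()  # runs even if the statements above raise

assert result == 1
```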

@tcompa tcompa changed the title Placeholder: review db connections Review db connections in API and background tasks Apr 28, 2023
@tcompa tcompa linked a pull request Apr 28, 2023 that will close this issue
tcompa added a commit that referenced this issue Apr 28, 2023
In the new version, `submit_workflow` takes IDs as arguments, and not
ORM objects. The objects are then fetched from the db inside the
function, which makes their use more transparent.
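The IDs-as-arguments pattern described in the commit message can be sketched as follows (names and models are illustrative, not fractal-server's): the background function receives a plain ID and opens a short-lived session of its own to fetch the ORM object, instead of receiving an object bound to the request's session.

```python
# Hypothetical sketch of "pass IDs, fetch objects inside the function".
# Workflow and submit_workflow are illustrative stand-ins.
from sqlalchemy import Column, Integer, String, create_engine
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()


class Workflow(Base):
    __tablename__ = "workflow"
    id = Column(Integer, primary_key=True)
    name = Column(String)


engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

with Session(engine) as db:
    db.add(Workflow(name="wf"))
    db.commit()


def submit_workflow(workflow_id: int) -> str:
    # Fetch the ORM object in a session this function owns, so its
    # lifecycle is independent of the request that scheduled the task.
    with Session(engine) as db:
        workflow = db.get(Workflow, workflow_id)
        return workflow.name


assert submit_workflow(1) == "wf"
```

This avoids sharing session-bound instances across the request/background-task boundary, which is one way sessions end up neither owned nor closed by anyone.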
@tcompa
Collaborator Author

tcompa commented Apr 28, 2023

Ref (related to "ghost" connections that seem to always be present):

@tcompa tcompa reopened this Apr 28, 2023
@tcompa
Collaborator Author

tcompa commented Apr 28, 2023

Recap of current status:

  1. With Review DB connections #651, we (me and @mfranzon) identified and solved a few issues/bugs in handling db sessions, which were the most likely cause of Database locked after unknown issue #647. Some of these updates can be re-written as in Add finally block to close db session, at the end of get_db of get_sync_db #655. This is all independent of the sqlite/postgres choice.
  2. As of version 1.2.3, the results of the stress test discussed in Database locked after unknown issue #647 have changed:
    • With sqlite, we can still hit a "database locked" error (either with uvicorn or gunicorn), even though it requires a rarer situation (defined in terms of how many simultaneous calls are made to the apply-workflow endpoint). Likely related to Review use of threadsafety in sqlite #656.
    • With postgres, we are unable to break the running fractal-server instance (either with uvicorn or gunicorn). The test we ran includes up to 6 or 7 "users" (the same one, but from different terminals), each making tens of calls to the apply-workflow endpoint.
  3. We reviewed the option of using sqlalchemy scoped_sessions and discarded it based on how fastapi works - see the discussion in SQLAlchemy Dependency vs. Middleware vs. scoped_session fastapi/fastapi#726.
  4. It remains to be explored whether we can solve the sqlite/threading issue (as in Review use of threadsafety in sqlite #656).
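The finally-block fix mentioned in point 1 has the shape of a generator dependency, roughly as follows (a minimal sketch with illustrative names and an assumed in-memory sqlite engine; FastAPI would consume this as a dependency, but it is shown here as plain Python):

```python
# Hypothetical sketch of a get_db-style dependency whose finally block
# guarantees session.close(). Names are illustrative.
from sqlalchemy import create_engine, text
from sqlalchemy.orm import sessionmaker

engine = create_engine("sqlite://")
SessionLocal = sessionmaker(bind=engine)


def get_sync_db():
    session = SessionLocal()
    try:
        yield session
    finally:
        session.close()  # runs even if the request handler raises


gen = get_sync_db()
db = next(gen)
value = db.execute(text("SELECT 1")).scalar()
gen.close()  # exhausts the generator, triggering the finally clause
assert value == 1
```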

Nothing left to do in this issue, closing.

@jluethi
Collaborator

jluethi commented Apr 28, 2023

Thanks for the great overview & the deep dive into this @tcompa ! Very good to hear and additional motivation for me to try & get the postgres setup running at FMI :)
