Fix bug where persisting some events fails after unclean shutdown. #18137

Merged
erikjohnston merged 4 commits into release-v1.124 from erikj/check_restart
Feb 5, 2025

Conversation

erikjohnston
Member

Introduced in #18107. After an unclean shutdown, persisting some events fails with:

UniqueViolation: duplicate key value violates unique constraint "state_groups_persisting_pkey"

Stack trace
UniqueViolation: duplicate key value violates unique constraint "state_groups_persisting_pkey"
DETAIL:  Key (state_group, instance_name)=(975327903, event_persister-2) already exists.

  File "twisted/internet/defer.py", line 2010, in _inlineCallbacks
    result = context.run(
  File "twisted/python/failure.py", line 549, in throwExceptionIntoGenerator
    return g.throw(self.value.with_traceback(self.tb))
  File "synapse/util/caches/response_cache.py", line 265, in cb
    return await callback(*args, **kwargs)
  File "synapse/replication/http/send_events.py", line 164, in _handle_request
    await self.event_creation_handler.persist_and_notify_client_events(
  File "synapse/handlers/message.py", line 1911, in persist_and_notify_client_events
    ) = await self._storage_controllers.persistence.persist_events(
  File "synapse/logging/opentracing.py", line 922, in _wrapper
    return await func(*args, **kwargs)
  File "synapse/storage/controllers/persist_events.py", line 428, in persist_events
    ret_vals = await yieldable_gather_results(enqueue, partitioned.items())
  File "synapse/util/async_helpers.py", line 306, in yieldable_gather_results
    raise dfe.subFailure.value from None
  File "twisted/internet/defer.py", line 2010, in _inlineCallbacks
    result = context.run(
  File "twisted/python/failure.py", line 549, in throwExceptionIntoGenerator
    return g.throw(self.value.with_traceback(self.tb))
  File "synapse/storage/controllers/persist_events.py", line 423, in enqueue
    return await self._event_persist_queue.add_to_queue(
  File "synapse/storage/controllers/persist_events.py", line 245, in add_to_queue
    res = await make_deferred_yieldable(end_item.deferred.observe())
  File "synapse/storage/controllers/persist_events.py", line 288, in handle_queue_loop
    ret = await self._per_item_callback(room_id, item.task)
  File "synapse/storage/controllers/persist_events.py", line 369, in _process_event_persist_queue_task
    return await self._persist_event_batch(room_id, task)
  File "synapse/storage/controllers/persist_events.py", line 643, in _persist_event_batch
    async with self._state_deletion_store.persisting_state_group_references(
  File "contextlib.py", line 204, in __aenter__
    return await anext(self.gen)
  File "synapse/storage/databases/state/deletion.py", line 228, in persisting_state_group_references
    await self.db_pool.runInteraction(
  File "synapse/storage/database.py", line 952, in runInteraction
    return await delay_cancellation(_runInteraction())
  File "twisted/internet/defer.py", line 2010, in _inlineCallbacks
    result = context.run(
  File "twisted/python/failure.py", line 549, in throwExceptionIntoGenerator
    return g.throw(self.value.with_traceback(self.tb))
  File "synapse/storage/database.py", line 918, in _runInteraction
    result: R = await self.runWithConnection(
  File "synapse/storage/database.py", line 1047, in runWithConnection
    return await make_deferred_yieldable(
  File "twisted/python/threadpool.py", line 269, in inContext
    result = inContext.theWork()  # type: ignore[attr-defined]
  File "twisted/python/threadpool.py", line 285, in <lambda>
    inContext.theWork = lambda: context.call(  # type: ignore[attr-defined]
  File "twisted/python/context.py", line 117, in callWithContext
    return self.currentContext().callWithContext(ctx, func, *args, **kw)
  File "twisted/python/context.py", line 82, in callWithContext
    return func(*args, **kw)
  File "twisted/enterprise/adbapi.py", line 282, in _runWithConnection
    result = func(conn, *args, **kw)
  File "synapse/storage/database.py", line 1040, in inner_func
    return func(db_conn, *args, **kwargs)
  File "synapse/storage/database.py", line 780, in new_transaction
    r = func(cursor, *args, **kwargs)
  File "synapse/storage/databases/state/deletion.py", line 258, in _mark_state_groups_as_persisting_txn
    self.db_pool.simple_insert_many_txn(
  File "synapse/storage/database.py", line 1194, in simple_insert_many_txn
    txn.execute_values(sql, values, fetch=False)
  File "synapse/storage/database.py", line 415, in execute_values
    return self._do_execute(
  File "synapse/storage/database.py", line 488, in _do_execute
    return func(sql, *args, **kwargs)
  File "synapse/storage/database.py", line 418, in <lambda>
    lambda the_sql, the_values: execute_values(
  File "psycopg2/extras.py", line 1299, in execute_values
    cur.execute(b''.join(parts))

@erikjohnston erikjohnston marked this pull request as ready for review February 5, 2025 15:54
@erikjohnston erikjohnston requested a review from a team as a code owner February 5, 2025 15:54
synapse/storage/databases/state/deletion.py (outdated)
@@ -95,8 +95,18 @@ def __init__(
self.db_pool = database
self._instance_name = hs.get_instance_name()

# TODO: Clear from `state_groups_persisting` any holdovers from previous
# running instance.
with db_conn.cursor(txn_name="resolve_sliding_sync") as txn:
Member

The txn_name should probably be something more aligned with state deletion.

Member Author

if you insist
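
For readers following along, here is a minimal sketch of the failure mode in the stack trace and of the startup cleanup that the TODO in the diff describes. It is not the code from this PR: it uses `sqlite3` as a stand-in for PostgreSQL, the table layout is an assumption inferred from the key named in the error detail (`state_group`, `instance_name`), and the helper names `mark_persisting` and `clear_holdovers` are hypothetical.

```python
import sqlite3

# Instance name and state group are taken from the error detail in the trace.
INSTANCE_NAME = "event_persister-2"
STATE_GROUP = 975327903


def create_schema(conn: sqlite3.Connection) -> None:
    # Assumed layout: only the two primary key columns named in the error.
    conn.execute(
        """
        CREATE TABLE state_groups_persisting (
            state_group INTEGER NOT NULL,
            instance_name TEXT NOT NULL,
            PRIMARY KEY (state_group, instance_name)
        )
        """
    )


def mark_persisting(conn, state_groups, instance_name) -> None:
    # Hypothetical helper: record that this instance is persisting events
    # that reference the given state groups.
    with conn:
        conn.executemany(
            "INSERT INTO state_groups_persisting (state_group, instance_name)"
            " VALUES (?, ?)",
            [(sg, instance_name) for sg in state_groups],
        )


def clear_holdovers(conn, instance_name) -> int:
    # Hypothetical startup step: drop rows left behind by a previous run of
    # the same instance. Rows owned by other instances are not touched.
    with conn:
        cur = conn.execute(
            "DELETE FROM state_groups_persisting WHERE instance_name = ?",
            (instance_name,),
        )
    return cur.rowcount


conn = sqlite3.connect(":memory:")
create_schema(conn)

# Simulate an unclean shutdown: the row is inserted but never removed.
mark_persisting(conn, [STATE_GROUP], INSTANCE_NAME)
# A different worker holding the same state group is a distinct key.
mark_persisting(conn, [STATE_GROUP], "event_persister-1")

# After "restart", inserting the same key again violates the primary key.
try:
    mark_persisting(conn, [STATE_GROUP], INSTANCE_NAME)
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# Clearing this instance's holdovers first lets the insert succeed.
removed = clear_holdovers(conn, INSTANCE_NAME)
mark_persisting(conn, [STATE_GROUP], INSTANCE_NAME)
remaining = conn.execute(
    "SELECT COUNT(*) FROM state_groups_persisting"
).fetchone()[0]
```

Scoping the delete to the restarting instance's own name is the design point of the sketch: stale rows can only belong to the instance that crashed, so other workers' in-flight references stay protected.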

@erikjohnston erikjohnston requested a review from devonh February 5, 2025 15:56
@erikjohnston erikjohnston changed the base branch from develop to release-v1.124 February 5, 2025 16:11
@erikjohnston erikjohnston merged commit 3391da3 into release-v1.124 Feb 5, 2025
37 of 39 checks passed
@erikjohnston erikjohnston deleted the erikj/check_restart branch February 5, 2025 16:26