Skip to content
This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

Bail during start-up if there are old background updates. #16397

Open
wants to merge 8 commits into
base: develop
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from 7 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions changelog.d/16397.bugfix
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
Enforce that old background updates have run when starting Synapse.
31 changes: 24 additions & 7 deletions docs/development/database_schema.md
Original file line number Diff line number Diff line change
Expand Up @@ -25,20 +25,37 @@ updated. They work as follows:

* The Synapse codebase defines a constant `synapse.storage.schema.SCHEMA_VERSION`
which represents the expectations made about the database by that version. For
example, as of Synapse v1.36, this is `59`.
example, as of Synapse v1.36, this is `59`. This version should be incremented
whenever a backwards-incompatible change is made to the database format (normally
via a `delta` file)

* The Synapse codebase defines a constant `synapse.storage.schema.SCHEMA_COMPAT_VERSION`
which represents the minimum database versions the current code supports.
Whenever the Synapse code is updated to assume backwards-incompatible changes
made to the database format, `synapse.storage.schema.SCHEMA_COMPAT_VERSION` is also updated
so that administrators can not accidentally roll back to a too-old version of Synapse.
Comment on lines +32 to +36
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that defining this before db.schema_compat_version is a bit backwards.


* The database stores a "compatibility version" in
The database stores a "compatibility version" in
`schema_compat_version.compat_version` which defines the `SCHEMA_VERSION` of the
oldest version of Synapse which will work with the database. On startup, if
`compat_version` is found to be newer than `SCHEMA_VERSION`, Synapse will refuse to
start.

Synapse automatically updates this field from
`synapse.storage.schema.SCHEMA_COMPAT_VERSION`.
Synapse automatically updates `schema_compat_version.compat_version` from
`synapse.storage.schema.SCHEMA_COMPAT_VERSION` during start-up.

* Whenever a backwards-incompatible change is made to the database format (normally
via a `delta` file), `synapse.storage.schema.SCHEMA_COMPAT_VERSION` is also updated
so that administrators can not accidentally roll back to a too-old version of Synapse.
* The Synapse codebase defines a constant `synapse.storage.schema.BACKGROUND_UPDATES_COMPAT_VERSION`
which represents the earliest supported background updates.

On startup, if there exists any background update (via the
`background_updates.ordering` column) older than `BACKGROUND_UPDATES_COMPAT_VERSION`,
Synpase will refuse to start.
Comment on lines +50 to +52
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Annoyingly many updates (especially older ones) are inserted without an ordering, which defaults to 0.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we might need to somehow enforce that new updates at least have a non-zero ordering, if we're taking this path.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we might need to somehow enforce that new updates at least have a non-zero ordering, if we're taking this path.

I agree -- I'm unsure if that's a process thing or a database change. I guess we can update the column to not have a default so people at least need to provide one?


This is useful for adding delta files which assume background updates have
finished; overall maintenance of Synapse (by allowing for removal of code
supporting old background updates); among other things.
Comment on lines +54 to +56
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had the realization today that this also lets us delete the handlers for older background updates, which convinces me more this is a reasonable approach.


`BACKGROUND_UPDATES_COMPAT_VERSION` must be < the latest [full schema dump](#full-schema-dumps).

Generally, the goal is to maintain compatibility with at least one or two previous
releases of Synapse, so any substantial change tends to require multiple releases and a
Expand Down
33 changes: 32 additions & 1 deletion synapse/storage/prepare_database.py
Original file line number Diff line number Diff line change
Expand Up @@ -25,14 +25,19 @@
Optional,
TextIO,
Tuple,
cast,
)

import attr

from synapse.config.homeserver import HomeServerConfig
from synapse.storage.database import LoggingDatabaseConnection, LoggingTransaction
from synapse.storage.engines import BaseDatabaseEngine, PostgresEngine, Sqlite3Engine
from synapse.storage.schema import SCHEMA_COMPAT_VERSION, SCHEMA_VERSION
from synapse.storage.schema import (
BACKGROUND_UPDATES_COMPAT_VERSION,
SCHEMA_COMPAT_VERSION,
SCHEMA_VERSION,
)
from synapse.storage.types import Cursor

logger = logging.getLogger(__name__)
Expand Down Expand Up @@ -80,6 +85,9 @@ class _SchemaState:
applied_deltas: Collection[str] = attr.ib(factory=tuple)
"""Any delta files for `current_version` which have already been applied"""

background_updates: Collection[Tuple[str, int]] = attr.ib(factory=tuple)
"""Any (pending) updates in the `background_updates` table."""

upgraded: bool = attr.ib(default=False)
"""Whether the current state was reached by applying deltas.

Expand Down Expand Up @@ -359,6 +367,7 @@ def _upgrade_existing_database(
"""
if is_empty:
assert not current_schema_state.applied_deltas
assert not current_schema_state.background_updates
else:
assert config

Expand Down Expand Up @@ -413,6 +422,24 @@ def _upgrade_existing_database(
start_ver += 1

logger.debug("applied_delta_files: %s", current_schema_state.applied_deltas)
logger.debug(
"pending background_updates: %s",
(name for name, ordering in current_schema_state.background_updates),
)

# Bail if there are any pending background updates from before the background schema compat version.
for update_name, ordering in sorted(
current_schema_state.background_updates, key=lambda b: b[1]
):
# ordering is an int based on when the background update was added:
#
# (schema version when added * 100) + (schema delta when added).
update_schema_version = ordering // 100
Copy link
Member

@richvdh richvdh Nov 15, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a bit concerned that the fact that the ordering is (schema version when added * 100) + N is a bit of a loose, unenforced, convention.

In many ways I'd find it less confusing if, instead of using a COMPAT_VERSION following the same system as SCHEMA_VERSION etc, we instead had a MIN_PENDING_BACKGROUND_UPDATE_ORDER which we compared with the raw ordering of the pending updates.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is definitely a bit loose, using the raw minimum might make more sense.

if update_schema_version < BACKGROUND_UPDATES_COMPAT_VERSION:
raise UpgradeDatabaseException(
"Database has old pending background updates for version %d: %s"
% (update_schema_version, update_name)
)

if isinstance(database_engine, PostgresEngine):
specific_engine_extension = ".postgres"
Expand Down Expand Up @@ -705,10 +732,14 @@ def _get_or_create_schema_state(
)
applied_deltas = tuple(d for d, in txn)

txn.execute("SELECT update_name, ordering FROM background_updates")
background_Updates = cast(Tuple[Tuple[str, int], ...], tuple(txn))

return _SchemaState(
current_version=current_version,
compat_version=compat_version,
applied_deltas=applied_deltas,
background_updates=background_Updates,
upgraded=upgraded,
)

Expand Down
14 changes: 14 additions & 0 deletions synapse/storage/schema/__init__.py
Original file line number Diff line number Diff line change
Expand Up @@ -133,3 +133,17 @@
This value is stored in the database, and checked on startup. If the value in the
database is greater than SCHEMA_VERSION, then Synapse will refuse to start.
"""

BACKGROUND_UPDATES_COMPAT_VERSION = (
# The replace_stream_ordering_column from 6001 must have run.
61
Comment on lines +141 to +142
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This probably needs upgrade notes to let folks know to double check this has ran before upgrading?

)
Comment on lines +140 to +143
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thinking on it more, I wonder if this should not be just background updates, but also refuse to start if your database is older than this? Effectively making a minimum supported version to upgrade from?

"""Limit on how far the syanpse can be rolled forward without breaking db compat

This value is checked on startup against any pending background updates. If there
are any pending background updates less than BACKGROUND_UPDATES_COMPAT_VERSION, then
Synapse will refuse to start.

In order to work with *new* databases this *must* be smaller than the latest full
dump of the database.
"""