-
Notifications
You must be signed in to change notification settings - Fork 23
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Collation upgrade safety #89
Comments
connections while the index happens are open, this is correct. it is mostly convience right now, since running the index operation also requires a Postgres server to run, it is easier when can start the main process. not sure how complicated it will be to start a Postgres server that is only bound to the socket. and likely we also want a flag to opt-out of this behaviour, because otherwise I can see complaints. you specifically talked about Then there is the potential for people who use the "one shot" mode that the is this the same issue on Alpine with |
As @andyundso points out, if this is something we can detect then we'll have options. 😄
That's doable by launching the docker-pgautoupgrade/docker-entrypoint.sh Line 526 in 82fa7ad
It's not as fully featured as the full database server, but it'd do the job if needed. The problem is that for large databases reindexing can take quite some time (even hours). Doing the reindexing while the database is online and available gives much better availability than using a single-connection-PostgreSQL to do it. |
Context
Normally, when the system version of glibc (the GNU C library) is upgraded, this can essentially lead to Postgres' indices becoming corrupt.
Specifically, new versions of glibc can change the ordering rules in collations.
Because the indices in Postgres databases rely on these ordering rules to be consistent (e.g. a b-tree index stores the index rows in order), if the ordering rules change then the index starts to give wrong results, e.g. unique indices now start allowing some duplicate rows, because when a search is performed in the index, it might not find the row that was already there.
(This is not just an academic risk; I have had bug reports from people where this has occurred; as a result some software that I work on at work will refuse to use any collation other than
C
since the sorting rules for that one won't change.)Reindexing
It's nice to see that
docker-pgautoupgrade
already does this when upgrading the Postgres version, but the safe and right solution to prevent this corruption is to rebuild (reindex) all the indices when upgrading glibc version.(In the containerised builds of Postgres, I imagine the glibc version will only be changed when the Postgres version is, so this seems reasonable)
However, it appears that
docker-pgautoupgrade
will allow application connections whilst the rebuild is taking place (though please do correct me if I'm wrong, and advertise this safety aspect in the readme :D). So, essentially, there is a window of opportunity for the application to cause corruption before the index has been rebuilt.I guess many users would prefer the reduced downtime given by the current solution, but blocking application connections before the rebuilds have been done seems like the safest approach. I could also imagine users opting-in to the more 'dangerous' approach if they really wanted.
I'm curious to know the perspectives on this issue; for me this is always a prickly part of operating Postgres.
The text was updated successfully, but these errors were encountered: