Incident Report: Aug 3, 2023 #1043
dyc3 announced in Incident Reports
Timeline
Cause
For an unknown reason, write operations to Redis never complete. This affects only production OTT, not staging.
Update 9:47 PM EST: Production has started responding to HTTP requests again, but rooms aren't being created because of Postgres connection problems.
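A write that never completes can be hard to tell apart from a slow one. As a minimal sketch (not OTT's actual code), a write can be raced against a timeout so that a hang shows up as an explicit failure. The names `withTimeout` and `probeWrite` are hypothetical, and `write` stands in for whatever Redis client call the service makes.

```typescript
// Hypothetical helper: rejects if `op` does not settle within `ms` milliseconds.
function withTimeout<T>(op: Promise<T>, ms: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(
      () => reject(new Error(`timed out after ${ms} ms`)),
      ms,
    );
    op.then(
      (value) => {
        clearTimeout(timer);
        resolve(value);
      },
      (err) => {
        clearTimeout(timer);
        reject(err);
      },
    );
  });
}

// Hypothetical probe: `write` is an assumed stand-in for a Redis write,
// e.g. a SET on a throwaway key. A hang or an error both report "failed".
async function probeWrite(
  write: () => Promise<unknown>,
  ms: number,
): Promise<"ok" | "failed"> {
  try {
    await withTimeout(write(), ms);
    return "ok";
  } catch {
    return "failed";
  }
}
```

A health check could call `probeWrite` periodically and alert when it returns `"failed"`. The timeout only stops the caller from waiting; it does not cancel the underlying write.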
Fix
A fix was not possible on OTT's side. Upstash support said that the Fly machine Upstash was using to host OTT's Redis database restarted and failed to autoheal.
Follow up items
Maybe investigate how much effort it would take to manage our own Redis.