spec: Support to the crash-recovery model #578
Labels
spec
Related to specifications
synchronization
Nodes' synchronization (different rounds, heights) issues
tracking
A complex issue broken down into sub-problems.
The Tendermint consensus algorithm is designed for the Byzantine failure model, in which we should also consider crash-recovery behavior: a process crashes (abruptly ceases its participation in the protocol) and later recovers (i.e., the crashed process is started again).
It is worth noting that a process that crashes and then recovers is not a Byzantine (faulty) process. In fact, the period of "absence" of the process can be theoretically modeled as a long period of asynchrony, during which the process does not produce any output (e.g., messages). However, as a correct process, once it rejoins the computation (recovers), it should operate consistently, i.e., act according to the protocol, in a way that makes it indistinguishable from a process that has not crashed.
To achieve this behavior, a recovering process should re-join the protocol with the same state it had before crashing, which means that the portion of the process state that is relevant for the protocol operation must be persisted.
Definition of Done