-
Notifications
You must be signed in to change notification settings - Fork 96
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Remove a random subset of validators in net_dynamic_hb #385
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good so far! 👍
I'd just like to make the test slightly more general (see comment at subset_for_remove
).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me! (Just one more nit-pick below.)
Running tests on the CI server caught an error:
Trying to reproduce the error locally I caught another error:
Second error appears when only one node left in the network after removing nodes. I'm trying to figure out what's wrong. |
@RicoGit, to reproduce locally, add the proptest seed from the failing test run on Travis to your local seeds. |
Should I add a static test case for the failed inputs? like this:
|
No, just add the seed to |
I've got it. Awesome! Thanks! But local test (with the seed from CI) produces a different error. |
Have a look at crank error variants if your local failure is |
Does the second error always happen whenever we're left with only a single node? The first error is strange, though. We are a bit aggressive with our fault reports and sending the same |
"restricting the test to at least two nodes for now." - Yes! It fixed all the tests. Thanks! I've run a test about ten times for all known seeds. |
Great, thanks! |
@afck, are you OK with versioning |
Ah, right, that file should be removed, since with a single node it always fails anyway. Good catch! |
related issue #374
do_drop_and_re_add
chooses at random at least 1 node for removing from the network and then re-adding removed nodes back. Cluster always remain correct. In other words before and after removing validators network correctness condition (N=3f+1) is always satisfied.Removed nodes can be as faulty nodes as well as correct ones. You can see in the log of the test how much is actually nodes will be removed.