Disconnect invalid and inactive peers #431
base: develop
gossip/handler.go
Outdated
useless = true

// Some clients have compatible caps and thus pass discovery checks and seep in to
// protocol handler. We should band these clients immediately.
nit: little typo ("band" should be "ban")
gossip/handler.go
Outdated
@@ -61,6 +61,17 @@ const (
// txChanSize is the size of channel listening to NewTxsNotify.
// The number is referenced from the size of tx pool.
txChanSize = 4096

// percentage of useless peer nodes to allow
uselessPeerPercentage = 20 // 20%
Nit: Why don't we just use a factor, e.g. 0.2, instead of having to calculate the percentage each time?
gossip/handler.go
Outdated
// A useless peer is the one which does not support protocols opera/63 & fsnap/1.
useless := !eligibleForSnap(p.Peer)
if !p.Peer.Info().Network.Trusted && useless && h.peers.UselessNum() >= (h.maxPeers*(uselessPeerPercentage/100)) {
Question: I am not yet familiar with this useless stuff, but why do we even allow a percentage of useless peers at all? Why don't we just disconnect them all?
Well, the peer is useless in the context of sync, i.e. it doesn't support `fsnap/1` and `opera/63`.
But old peers supporting `opera/62` should still be allowed to participate.
Ah, so I assume `useless` then already checked that the peer is an `opera/62` peer. It's not just any peer. That would make sense.
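A side note on the threshold expression in the diff above: as written, `uselessPeerPercentage/100` is an integer constant division, so it evaluates to 0 and the limit is always 0 regardless of `h.maxPeers`. The sketch below illustrates this under the assumption that `maxPeers` is an `int`; the function names are hypothetical and only serve the demonstration.

```go
package main

import "fmt"

// uselessPeerPercentage mirrors the constant from the diff.
const uselessPeerPercentage = 20 // 20%

// thresholdAsWritten reproduces the expression from the diff.
// The untyped constant division 20/100 is integer division and yields 0.
func thresholdAsWritten(maxPeers int) int {
	return maxPeers * (uselessPeerPercentage / 100)
}

// thresholdMultiplyFirst multiplies before dividing, so the
// percentage is not truncated to zero.
func thresholdMultiplyFirst(maxPeers int) int {
	return maxPeers * uselessPeerPercentage / 100
}

func main() {
	fmt.Println(thresholdAsWritten(50))     // 0
	fmt.Println(thresholdMultiplyFirst(50)) // 10
}
```

Multiplying first keeps everything in integers; a float factor such as 0.2 would also work but needs an explicit conversion of `maxPeers`.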
gossip/handler.go
Outdated
return err
// progress and application
progressWatchDogTimer := time.NewTimer(noProgressTime)
applicationWatchDogTimer := time.NewTimer(noAppMessageTime)
Aren't we recreating the timer on each `for` iteration here? Therefore the `Reset`s later are useless? It looks to me that either we have to create the timers outside of the for loop and then `Reset` them as you do now, or recreate them in each loop iteration and just break when we `Reset`, although this then results in a lot of garbage-collected timers. Or am I missing something?
Oops... the timer should be outside the loop.
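For reference, a minimal sketch of the pattern discussed here: the timer is created once before the loop and reset whenever activity is seen. The `watch` function, the `events` channel and `errNoProgress` are hypothetical stand-ins, not the handler's actual code.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errNoProgress is a hypothetical sentinel used only in this sketch.
var errNoProgress = errors.New("no progress within timeout")

// watch consumes events until the channel is closed (returns nil)
// or until no event arrives within timeout (returns errNoProgress).
func watch(events <-chan struct{}, timeout time.Duration) error {
	// Created once, outside the loop, so Reset acts on the same timer.
	timer := time.NewTimer(timeout)
	defer timer.Stop()

	for {
		select {
		case _, ok := <-events:
			if !ok {
				return nil
			}
			// Stop and drain before Reset. The drain guards against a
			// stale tick on Go versions before 1.23 and is harmless later.
			if !timer.Stop() {
				select {
				case <-timer.C:
				default:
				}
			}
			timer.Reset(timeout)
		case <-timer.C:
			return errNoProgress
		}
	}
}

func main() {
	idle := make(chan struct{})
	fmt.Println(watch(idle, 20*time.Millisecond))
}
```

Creating the timer inside the loop would start a fresh countdown on every iteration, so the watchdog could never fire while messages keep arriving for unrelated reasons.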
err := h.handleMsg(p)
if err != nil {
p.Log().Debug("Message handling failed", "err", err)
if strings.Contains(err.Error(), errorToString[ErrPeerNotProgressing]) {
Can we use `errors.Is` here instead of comparing strings?
We can use `errors.Is()` only to compare errors. But in this place, the error is defined as a string.
If we want to change it, we should define all the errors as `errors.New()`.
Yes, agreed. If there are more such string-based errors instead of `errors.New()`-based ones (which I believe would be better), then this should go into a separate PR. So it's up to you whether you want to do anything about it in this PR.
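To illustrate what the `errors.New()` approach could look like in a follow-up, here is a minimal sketch. The sentinel variable and the helpers `checkProgress` and `isNotProgressing` are hypothetical; the existing code uses error codes with an `errorToString` map instead.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrPeerNotProgressing is a hypothetical sentinel error for this sketch.
var ErrPeerNotProgressing = errors.New("peer is not progressing")

// checkProgress wraps the sentinel with %w so callers can still match it.
func checkProgress(progressing bool) error {
	if !progressing {
		return fmt.Errorf("epoch stalled: %w", ErrPeerNotProgressing)
	}
	return nil
}

// isNotProgressing matches the sentinel through any wrapping,
// without comparing message strings.
func isNotProgressing(err error) bool {
	return errors.Is(err, ErrPeerNotProgressing)
}

func main() {
	err := checkProgress(false)
	fmt.Println(isNotProgressing(err)) // true
	fmt.Println(isNotProgressing(nil)) // false
}
```

Unlike `strings.Contains`, this keeps working if the message text changes or the error is wrapped with extra context.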
gossip/handler.go
Outdated
@@ -1014,6 +1070,10 @@ func (h *handler) handleMsg(p *peer) error {
return errResp(ErrDecode, "%v: %v", msg, err)
}
p.SetProgress(progress)
// If peer has not progressed for noProgressTime minutes, then disconnect the peer.
if !p.IsPeerProgressing() {
return errResp(ErrPeerNotProgressing, "%v: %v %v", "epoch is not progressing for ", noProgressTime, "minutes")
Nit: As `noProgressTime` is a duration, this would print "epoch is not progressing for 3m0s minutes", I think
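A small sketch of the formatting behaviour in question, assuming `noProgressTime` is `3 * time.Minute` (the value is inferred from the "3m0s" above, not confirmed by the diff). The helper names are hypothetical, and the format strings are simplified compared to the `errResp` call.

```go
package main

import (
	"fmt"
	"time"
)

// noProgressTime is assumed to be 3 minutes for this sketch.
const noProgressTime = 3 * time.Minute

// redundantUnit formats the duration with %v and then appends a unit,
// which duplicates the unit already printed by time.Duration.
func redundantUnit() string {
	return fmt.Sprintf("epoch is not progressing for %v minutes", noProgressTime)
}

// durationOnly relies on time.Duration's own String output.
func durationOnly() string {
	return fmt.Sprintf("epoch is not progressing for %v", noProgressTime)
}

// explicitMinutes converts to a number of minutes before adding the unit.
func explicitMinutes() string {
	return fmt.Sprintf("epoch is not progressing for %.0f minutes", noProgressTime.Minutes())
}

func main() {
	fmt.Println(redundantUnit())   // epoch is not progressing for 3m0s minutes
	fmt.Println(durationOnly())    // epoch is not progressing for 3m0s
	fmt.Println(explicitMinutes()) // epoch is not progressing for 3 minutes
}
```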
@@ -1316,6 +1376,11 @@ func (h *handler) handleMsg(p *peer) error {
default:
return errResp(ErrInvalidMsgCode, "%v", msg.Code)
}

if msg.Code != ProgressMsg {
I am not yet familiar with all message codes, but is `ProgressMsg` the only message which signals that there is progress?
Yes
p.progress = x
p.progressTime = time.Now()
Any specific reason why `p.appMessageTime` is locked, but `p.progressTime` isn't?
It's locked in `SetProgress()` where `setPeerAsProgressing()` is called.
gossip/peer_test.go
Outdated
newPeer := getPeer()
ep1 := PeerProgress{Epoch: 1}
newPeer.SetProgress(ep1)
time.Sleep(2 * time.Second) //set the threshold to 2 second
All these `Sleep` calls accumulate to 9 seconds, making test runs 9 seconds slower as I understand it. Isn't there a different way to test this? Do we actually even need to sleep?
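One possible way to avoid sleeping, sketched under the assumption that the progress check can take its time source as a dependency: inject a clock function and advance a fake clock in the test. `progressTracker`, `fakeClock` and their methods are hypothetical and do not exist in the PR.

```go
package main

import (
	"fmt"
	"time"
)

// fakeClock is a hypothetical manually advanced clock for tests.
type fakeClock struct{ t time.Time }

func (c *fakeClock) Now() time.Time          { return c.t }
func (c *fakeClock) Advance(d time.Duration) { c.t = c.t.Add(d) }

// progressTracker is a hypothetical stand-in for the peer's
// progress bookkeeping, with the clock injected.
type progressTracker struct {
	now          func() time.Time // time.Now in production
	lastProgress time.Time
	threshold    time.Duration
}

func newTracker(now func() time.Time, threshold time.Duration) *progressTracker {
	return &progressTracker{now: now, lastProgress: now(), threshold: threshold}
}

// markProgress records the current time as the last progress time.
func (t *progressTracker) markProgress() { t.lastProgress = t.now() }

// isProgressing reports whether progress was seen within the threshold.
func (t *progressTracker) isProgressing() bool {
	return t.now().Sub(t.lastProgress) < t.threshold
}

func main() {
	clock := &fakeClock{}
	tr := newTracker(clock.Now, 2*time.Second)
	fmt.Println(tr.isProgressing()) // true
	clock.Advance(3 * time.Second)  // no real waiting
	fmt.Println(tr.isProgressing()) // false
	tr.markProgress()
	fmt.Println(tr.isProgressing()) // true
}
```

With this shape, the test covers the same transitions as the sleeping version but runs instantly.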
I am not sure if I should already be the only person approving, but I want to signal that this looks good to me now (at least).
This PR adds checks to identify and ban peers that pass the P2P handshake and are accepted into the application protocol but have other application-level issues.
These checks have shown that peers that are valid and working honestly get priority.
Depends on Fantom-foundation/go-ethereum#44