Discv5 Protocol: Add support for banning nodes #769

bhartnett · 2025-01-24T05:14:59Z

This PR implements banning of nodes in the Discv5 protocol. This allows the protocol to temporarily stop interacting with nodes that are misbehaving for various reasons. See the related task here: status-im/nimbus-eth1#2809

Currently there is only one ban in place:

Nodes are banned when they provide an invalid response message for a given request message.

For now the bans are rather short. They will become more useful if we implement some form of counter for each type of violation, so that after a certain number of events we can ban the node for a longer period of time. Banning a node for a long time over a single violation is too extreme, which is why the current bans are short, but short bans are less effective against bad nodes.
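The per-violation counter idea could look roughly like the sketch below. It is not part of this PR: `ViolationTracker`, `recordViolation`, the threshold of 3 and both durations are hypothetical names and values chosen only for illustration, and `NodeId` is a plain string stand-in.

```nim
import std/[hashes, tables, times]

type
  NodeId = string            # stand-in for the real node id type
  ViolationKind = enum
    InvalidResponse          # further kinds could be added here
  ViolationTracker = object
    # Number of violations seen per (node, kind) pair.
    counts: Table[(NodeId, ViolationKind), int]

# Hypothetical values, chosen for illustration only.
let
  ShortBan = initDuration(minutes = 5)
  LongBan = initDuration(hours = 1)
const ViolationThreshold = 3

proc recordViolation(t: var ViolationTracker, id: NodeId,
                     kind: ViolationKind): Duration =
  ## Counts the violation and returns the ban duration to apply:
  ## short until the threshold is reached, long from then on.
  let n = t.counts.getOrDefault((id, kind)) + 1
  t.counts[(id, kind)] = n
  if n >= ViolationThreshold: LongBan else: ShortBan
```

The returned duration would then be passed to whatever applies the ban, so a single violation still results in a short ban only.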

I limited the bans to the locations where we currently replace nodes in the routing table. There may be other reasons to ban nodes, in which case we can add further bans in this PR or a future one.

Expired bans are cleaned up every 5 minutes in one of the background loops.
Banning is configurable and can be enabled or disabled. When it is disabled, the protocol should behave exactly as it did before this PR.
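As a rough illustration of the behavior described above, here is a minimal, self-contained sketch of a ban table with expiry. It is not the code from this PR: `BanList`, `initBanList` and the explicit `now` parameter are assumptions made so the example runs on its own, and `NodeId` is a string stand-in. Only the names `banNode`, `isBanned`, `cleanupExpiredBans` and `bannedNodes` echo identifiers mentioned in the PR.

```nim
import std/[tables, times]

type
  NodeId = string                      # stand-in for the real node id type
  BanList = object
    enabled: bool                      # plays the role of the banNodes switch
    bannedNodes: Table[NodeId, Time]   # node id -> time at which the ban expires

proc initBanList(enabled: bool): BanList =
  BanList(enabled: enabled, bannedNodes: initTable[NodeId, Time]())

proc banNode(b: var BanList, id: NodeId, period: Duration, now: Time) =
  ## Records a ban that expires `period` after `now`; a no-op when disabled.
  if b.enabled:
    b.bannedNodes[id] = now + period

proc isBanned(b: BanList, id: NodeId, now: Time): bool =
  ## True only while banning is enabled and the ban has not yet expired.
  b.enabled and id in b.bannedNodes and now < b.bannedNodes[id]

proc cleanupExpiredBans(b: var BanList, now: Time) =
  ## Drops expired entries; intended to be called from a periodic loop.
  var expired: seq[NodeId]
  for id, expiry in b.bannedNodes:
    if expiry <= now:
      expired.add id
  for id in expired:
    b.bannedNodes.del id
```

Passing `now` explicitly keeps the sketch deterministic; a real implementation would more likely read a clock internally.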


bhartnett · 2025-01-24T07:30:18Z

@kdeme Are there any other bans which I should put in place for Discv5 that you are aware of?

kdeme · 2025-01-27T16:01:29Z

tests/p2p/test_discoveryv5_bannodes.nim

+  ../stubloglevel,
+  ./discv5_test_helper
+
+suite "Discovery v5 Ban Nodes Enabled Tests":


Since this is a copy of the tests without the ban list, it isn't really clear what each test verifies, or what it verifies in addition, with the ban list on versus off.

My goal here was to fully test that the existing functionality still works both with and without the ban list enabled. The existing tests have banNodes = false, so the ban list is not enabled. The new tests cover mostly the same scenarios, except with banNodes = true so the ban list is enabled. Some of the new tests were modified to reflect the changed expected behavior and verify cases such as a node being banned, so I believe all or most of the scenarios are covered.

How would you suggest I improve this? Open to suggestions. I wasn't sure if these duplicated regression tests are necessary or not. I guess I could remove them and just add a few tests with node bans enabled to the existing tests.

Yes, I understand the intention. I wasn't aware that some of the test cases were modified, which confirms my point that it is not very clear what exactly is being tested. Which test cases are not exactly copy-pasted from the original tests?

I understand that there is some value in running (most of) the same tests again with banNodes = true to verify that it doesn't break general usage, but I don't really like it much in this duplicated way:

It will be difficult to maintain, the tests are already pretty complicated.

It is not really clear what is being tested, and if something breaks in the future it will not be immediately clear what the behaviour should be with banNodes = true versus banNodes = false.

So I think it would be good to have some (perhaps simpler) more specific test cases that verify the banNodes = true versus false cases.

If we were to also run the existing tests in both scenarios, then I think they should be altered to be usable in both ways. But this is not mandatory for me. And I think this also depends on whether we want to remove the banNodes option in the future and always have it enabled. I was thinking of eventually removing this option once it is well tested.

Yes I see your point. I'll clean up the duplicated tests and add a few more specific scenarios covering banning of nodes.

I've updated the tests and added a few test cases specifically covering banned nodes.
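One way to make a single test usable with the ban list both on and off, as discussed in this thread, is to generate it once per setting. The sketch below is an illustration rather than the tests added in this PR: `FakeProtocol` and `handleInvalidResponse` are hypothetical stand-ins, and `std/unittest` is used only to keep the example self-contained.

```nim
import std/unittest

type
  FakeProtocol = object      # hypothetical stand-in for a Discv5 node under test
    banNodes: bool
    banned: seq[string]

proc handleInvalidResponse(p: var FakeProtocol, id: string) =
  ## Bans the sender only when the ban list is enabled.
  if p.banNodes:
    p.banned.add id

suite "Ban list on and off":
  for banNodes in [false, true]:
    # The setting is part of the test name, so a failure shows which mode broke.
    test "invalid response, banNodes = " & $banNodes:
      var p = FakeProtocol(banNodes: banNodes)
      p.handleInvalidResponse("node-a")
      # The expected outcome is stated as a function of the setting.
      check ("node-a" in p.banned) == banNodes
```

Stating the expectation in terms of `banNodes` keeps the intended difference between the two modes visible in one place.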

kdeme · 2025-01-27T16:10:02Z

eth/p2p/discoveryv5/protocol.nim

@@ -437,6 +461,10 @@ proc receive*(d: Protocol, a: Address, packet: openArray[byte]) =
    let packet = decoded[]
    case packet.flag
    of OrdinaryMessage:
+      if d.isBanned(packet.srcId):


A future improvement here could be to reject packets from banned nodes without actually decrypting the message (decoding only the header). But the way the decodePacket call is currently designed, this is not really possible.

Yes, I guess we could use the src-id in the authdata section of the packet header for ordinary messages and handshake messages.
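A sketch of that ordering, assuming the header can be decoded separately from the message body, is shown below. All types and the `Decryptor` callback are hypothetical; the current decodePacket design does not expose such a split.

```nim
import std/[options, sets]

type
  NodeId = string                    # stand-in for the real node id type
  PacketHeader = object
    srcId: NodeId                    # taken from the header's authdata
  Packet = object
    header: PacketHeader
    ciphertext: seq[byte]            # still-encrypted message body
  Decryptor = proc (ciphertext: seq[byte]): seq[byte]

proc receive(banned: HashSet[NodeId], packet: Packet,
             decrypt: Decryptor): Option[seq[byte]] =
  ## Checks the ban list using only the header, so no decryption work
  ## is spent on packets from banned nodes.
  if packet.header.srcId in banned:
    return none(seq[byte])
  some(decrypt(packet.ciphertext))
```

The point of the sketch is only the order of operations: the ban check happens before the comparatively expensive decryption step.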

kdeme · 2025-01-27T16:12:57Z

eth/p2p/discoveryv5/protocol.nim

@@ -464,6 +492,10 @@ proc receive*(d: Protocol, a: Address, packet: openArray[byte]) =
      else:
        debug "Timed out or unrequested whoareyou packet", address = a
    of HandshakeMessage:
+      if d.isBanned(packet.srcIdHs):


Same as above.

kdeme · 2025-01-27T16:14:23Z

eth/p2p/discoveryv5/protocol.nim

      discovery_message_requests_outgoing.inc(labelValues = ["invalid_response"])
      return err("Invalid response to find node message")
  else:
+    d.banNode(fromNode, NodeBanDurationNoResponse)


I'd prefer not to ban in this situation for now. Even with a 5-minute timer, I'd like to understand its effects better first, e.g. in cases where the local node itself is overloaded.

Sure, I will remove it for now.

bhartnett · 2025-01-28T05:09:49Z

> @kdeme Are there any other bans which I should put in place for Discv5 that you are aware of?

Currently I'm only putting bans in place when checking the response of ping/findNode/talkReq messages. I thought about adding bans when handling messages in other scenarios, such as in receive when a packet fails to decode, but because we use UDP, bans at such a low level probably aren't a good idea. I guess we could just go with the current bans, and if other scenarios come up we could put more bans in place when it makes sense to do so.

kdeme · 2025-01-29T09:57:51Z

> @kdeme Are there any other bans which I should put in place for Discv5 that you are aware of?
>
> Currently I'm only putting bans in place when checking the response of ping/findNode/talkReq messages. I thought about adding bans when handling messages in other scenarios, such as in receive when a packet fails to decode, but because we use UDP, bans at such a low level probably aren't a good idea. I guess we could just go with the current bans, and if other scenarios come up we could put more bans in place when it makes sense to do so.

Yeah, I think it is fine to start with those reasons to ban.
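The ban reason agreed on in this thread can be sketched as a check that a response matches its request. This is a simplified illustration: it treats "invalid" as "wrong message kind" only, which is an assumption, and `expectedResponse` and `shouldBan` are hypothetical helper names. The request/response pairing itself follows the Discv5 message types.

```nim
import std/options

type
  MessageKind = enum
    ping, pong, findNode, nodes, talkReq, talkResp

proc expectedResponse(request: MessageKind): Option[MessageKind] =
  ## Response kind each request type should be answered with.
  case request
  of ping: some(pong)
  of findNode: some(nodes)
  of talkReq: some(talkResp)
  else: none(MessageKind)    # not a request, nothing to validate

proc shouldBan(request, response: MessageKind): bool =
  ## True when a request was answered with the wrong kind of message.
  let expected = expectedResponse(request)
  expected.isSome and expected.get != response
```

A timeout is deliberately not modelled here, matching the decision in the review above not to ban nodes for failing to respond.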

bhartnett added 11 commits January 20, 2025 21:56

Add banned nodes to routing table.

c0c3ba0

Add tests.

59b17d1

Fix test.

6cdd9cd

Reduce sleep time in test.

e6a542c

Cleanup expired bans from banned nodes table when adding a new ban.

52b72df

Make cleanupExpiredBans public so it can be called independently of banNode.

f672d72

Init bannedNodes field.

1e8110c

Filter out banned nodes in lookups and cleanup expired bans in refreshLoop.

e16360b

Don't respond to messages from banned nodes.

10b1e4d

Minor fix.

218d626

Merge branch 'master' into discv5-ban-list

dd9962d

bhartnett requested a review from kdeme January 24, 2025 06:20

Prevent sending messages to banned nodes.

854f17c

Add Discv5 banned nodes enabled tests.

aab3ca0

kdeme reviewed Jan 27, 2025


Don't ban nodes on timeout.

c1d76ce

bhartnett added 2 commits January 29, 2025 20:00

Merge branch 'master' into discv5-ban-list

d522c71

Add specific tests covering banned nodes.

9f5c274

bhartnett requested a review from kdeme January 29, 2025 15:08

kdeme approved these changes Jan 30, 2025


bhartnett merged commit c640d3c into master Jan 30, 2025
18 checks passed

bhartnett deleted the discv5-ban-list branch January 30, 2025 11:28

bhartnett mentioned this pull request Jan 30, 2025

Add peer ban list to Fluffy status-im/nimbus-eth1#2809

Open
