PR#24571 - Prevent block index fingerprinting by sending additional getheaders messages 👀

A PR by Niklas Gögge to prevent block index fingerprinting by sending additional getheaders messages. Participants discussed about stale blocks, how the block index works, how current behavior allows for this fingerprinting technique and how this PR prevents that.

Questions

What is the block index and what is it used for? 🔗

An in-memory index of the blockchain which contains the headers plus info on where to find the rest of the block data on disk. It's kind of a blocktree, with multiple branches (chains) that allows us to determine the longest branch/chain and switching between branches if one becomes longer.

Why and how can the block index be used for fingerprinting? 🔗

The exact number of stale blocks any specific node has seen will be unique to that node depending on where in the network the node sits. A node's block index is unique based on the fact that it can contain stale blocks that other nodes do not have. I might have a branch in that tree that not many others have. If you can find out that I have this branch, this leaks information.
- You might have seen a header/block that for example after a reorg is no longer part of the main chain and has thus become stale. For me to have this stale block header in my block index, I must have believed that it was in my best chain at the time of acceptance.
- During IBD you only request and download the blocks of the headers you got during initial header sync. Newly synced nodes will not have stale blocks from before they were synced. Nodes only know about stale blocks because at the time they were relayed, from the perspective of the relayer, they weren't stale.
- Bitcoin Core imposes a STALE_RELAY_AGE_LIMIT on how far back stale blocks will be served to peers. A peer can ask for less-than-30-days-old stale blocks but only by knowning the hash of that block.
- But if someone sends you a valid header with enough work on it (more work than the block it is extending) then you will store it in the index, it does not need to extend the tip.
  - This allows for an attacker with a good collection of past stale headers (or able to mine new ones) to be able to mark nodes with specific headers that will sent to them.
  - The attacker sends the victim one of these headers, and then probe again, making the fingerprinting possible even if the victim doesn't have any stale headers at the beginning.
If a peer can probe for stale blocks in the node's block index then it can use that information to fingerprint the node.

Why do we keep stale blocks in the block index? 🔗

[remains open question]
It makes sense to keep recent stale blocks to be able to reorg to that chain if it becomes longer, but the need to keep a year old stale blocks/headers is not so obvious.
Possible stackexchange answer by Pieter Wuille that mentions cheap to keep and incredibly expensive for a DoS attack.

In your own words, how does the fingerprinting technique outlined in the PR work 🔗

We extend a stale branch with header H+1, send H+1 to a node and see if it requests header H. If it does, it doesn't know about the stale branch. If it doesn't, it knows about the stale branch. The PR says H+1 doesn't need to have a valid PoW, so this is very cheap for us to do.
- Checking the PoW on the header to see if it's sufficient would make it harder on the attacker but not impossible.
To increase the accuracy of the attack you can have a list of existing stale blocks that our peer might have or not, create bogus headers building on them, and send them to a peer to check and record for which of the headers we get a GETHEADERS in return.
The whole idea of this technique is that the the attacker would know of two addresses, say one IPv4 and one Tor, and could then, using this technique, figure out if the addresses belong to the same node.
- There is a Tor only option but it is common to run two different network connections from a single node, e.g IPv4 and IPv6.

This commit introduces a new parameter to `PeerManagerImpl::BlockRequestAllowed`. Why is that necessary? 🔗

With the new logic and without the allow_potentially_invalid_headers parameter, one of the p2p test failed. This test extends a recent non main chain branch with a header and expects the node to send a getdata for the block. But with the new logic the node would ignore the header because it extended a stale branch.
Specifically, pindex->IsValid(BLOCK_VALID_SCRIPTS) in PeerManagerImpl::BlockRequestAllowed always returns false for headers, so we need an exception for that if we are deciding if we should leak the info about a header.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

2022-04-06-#24571.md

2022-04-06-#24571.md

PR#24571 - Prevent block index fingerprinting by sending additional getheaders messages 👀

Questions

What is the block index and what is it used for? 🔗

Why and how can the block index be used for fingerprinting? 🔗

Why do we keep stale blocks in the block index? 🔗

In your own words, how does the fingerprinting technique outlined in the PR work 🔗

This commit introduces a new parameter to `PeerManagerImpl::BlockRequestAllowed`. Why is that necessary? 🔗

Files

2022-04-06-#24571.md

Latest commit

History

2022-04-06-#24571.md

File metadata and controls

PR#24571 - Prevent block index fingerprinting by sending additional getheaders messages 👀

Questions

What is the block index and what is it used for? 🔗

Why and how can the block index be used for fingerprinting? 🔗

Why do we keep stale blocks in the block index? 🔗

In your own words, how does the fingerprinting technique outlined in the PR work 🔗

This commit introduces a new parameter to PeerManagerImpl::BlockRequestAllowed. Why is that necessary? 🔗

This commit introduces a new parameter to `PeerManagerImpl::BlockRequestAllowed`. Why is that necessary? 🔗