Even smarter full availability recovery from backers #3216

Open
sandreim opened this issue Feb 5, 2024 · 6 comments
Labels
I5-enhancement An additional feature request. I9-optimisation An enhancement to provide better overall performance in terms of time-to-completion for a task.

Comments


sandreim commented Feb 5, 2024

Currently we have a static threshold for deciding when we fetch full availability from backers as a first strategy: https://github.com/paritytech/polkadot-sdk/blob/master/polkadot/node/network/availability-recovery/src/lib.rs#L81

We can improve this choice by taking into account multiple parameters:

  • PoV size
  • backing group size
  • recovery context (approval or dispute)

My suggestion would be to always fetch from backers first if the backing group size is 4 or 5, no matter the size of the PoV, when the context is approval. In a healthy network this should be fine, as we'd expect the validators not to be hammered by this (around only 48 Mbps, which is only 10% of the recommended specs).

For disputes we might want to be more conservative, while for smaller backing groups we should determine a higher threshold.
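
A minimal sketch of what such a multi-parameter heuristic could look like; the function, the `RecoveryContext` enum and all threshold values below are hypothetical placeholders for illustration, not code from the availability-recovery subsystem:

```rust
/// Hypothetical recovery context; not an actual type from the subsystem.
enum RecoveryContext {
    Approval,
    Dispute,
}

/// Illustrative heuristic: should we try fetching the full PoV from backers
/// first? All thresholds are placeholders that would need benchmarking.
fn should_fetch_full_from_backers(
    pov_size: usize,
    backing_group_size: usize,
    context: RecoveryContext,
) -> bool {
    // Hypothetical higher threshold for smaller backing groups.
    const SMALL_GROUP_THRESHOLD: usize = 512 * 1024; // 512 KiB
    // Hypothetical, more conservative threshold for disputes.
    const DISPUTE_THRESHOLD: usize = 128 * 1024; // 128 KiB

    match context {
        // Approval context with a backing group of 4 or 5: always try the
        // full PoV first, regardless of its size.
        RecoveryContext::Approval if backing_group_size >= 4 => true,
        // Approval with a smaller group: fall back to a (higher, tunable)
        // size threshold.
        RecoveryContext::Approval => pov_size <= SMALL_GROUP_THRESHOLD,
        // Disputes: stay more conservative regardless of group size.
        RecoveryContext::Dispute => pov_size <= DISPUTE_THRESHOLD,
    }
}
```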

@sandreim sandreim added I9-optimisation An enhancement to provide better overall performance in terms of time-to-completion for a task. I5-enhancement An additional feature request. labels Feb 5, 2024

burdges commented Feb 6, 2024

Why? We'd expect fewer first-line sources to be a pessimization.

As a thought experiment, imagine bittorrent being our transport:

We'd artificially have a torrent with one file for each availability chunk and a local tracker populated by our knowledge of backing votes, availability votes, and approval votes.

We'd employ partial torrent mode, initially select only the systematic chunks for download, and then flip to full torrent mode later once we have trouble. We only care here about the systematic chunks download phase for this conversation though.

How does bittorrent determine from whom it downloads which bittorrent chunks? It'll maybe bias towards more useful peers, but overall it cares little so long as they arrive and the hashes match. It definitely does not exclude peers just because they only hold one chunk.

We already have open TLS+Noise connections between all validator pairs. Availability providers should hold systematic chunks in memory, exactly like backers do.

It's possible this optimizes over weaknesses elsewhere in our protocol stack, but if so then maybe those places should simply be fixed?


sandreim commented Feb 6, 2024

Why? We'd expect fewer first-line sources to be a pessimization.

What I describe here is a cheap incremental optimization and tuning of parameters which is quite easy to implement.

As a thought experiment, imagine bittorrent being our transport:

We'd artificially have a torrent with one file for each availability chunk and a local tracker populated by our knowledge of backing votes, availability votes, and approval votes.

We'd employ partial torrent mode, initially select only the systematic chunks for download, and then flip to full torrent mode later once we have trouble. We only care here about the systematic chunks download phase for this conversation though.

This is somewhat similar to what we have now, in the sense that we try systematic chunk holders first and go "full torrent" later. A similar discussion is also in #575

It's possible this optimizes over weaknesses elsewhere in our protocol stack, but if so then maybe those places should simply be fixed?

If you meant the network stack, that is being taken care of with litep2p. Otherwise, there seem to be other implementations of Reed-Solomon which outperform our current one.


burdges commented Feb 6, 2024

What I describe here is a cheap incremental optimization and tuning of parameters which is quite easy to implement.

Why is it an optimization? Is it just because some choice is being made? Aka, right now we're doing multiple things unnecessarily or something?


alindima commented Feb 7, 2024

What Andrei is suggesting is to request the full unencoded PoV from the backing group (trying them one at a time), even for larger PoVs. We currently only try this first for small PoVs (under 128 KiB).
It has the advantage that we don't need Reed-Solomon decoding (although systematic chunks share that advantage), and that, assuming the backing group is not overloaded and has large bandwidth, it'd be faster to make one large request instead of N/3 smaller ones to different validators (even if they're made in parallel).

Now, I'm not convinced it's a good idea to do this for all PoV sizes. We should test this first.
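
As a back-of-envelope illustration of the trade-off (one large request vs many parallel chunk requests), here is a crude latency estimate; the PoV size, round-trip times, the "slowest peer is 2x the median RTT" assumption and the reassembly cost are all illustrative assumptions, not measurements:

```rust
// Crude latency comparison; every value here is an illustrative assumption.
fn main() {
    let pov_bytes = 5.0 * 1024.0 * 1024.0; // assumed 5 MiB PoV
    let rtt_s = 0.15;                      // assumed median round-trip time
    let our_download_bps = 500e6 / 8.0;    // 500 Mbit/s spec, in bytes/s

    // Strategy 1: a single request for the full PoV from one backer.
    let full_pov = rtt_s + pov_bytes / our_download_bps;

    // Strategy 2: n/3 parallel systematic-chunk requests. The total data is
    // the same, but we wait for the slowest of many peers (assumed 2x the
    // median RTT) and pay a small cost to stitch the chunks together.
    let slowest_rtt_s = 2.0 * rtt_s;
    let reassembly_s = 0.02;
    let systematic = slowest_rtt_s + pov_bytes / our_download_bps + reassembly_s;

    println!("full PoV from a backer: {:.2}s", full_pov);
    println!("systematic chunks:      {:.2}s", systematic);
}
```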


sandreim commented Feb 7, 2024

What Andrei is suggesting is to request the full unencoded PoV from the backing group (trying them one at a time), even for larger PoVs. We currently only try this first for small PoVs (under 128 KiB). It has the advantage that we don't need Reed-Solomon decoding (although systematic chunks share that advantage), and that, assuming the backing group is not overloaded and has large bandwidth, it'd be faster to make one large request instead of N/3 smaller ones to different validators (even if they're made in parallel).

Thanks @alindima, this is exactly what I mean.

Now, I'm not convinced it's a good idea to do this for all PoV sizes. We should test this first.
Yes, this is the fine-tuning part, but I expect it to be fine as described in the issue. Keep in mind that validator specs do require 500 Mbit bandwidth, so based on some simple math you can see that only 10% of that is needed on average per block for each backer when the backing group is of size 5.
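
For concreteness, a rough version of that simple math; the PoV size, approval-checker count and block time used below are illustrative assumptions, not figures taken from the issue:

```rust
// Back-of-envelope arithmetic behind the "~10% of the 500 Mbit/s spec" claim;
// every parameter value here is an assumption chosen for illustration.
fn main() {
    let pov_mib = 5.0_f64;        // assumed PoV size in MiB
    let approval_checkers = 30.0; // assumed approval checkers per candidate
    let backing_group = 5.0;      // backing group size from the issue
    let block_time_s = 6.0;       // relay-chain block time

    // Full-PoV fetches each backer serves per relay-chain block, on average.
    let fetches_per_backer = approval_checkers / backing_group;
    let mbit_per_s = fetches_per_backer * pov_mib * 8.0 * 1.048_576 / block_time_s;
    println!("per-backer upload ≈ {:.0} Mbit/s", mbit_per_s);
    // ≈ 42 Mbit/s with these numbers, i.e. under 10% of a 500 Mbit/s link.
}
```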

Using subsystem-bench has shown that we can do it, and I'd also expect it to be cheaper than systematic chunks, since we're much more efficient with memory, not to mention network overhead. We have one big piece vs stitching all systematic chunks together in memory. But that is not the point. We must test this well.

@burdges I don't really believe the current systematic chunks approach is tractable in practice if we assume validators haven't got enough bandwidth. Your suggested approach makes the assumption that we'd be faster fetching partial torrent-style vs fetching full availability from a single guy. I agree that would be the case for huge PoVs, 100 MB maybe, but for just 5 MB I doubt this makes any sense with a 500 Mbit/s network.

Systematic chunk recovery would work so much better if we had validators hold more than one chunk and also fetched chunks from approval checkers, assuming we are in a later tranche or just slow and have seen others in the same tranche approve it.


burdges commented Feb 7, 2024

We currently only try this first for small PoVs (under 128 KiB).

I see, thanks. Yes good idea. :)

Your suggested approach makes the assumption that we'd be faster fetching partial torrent-style vs fetching full availability from a single guy. I agree that would be the case for huge PoVs, 100 MB maybe, but for just 5 MB I doubt this makes any sense with a 500 Mbit/s network.

Yeah, we maybe do not require the complexity here.

Systematic chunk recovery would work so much better if we had validators hold more than one chunk and also fetched chunks from approval checkers, assuming we are in a later tranche or just slow and have seen others in the same tranche approve it.

We tune later tranches to be rare-ish, like maybe 15% (?) of total downloads. If we did torrent-like systematic downloads then yes, approval checkers could collaboratively download like bittorrent swarms do, even during tranche zero.

We could have validators hold multiple chunks of course. Ignoring overhead, our bandwidth looks like:

  • 2-5 x for backers downloading full PoVs
  • 3 x for distributing chunks, and
  • 30 x for approval checkers downloading chunks or full

It'd increase total availability bandwidth by 5-8% if we sent one systematic chunk to the two-thirds of validators not holding systematic chunks. You could also do a second, faster erasure code for those other chunks, like sending these validators the XOR of some random systematic chunks.
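
To spell out where a figure in that range can come from, here is the same back-of-envelope arithmetic with the multipliers listed above plugged in; this is just a sanity check of the estimate, not its source:

```rust
// Rough check of the "5-8%" figure; the multipliers come from the list above,
// everything else is back-of-envelope, not a measurement.
fn main() {
    // Existing availability bandwidth, in multiples of the PoV size.
    let backers_low = 2.0_f64;    // backers downloading full PoVs (low end)
    let backers_high = 5.0_f64;   // backers downloading full PoVs (high end)
    let chunk_distribution = 3.0; // distributing chunks
    let approval_checkers = 30.0; // approval checkers downloading chunks/full

    // One systematic chunk is ~3/n of a PoV, so sending one to each of the
    // 2n/3 validators not already holding a systematic chunk adds
    // (2n/3) * (3/n) = 2 PoV-equivalents, independent of n.
    let extra = 2.0;

    let low = 100.0 * extra / (backers_high + chunk_distribution + approval_checkers);
    let high = 100.0 * extra / (backers_low + chunk_distribution + approval_checkers);
    println!("increase ≈ {:.1}%..{:.1}%", low, high); // roughly 5-6%
}
```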

As you say, our PoV blocks remain small enough that maybe all this looks like overkill.
