Skip to content
This repository has been archived by the owner on Jun 14, 2024. It is now read-only.

feat(51/WAKU2-RELAY-SHARDING): Autosharding update #607

Merged
merged 20 commits into from
Aug 17, 2023
Merged

Conversation

SionoiS
Copy link
Contributor

@SionoiS SionoiS commented Jul 10, 2023

The end result is a design that integrate well with the current specs. It doesn't try to do many thing. It basically pick a random shard per app. Picking a random shard IMO is the best we can do for now.

Generations spec. and shard allocations are out of scope. Each generations specification will have a RFC and modifying shard allocation is something we are considering.


old stuff

Hello,

Opening this PR as a conversation starter on the topic of automatic sharding (scaling) the Waku network.
I had discussions on the topic already with @kaiserd and @alrevuelta

There are problems with every path we could take form here.

  • No nothing. Free for all, apps pick the shards they want. There is no Waku network...
  • Use consistent hashing. These methods can only assign topics to shards evenly. Apps will want their topics on the same shards otherwise the overhead will be massive. Also, it does not distribute traffic, except if we step in as a central authority and start regulating (more like advising since we can't enforce anything).
  • Design a new system. Is it even possible? If yes, do we want to spend time (multiple years IMO) on this endeavor?

Discovery: We have no solution, Discv5 is not built for a network with many different apps/protocol/capabilities (yet).

My own notes

@SionoiS SionoiS marked this pull request as ready for review July 31, 2023 13:53
Copy link
Contributor

@jm-clius jm-clius left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A good start! I've added some comments below, mostly related to clarity. In general we want RFCs to be as explicit as possible, use full sentences and require as little implicit context as possible. May also be useful to think along the lines of the RFC keywords (MUST, SHOULD, MAY), etc. and formulate each section according to what is absolutely mandatory, recommended practice or just an option for spec application.

content/docs/rfcs/51/README.md Outdated Show resolved Hide resolved
content/docs/rfcs/51/README.md Outdated Show resolved Hide resolved
content/docs/rfcs/51/README.md Outdated Show resolved Hide resolved
content/docs/rfcs/51/README.md Outdated Show resolved Hide resolved
content/docs/rfcs/51/README.md Show resolved Hide resolved
content/docs/rfcs/51/README.md Outdated Show resolved Hide resolved
content/docs/rfcs/51/README.md Outdated Show resolved Hide resolved
content/docs/rfcs/51/README.md Outdated Show resolved Hide resolved
content/docs/rfcs/51/README.md Outdated Show resolved Hide resolved
content/docs/rfcs/51/README.md Outdated Show resolved Hide resolved
@SionoiS
Copy link
Contributor Author

SionoiS commented Jul 31, 2023

I really need to get used to this style of writing! I going to read some guides.

@SionoiS SionoiS requested a review from jm-clius August 1, 2023 19:28
@SionoiS SionoiS self-assigned this Aug 2, 2023
Copy link
Contributor

@jm-clius jm-clius left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Couple of minor comments below.

content/docs/rfcs/51/README.md Outdated Show resolved Hide resolved
content/docs/rfcs/51/README.md Outdated Show resolved Hide resolved
content/docs/rfcs/51/README.md Outdated Show resolved Hide resolved
content/docs/rfcs/51/README.md Outdated Show resolved Hide resolved
content/docs/rfcs/51/README.md Outdated Show resolved Hide resolved
Copy link
Contributor

@fryorcraken fryorcraken left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some feedback but also some questions as I am not sure to understand the role of clusters in autosharding.

content/docs/rfcs/51/README.md Outdated Show resolved Hide resolved
content/docs/rfcs/51/README.md Outdated Show resolved Hide resolved
content/docs/rfcs/51/README.md Outdated Show resolved Hide resolved
content/docs/rfcs/51/README.md Outdated Show resolved Hide resolved
content/docs/rfcs/51/README.md Outdated Show resolved Hide resolved
content/docs/rfcs/51/README.md Outdated Show resolved Hide resolved
content/docs/rfcs/51/README.md Outdated Show resolved Hide resolved
content/docs/rfcs/51/README.md Outdated Show resolved Hide resolved
content/docs/rfcs/51/README.md Outdated Show resolved Hide resolved
content/docs/rfcs/51/README.md Outdated Show resolved Hide resolved
Copy link
Contributor

@Ivansete-status Ivansete-status left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! I've added a few comments.
Sth that I'm missing is the concept of "network type". I mean, I'm missing to reflect somewhere the seggretation between the mainnet and the testnet.

content/docs/rfcs/51/README.md Outdated Show resolved Hide resolved
content/docs/rfcs/51/README.md Outdated Show resolved Hide resolved
content/docs/rfcs/51/README.md Outdated Show resolved Hide resolved
Copy link
Contributor

@alrevuelta alrevuelta left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

left some comments. as far as i understood this is the spec for waku-org/nwaku#1854, so perhaps we should had the spec ready before doing the actual implementation? or am i missing something?

content/docs/rfcs/51/README.md Outdated Show resolved Hide resolved
content/docs/rfcs/51/README.md Outdated Show resolved Hide resolved
content/docs/rfcs/51/README.md Outdated Show resolved Hide resolved
content/docs/rfcs/51/README.md Outdated Show resolved Hide resolved
@SionoiS
Copy link
Contributor Author

SionoiS commented Aug 8, 2023

I removed the generation specification as this will end up in another RFC. I also removed bias.

@SionoiS SionoiS requested a review from jm-clius August 8, 2023 15:21
@fryorcraken
Copy link
Contributor

left some comments. as far as i understood this is the spec for waku-org/nwaku#1854, so perhaps we should had the spec ready before doing the actual implementation? or am i missing something?

With the recent re-org and growth of research and nwaku team. I'd recommend to formalize this process in some way and clarify expectations (e.g. RFC are merged before code is merged or vice versa).

content/docs/rfcs/51/README.md Show resolved Hide resolved
content/docs/rfcs/51/README.md Show resolved Hide resolved
content/docs/rfcs/51/README.md Outdated Show resolved Hide resolved
@SionoiS
Copy link
Contributor Author

SionoiS commented Aug 10, 2023

I changed the algo. to hash MOD n and update the example.

Copy link
Contributor

@jm-clius jm-clius left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! This LGTM now and thanks for your patience here while we figure out details.

Copy link
Contributor

@alrevuelta alrevuelta left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

As discussed offline, all apps mapping to the same shard can have some problems:

  • easier to attach one app, since it would be enough to attack one shard.
  • more difficult to spread all the topics evenly across shards. similar to the knapsack problem where the backpack is 20kg and your books are all 18kg. (kg = bandwidth in waku).

ofc the main pro is that the bandwdith will be reduces, since the same app will have all its "rooms" "subchats" "subtopic" or whatever you call it in the same shard, hecne not requiring to be subcribed to multiple ones. see birthday paradox.

@SionoiS SionoiS merged commit 4888acc into master Aug 17, 2023
@SionoiS SionoiS deleted the feat-autosharding branch August 17, 2023 12:22
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants