feat(51/WAKU2-RELAY-SHARDING): Autosharding update #607

Merged Aug 17, 2023 (content/docs/rfcs/51/README.md)
category: Standards Track
tags:
editor: Daniel Kaiser <[email protected]>
contributors:
- Simon-Pierre Vivier <[email protected]>
---

# Abstract
This document also covers discovery of topic shards.

# Named Sharding
It is RECOMMENDED for app protocols to follow the naming structure detailed in [23/WAKU2-TOPICS](/spec/23/).
With named sharding, managing discovery is the responsibility of the apps.

From an app protocol point of view, a subscription to a content topic `/waku2/xxx` on a shard named `/mesh/v1.1.1/xxx` would look like:

`subscribe("/waku2/xxx", "/mesh/v1.1.1/xxx")`
# Static Sharding

Static shards are managed in shard clusters of 1024 shards per cluster.
Waku static sharding can manage $2^{16}$ shard clusters.
Each shard cluster is identified by its index (between $0$ and $2^{16}-1$).

A specific shard cluster is either globally available to all apps,
specific for an app protocol,
or reserved for automatic sharding (see next section).

> *Note:* This leads to $2^{16} * 1024 = 2^{26}$ shards for which Waku manages discovery.

App protocols can choose to use either global shards or app-specific shards.

Like the [IANA ports](https://www.iana.org/assignments/service-names-port-numbers/service-names-port-numbers.xhtml),
shard clusters are divided into ranges:
This example node is part of shards `13`, `14`, and `45` in the Status main-net.

# Automatic Sharding


Automatic sharding is a method for scaling Waku relay in the number of (smaller) content topics.
It automatically maps Waku content topics to pubsub topics.
Clients and protocols building on Waku relay only see content topics, while Waku relay internally manages the mapping.
This provides scaling and removes confusion between content and pubsub topics on the consumer side.

From an app point of view, a subscription to a content topic `/waku2/xxx` using automatic sharding would look like:

`subscribe("/waku2/xxx", auto=true)`

The app is oblivious to the pubsub topic layer.

Autosharding selects shards automatically and is the default behaviour for shard choice.
The other choices, static and named sharding, are described in the previous sections.
Shards (pubsub topics) MUST be computed from content topics with the procedure below.

## Rendezvous Hashing

Rendezvous hashing, also known as the Highest Random Weight (HRW) method, has many properties useful for shard selection:

- Distributed agreement: shards are selected by local computation only, without communication.
- Load balancing: content topics are spread as evenly as possible across shards.
- Minimal disruption: content topics switch shards infrequently.
- Low overhead: the selection is easy to compute.

### Algorithm

For each shard in the cluster:

1. Hash with SHA2-256 the concatenation of
   the content topic `application` field (N UTF-8 bytes),
   the `version` field (N UTF-8 bytes),
   the `cluster` index (2 bytes) and
   the `shard` index (2 bytes).
2. Take the first 64 bits of the hash and divide them by $2^{64}$ to obtain a number between 0 and 1.
3. Compute the natural logarithm of this number.
4. Divide the negated weight (default 1.0) by the logarithm; the result is the shard's value.

Finally, sort the (shard, value) pairs.

The shard with the highest value MUST be used.
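
Below is a minimal sketch of this procedure in Python. The function names are illustrative, and the big-endian encoding of the two index fields is an assumption taken from the worked example below, not a stated requirement:

```python
import hashlib
import math

def shard_value(application: str, version: str,
                cluster: int, shard: int, weight: float = 1.0) -> float:
    """Rendezvous (HRW) value of one shard for a given content topic."""
    data = (application.encode("utf-8")
            + version.encode("utf-8")
            + cluster.to_bytes(2, "big")
            + shard.to_bytes(2, "big"))
    digest = hashlib.sha256(data).digest()
    # First 64 bits of the hash, mapped into (0, 1).
    x = int.from_bytes(digest[:8], "big") / 2**64
    # ln(x) is negative for 0 < x < 1, so the value is positive.
    return -weight / math.log(x)

def select_shard(application: str, version: str,
                 cluster: int, shards: range) -> int:
    """Return the shard with the highest value (the one that MUST be used)."""
    return max(shards, key=lambda s: shard_value(application, version, cluster, s))
```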

### Example

| Field | Value | Hex |
|---|---|---|
| `application` | "myapp" | 0x6d79617070 |
| `version` | "1" | 0x31 |
| `cluster` | 1 | 0x0001 |
| `shard` | 6 | 0x0006 |

- The SHA2-256 hash of `0x6d796170703100010006` is `0xfdac5fb315b791cca3ccde738ebb0b8d2742519fce1086f2a3d2409cb22a7dd2`
- The first 64 bits `0xfdac5fb315b791cc` divided by `0xffffffffffffffff` give ~`0.99`
- The natural logarithm of `0.99` is ~`-0.01005`
- `-1` divided by `-0.01005` equals ~`99.5`
- Shard 6 therefore has the priority value ~`99.5`
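
Running the sketch above on these inputs should reproduce this walkthrough. Note that the walkthrough rounds its intermediate values; at full precision the digest above yields a slightly different number:

```python
# 0xfdac5fb315b791cc / 2**64 ≈ 0.99091 and ln(0.99091) ≈ -0.00913,
# so the full-precision value is ≈ 109.5; the walkthrough's ~99.5
# comes from rounding the ratio to 0.99 before taking the logarithm.
print(shard_value("myapp", "1", cluster=1, shard=6))
```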

## Content Topics Format for Autosharding
Content topics MUST follow the format in [23/WAKU2-TOPICS](https://rfc.vac.dev/spec/23/#content-topic-format).
In addition, two prefixes MAY be added to content topics: generation and bias.
When omitted, the default values are used:
the generation default is `0` and the bias default is `unbiased`.

### Example

`/0/unbiased/myapp/1/mysub/cbor`
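
As a sketch of how an implementation might handle the optional prefixes (the helper and its names are hypothetical; the four mandatory fields follow [23/WAKU2-TOPICS](https://rfc.vac.dev/spec/23/#content-topic-format)):

```python
from typing import NamedTuple

class ContentTopic(NamedTuple):
    generation: int
    bias: str
    application: str
    version: str
    subject: str
    encoding: str

def parse_content_topic(topic: str) -> ContentTopic:
    parts = topic.strip("/").split("/")
    if len(parts) == 6:   # both prefixes present
        gen, bias, app, ver, subject, enc = parts
        return ContentTopic(int(gen), bias, app, ver, subject, enc)
    if len(parts) == 4:   # prefixes omitted: defaults apply
        app, ver, subject, enc = parts
        return ContentTopic(0, "unbiased", app, ver, subject, enc)
    raise ValueError(f"malformed content topic: {topic}")

# The defaults make these two topics equivalent:
assert parse_content_topic("/0/unbiased/myapp/1/mysub/cbor") == \
       parse_content_topic("/myapp/1/mysub/cbor")
```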

### Generation
The generation number increases monotonically and indirectly refers to the total number of shards of the Waku Network.

The first generation (zero) MUST use 8 shards in total.
The cluster index is 1 and the shards are numbered 0 to 7.
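
For illustration, selecting the generation-zero shard for a content topic might look like this (`select_shard` is the sketch from the algorithm section; the `/waku/2/rs/<cluster>/<shard>` pubsub topic form is assumed from static sharding):

```python
# Generation 0: cluster index 1, shard indices 0 to 7.
shard = select_shard("myapp", "1", cluster=1, shards=range(8))
pubsub_topic = f"/waku/2/rs/1/{shard}"
```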

This document will be updated for future generations.

### Bias
Bias is used to skew the priority of shards via weights.
Biases other than `unbiased` are unspecified for now but may be used in the future.
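
Although only `unbiased` is specified, the `weight` parameter in the sketch above shows where a bias would act. In weighted rendezvous hashing, a shard's expected share of content topics is proportional to its weight, so a purely hypothetical bias could look like:

```python
# Hypothetical: doubling shard 6's weight roughly doubles the share of
# content topics it attracts relative to weight-1.0 shards.
biased_value = shard_value("myapp", "1", cluster=1, shard=6, weight=2.0)
```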

### Topic Design
Content topics have two purposes: filtering and routing.
Filtering is done by changing the `subject` field.
As this field is not hashed, it does not affect routing (shard selection).
The `application` and `version` fields do affect routing.
Using multiple content topics with different `application` fields has advantages and disadvantages:
it increases the traffic a relay node is subjected to when subscribed to all topics,
but it also allows relay and light nodes to subscribe to a subset of all topics.
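
To make the split between filtering and routing concrete, here is a short sketch using the helpers above (the content topic names are hypothetical):

```python
def route(topic: str) -> int:
    # Only application and version feed the shard selection.
    ct = parse_content_topic(topic)
    return select_shard(ct.application, ct.version, cluster=1, shards=range(8))

# The subject is not hashed, so these two topics share a shard:
assert route("/myapp/1/alice/cbor") == route("/myapp/1/bob/cbor")
# A different application field may select a different shard:
other = route("/otherapp/1/alice/cbor")
```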

## Problems

### Hot Spots

Hot spots occur (similar to DHTs) when a specific mesh network (shard) becomes responsible for (several) large multicast groups (content topics).
The opposite problem occurs when a mesh only carries multicast groups with very few participants: this might cause bad connectivity within the mesh.

The current autosharding method cannot solve this problem.
A new version of autosharding based on network traffic measurements could be designed but would require further research and development.

### Discovery

For the discovery of automatic shards, this document specifies two methods (the second will be detailed in a future version of this document).

The first method uses the discovery introduced above in the context of static shards.
The index range `49152 - 65535` is reserved for automatic sharding.
Each cluster index can be seen as a hash bucket.
Rendezvous hashing maps each content topic into one of these buckets.

The second discovery method will be a successor to the first method,
but is planned to preserve the index range allocation.
We will add more on security considerations in future versions of this document.
## Receiver Anonymity

The strength of receiver anonymity, i.e. topic receiver unlinkability,
depends on the number of content topics (`k`), as a proxy for the number of peers and messages, that get mapped onto a single pubsub topic (shard).
For *named* and *static* sharding this responsibility is at the app protocol layer.


# Copyright

Copyright and related rights waived via [CC0](https://creativecommons.org/publicdomain/zero/1.0/).

# References
* [51/WAKU2-RELAY-SHARDING](/spec/51/)
* [Ethereum ENR sharding bit vector](https://github.com/ethereum/consensus-specs/blob/dev/specs/altair/p2p-interface.md#metadata)
* [Ethereum discv5 specification](https://github.com/ethereum/devp2p/blob/master/discv5/discv5-theory.md)
* [Rendezvous Hashing](https://www.eecs.umich.edu/techreports/cse/96/CSE-TR-316-96.pdf)
* [Research log: Waku Discovery](https://vac.dev/wakuv2-apd)
* [45/WAKU2-ADVERSARIAL-MODELS](/spec/45)