add figures, revise neighborhoods description, move PSS and ACT to top level sidebar
NoahMaizels committed Nov 7, 2024
1 parent 71880e8 commit 4395cb0
Showing 14 changed files with 81 additions and 48 deletions.
11 changes: 11 additions & 0 deletions docs/concepts/DISC/DISC.md
@@ -3,6 +3,9 @@ title: DISC
id: disc
---

import bos_fig_2_7 from '/static/img/bos_fig_2_7.jpg';


DISC (Distributed Immutable Storage of Chunks) is a storage solution developed by Swarm based on a modified implementation of a [Kademlia DHT](/docs/concepts/DISC/kademlia) which has been specialized for data storage. Swarm's implementation of a DHT differs significantly in that it stores the content in the DHT directly, rather than just storing a list of seeders who are able to serve the content. This approach allows for much faster and more efficient retrieval of data.

### Kademlia Topology and Routing
@@ -25,6 +28,14 @@ In the DISC model, chunks are the canonical unit of data. When a file is uploade

Content-addressed chunks are chunks whose address is based on the hash digest of their data. Using a hash as the chunk address makes it possible to verify the integrity of chunk data. Swarm uses the BMT hash function based on a binary Merkle tree over small segments of the chunk data. A content-addressed chunk has a payload of at most 4KB, and its address is calculated as the hash of the span (chunk metadata) and the Binary Merkle Tree hash of the payload.

<div style={{ textAlign: 'center' }}>
<img src={bos_fig_2_7} className="responsive-image" />
<p style={{ fontStyle: 'italic', marginTop: '0.5rem' }}>
Source: <a href="https://www.ethswarm.org/the-book-of-swarm-2.pdf#subsection.2.2.2" target="_blank">The Book of Swarm - Figure 2.7 - "Content addressed chunk"</a>
</p>
</div>


For single-owner chunks, on the other hand, the address is calculated as the hash of a unique id and the owner's overlay address. The content consists of an arbitrary data payload along with required headers. Unlike a content-addressed chunk, the contents of a single-owner chunk may be updated while the address remains unchanged. Single-owner chunks form the basis for feeds, which are data structures that allow for mutable content with a static address.

### Push-Sync, Pull-Sync, and Retrieval Protocols
10 changes: 10 additions & 0 deletions docs/concepts/DISC/kademlia.md
@@ -3,6 +3,8 @@ title: Kademlia
id: kademlia
---

import bos_fig_2_3 from '/static/img/bos_fig_2_3.jpg';


Kademlia is a distributed hash table (DHT) algorithm used in peer-to-peer networks to efficiently store and retrieve data without relying on centralized servers. It organizes nodes into an overlay network that ensures efficient routing using a binary tree structure.

@@ -50,6 +52,14 @@ In contrast, Swarm makes use of forwarding Kademlia. Here each node forwards the

The main advantage of forwarding Kademlia is that it maintains the anonymity of the node which initiated the request.
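
The forwarding behaviour can be illustrated with a toy model. The sketch below is not Swarm's implementation: it assumes 8-bit addresses, a hand-written `peers` table, and a hypothetical `forward` helper that hands the request to the known peer sharing the most leading bits with the target. It models only the outbound path, where each hop learns nothing beyond the previous hop.

```python
def po(a: int, b: int, bits: int = 8) -> int:
    """Proximity order: number of leading bits shared by two `bits`-wide addresses."""
    return bits - (a ^ b).bit_length()


def forward(peers: dict[int, list[int]], start: int, target: int) -> list[int]:
    """Return the hop-by-hop path a request takes from `start` toward `target`."""
    path = [start]
    current = start
    while current != target:
        # Among the peers this node knows, pick the one closest to the target.
        candidates = peers.get(current, [])
        best = max(candidates, key=lambda p: po(p, target), default=None)
        if best is None or po(best, target) <= po(current, target):
            break  # no known peer is closer than the current node
        path.append(best)
        current = best
    return path


# Hypothetical toy network: each node only knows a couple of peers.
peers = {
    0b00000001: [0b10000000, 0b01000000],
    0b10000000: [0b11000000, 0b00000001],
    0b11000000: [0b11010000, 0b10000000],
    0b11010000: [0b11011010, 0b11000000],
}
target = 0b11011010
print([format(n, "08b") for n in forward(peers, 0b00000001, target)])
```

Every hop in the printed path shares a longer prefix with the target than the one before it, which is what lets the request converge without the originator contacting the target directly.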

<div style={{ textAlign: 'center' }}>
<img src={bos_fig_2_3} className="responsive-image" />
<p style={{ fontStyle: 'italic', marginTop: '0.5rem' }}>
Source: <a href="https://www.ethswarm.org/the-book-of-swarm-2.pdf#subsection.2.1.3" target="_blank">The Book of Swarm - Figure 2.3 - "Iterative and Forwarding Kademlia routing"</a>
</p>
</div>


### Neighborhood Based Storage Incentives

Swarm introduces a storage incentives layer on top of its Kademlia implementation in order to reward nodes for continuing to provide resources to the network. Neighborhoods play a key role in the storage incentives mechanism. Storage incentives take the form of a "game" which nodes play to win a reward for storing the correct data. In each round of the game, one neighborhood is chosen to play, and all nodes within that neighborhood participate as a group. The nodes compare the data they are storing to confirm that each holds the data it is responsible for, and one node from the group is chosen as the winner. You can read more about how storage incentives work on the dedicated storage incentives page.
24 changes: 16 additions & 8 deletions docs/concepts/DISC/neighborhoods.md
@@ -3,22 +3,30 @@ title: Neighborhoods
id: neighborhoods
---

In Swarm, a neighborhood refers to an area of responsibility within the network, where nodes in proximity to one another share the task of storing and maintaining data chunks. It is defined by the [proximity order (PO)](/docs/references/glossary#proximity-order-po) of nodes' addresses. Nodes within a neighborhood replicate data chunks to ensure that if one node goes offline, other nodes in the neighborhood can still retrieve and serve the content.

In Swarm, a neighborhood refers to an area of responsibility within the network, where nodes in proximity to one another share the task of storing and maintaining data chunks. Nodes within a neighborhood replicate chunks to ensure that if one node goes offline, other nodes in the neighborhood can still retrieve and serve the content.

:::info
To see neighborhood populations and the current storage depth / storage radius navigate to the ["Neighborhoods" page of Swarmscan.io](https://swarmscan.io/neighborhoods).
To see current neighborhood populations and the current storage depth / storage radius navigate to the ["Neighborhoods" page of Swarmscan.io](https://swarmscan.io/neighborhoods).
:::

:::info
The terms "depth" and "radius" are often used interchangeably when discussing neighborhoods. Both refer to the number of shared leading bits of node and chunk addresses used to determine which nodes and chunks fall into which neighborhoods.
:::

## Neighborhood Formation
## Key Concepts

### Proximity Order (PO)
The PO is a measure of how close a node is to a particular chunk of data or to another node. It is defined as the number of shared leading bits between two addresses. Nodes with equal or greater proximity order value relative to a chunk have equal responsibility for storing and maintaining it. Proximity order plays a role in how neighborhoods are defined, as a node's neighborhood extends up to its storage depth, covering all nodes within that proximity.
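
As a rough illustration of this definition, the sketch below counts the shared leading bits of two byte strings. The function name `proximity_order` and the two-byte sample addresses are assumptions made for brevity; real Swarm addresses are 256 bits long, and the client may cap or otherwise adjust the value.

```python
def proximity_order(a: bytes, b: bytes) -> int:
    """Return the number of leading bits shared by two equal-length addresses."""
    if len(a) != len(b):
        raise ValueError("addresses must have the same length")
    for i, (x, y) in enumerate(zip(a, b)):
        diff = x ^ y
        if diff:
            # 8 - bit_length() is the count of leading zero bits in this byte,
            # i.e. how many bits still matched before the first difference.
            return i * 8 + (8 - diff.bit_length())
    return len(a) * 8  # identical addresses share every bit


# Two-byte prefixes are used here only to keep the example short.
node = bytes.fromhex("da7b")
other = bytes.fromhex("da7f")
print(proximity_order(node, other))  # 13
```

XOR-ing the addresses turns "first differing bit" into "first set bit", so the PO is simply the number of zero bits before it.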

### Storage Depth

Storage depth represents the effective reach, measured in proximity order, up to which a node must synchronize and store chunks within its neighborhood. It is the proximity order at or above which a node is responsible for storing a chunk.

### Neighborhood

A Swarm neighborhood is determined by the proximity order (PO) of node addresses, which is calculated based on the number of leading bits shared between the addresses of nodes in the network. Nodes in the same neighborhood share the same prefix of their addresses, and the neighborhood expands or contracts depending on the availability of nearby nodes. As a result, each node is responsible for interacting with other nodes within its neighborhood to store and replicate data chunks, ensuring data availability and redundancy. The neighborhood depth dynamically adjusts as peers join or leave the network, maintaining a healthy distribution of storage responsibility across nodes.
A neighborhood is a set of nodes in close proximity to each other based on their Kademlia proximity order (PO). Each node in the network has a neighborhood determined by its storage depth, which defines the radius or boundary of responsibility for storing chunks. Each node in a neighborhood is responsible for interacting with other nodes within its neighborhood to store and replicate data chunks, ensuring data availability and redundancy.

### Example neighborhood
## Example neighborhood

Let's take a closer look at an example. Below is a neighborhood of six nodes at depth 10. Each node is identified by its Swarm address, which is a 256 bit hexadecimal number derived from the node's Gnosis Chain address, the Swarm network id, and a random nonce.

@@ -40,9 +48,9 @@ Since we are only concerned with the leading binary bits close to the neighborho
| da7b | <u>1101101001</u>111011|
| da7f | <u>1101101001</u>111111|

### Chunk neighborhood Assignment
### Area of Responsibility

Chunks are assigned to neighborhoods based on their addresses, which are in the same 256 bit format as node addresses. Here are two example chunks which fall within our example neighborhood:
Storer nodes are responsible for storing chunks with addresses whose leading bits match their own up to the storage depth. Here are two example chunks which fall within our example neighborhood:

> Chunk A address: `da49a42926015cd1e2bc552147c567b1ca13e8d4302c9e6026e79a24de328b65`
> Chunk B address: `da696a3dfb0f7f952872eb33e0e2a1435c61f111ff361e64203b5348cc06dc8a`
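
The membership check can be reproduced in a few lines. This is a minimal sketch rather than client code: `leading_bits` and `in_neighborhood` are hypothetical helpers, the depth of 10 comes from the example neighborhood, and the node is represented only by the four hex digits `da7b` shown in the table.

```python
DEPTH = 10  # storage depth of the example neighborhood


def leading_bits(address_hex: str, n: int) -> str:
    """Return the first n bits of a hex-encoded address as a '0'/'1' string."""
    value = int(address_hex, 16)
    total_bits = len(address_hex) * 4
    # Zero-pad to the full width so leading zero bits are not lost.
    return format(value, f"0{total_bits}b")[:n]


def in_neighborhood(node_hex: str, chunk_hex: str, depth: int = DEPTH) -> bool:
    """True if the chunk address shares the node's first `depth` bits."""
    return leading_bits(node_hex, depth) == leading_bits(chunk_hex, depth)


node_prefix = "da7b"  # first four hex digits of one node from the table
chunk_a = "da49a42926015cd1e2bc552147c567b1ca13e8d4302c9e6026e79a24de328b65"
chunk_b = "da696a3dfb0f7f952872eb33e0e2a1435c61f111ff361e64203b5348cc06dc8a"

for name, chunk in (("A", chunk_a), ("B", chunk_b)):
    print(name, leading_bits(chunk, DEPTH), in_neighborhood(node_prefix, chunk))
```

Both chunks start with the bits `1101101001`, the same ten-bit prefix underlined in the node table, so both fall inside the example neighborhood.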
File renamed without changes.
File renamed without changes.
47 changes: 21 additions & 26 deletions docs/concepts/what-is-swarm.md → docs/concepts/what-is-swarm.mdx
@@ -3,44 +3,37 @@ title: What is Swarm?
id: what-is-swarm
---

import bos_fig_1_1 from '/static/img/bos_fig_1_1.jpg';

# What is Swarm?

The complete vision of Swarm is described in detail in [The Book of Swarm](https://www.ethswarm.org/the-book-of-swarm-2.pdf) written by Swarm founder Viktor Tron, with further high level details described in the [whitepaper](https://papers.ethswarm.org/p/whitepaper/). More in depth low level implementation details can be found in the [Swarm Specification paper](https://papers.ethswarm.org/p/swarm-specification/). To stay up to date with all the latest research and technical papers from Swarm, make sure to bookmark the [Papers section](https://papers.ethswarm.org/) of the Ethswarm homepage.
The complete vision of Swarm is described in detail in [The Book of Swarm](https://papers.ethswarm.org/p/book-of-swarm/) written by Swarm founder Viktor Tron, with further high-level details described in the [whitepaper](https://papers.ethswarm.org/p/whitepaper/). More in-depth, low-level implementation details can be found in the [Swarm Specification paper](https://papers.ethswarm.org/p/swarm-specification/). The latest research and technical papers from Swarm can be found in the ["Papers" section](https://papers.ethswarm.org/) of the Ethswarm homepage.

Swarm is a peer-to-peer network of nodes which work together to provide decentralised storage and communication infrastructure.

Swarm can be divided into four main parts:

1. Underlay Network - A peer-to-peer network protocol to serve as underlay transport.
2. Overlay Network - An overlay network with protocols powering a distributed immutable storage of chunks (fixed size data blocks).
1. Underlay Network - A peer-to-peer network protocol to serve as underlay transport. Swarm's underlay network is built with [libp2p](https://libp2p.io/).
2. Overlay Network - An overlay network with protocols powering a distributed immutable storage of chunks (fixed size data blocks).
3. Data Access Layer - A component providing high-level data access and defining APIs for base-layer features.
4. Application Layer - An application layer defining standards and outlining best practices for more elaborate use cases.

<div style={{ textAlign: 'center' }}>
<img src={bos_fig_1_1} className="responsive-image" />
<p style={{ fontStyle: 'italic', marginTop: '0.5rem' }}>
Source: <a href="https://www.ethswarm.org/the-book-of-swarm-2.pdf#part.2" target="_blank">The Book of Swarm - Figure 1.1 - "Swarm’s Layered Design"</a>
</p>
</div>



Of these four main parts, parts 2 and 3 form the core of Swarm.

### 1. Underlay Network

The first part of Swarm is a peer-to-peer network protocol that serves as the underlay transport. The underlay transport layer is responsible for establishing connections between nodes in the network and routing data between them. It provides a low-level communication channel that enables nodes to communicate with each other directly, without relying on any centralised infrastructure.

Swarm is designed to be agnostic of the particular underlay transport used, as long as it satisfies certain requirements.

1. Addressing – Nodes are identified by their underlay address.
2. Dialling – Nodes can initiate a direct connection to a peer by dialing them on
their underlay address.
3. Listening – Nodes can listen to other peers dialing them and can accept incoming
connections. Nodes that do not accept incoming connections are called light
nodes.
4. Live connection – A node connection establishes a channel of communication which
is kept alive until explicit disconnection, so that the existence of a connection
means the remote peer is online and accepting messages.
5. Channel security – The channel provides identity verification and implements
encrypted and authenticated transport resisting man in the middle attacks.
6. Protocol multiplexing – The underlay network service can accommodate several
protocols running on the same connection.
7. Delivery guarantees – Protocol messages have guaranteed delivery, i.e. delivery
failures due to network problems result in direct error response. Order of delivery
of messages within each protocol is guaranteed.
8. Serialisation – The protocol message construction supports arbitrary data structure
serialisation conventions.
Swarm is designed to be agnostic of the particular underlay transport used, as long as it satisfies certain requirements described in The Book of Swarm.

As the [libp2p](https://libp2p.io/) library meets all these requirements it has been used to build the Swarm underlay network.

@@ -51,17 +44,19 @@ The second part of Swarm is an overlay network with protocols powering the [Dist

Swarm's overlay network is built on top of the underlay transport layer and uses [Kademlia](/docs/concepts/DISC/kademlia) overlay routing to enable efficient and scalable communication between nodes. Kademlia is a distributed hash table (DHT) algorithm that allows nodes to locate each other in the network based on their unique identifier or hash.

Swarm's DISC is an implementation of a Kademlia DHT optimized for storage. While the use of DHTs in distributed data storage protocols is common, for many implementations DHTs are used only for indexing of specific file locations. Swarm's DISC distinguishes itself from other implementations by instead breaking files into chunks and storing the chunks themselves directly within a Kademlia DHT.
Swarm's DISC is an implementation of a Kademlia DHT optimized for storage. While the use of DHTs in distributed data storage protocols is common, for many implementations DHTs are used only for indexing file references. Swarm's DISC distinguishes itself from other implementations by instead breaking files into chunks and storing the chunks themselves directly within the DHT.

Each chunk has a fixed size of 4KB and is distributed across the network using the DISC model. Each chunk has a unique address taken from the same namespace as the network node addresses that allows it to be located and retrieved by other nodes in the network.
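
To make the chunking step concrete, here is a small sketch that cuts a byte string into pieces of at most 4096 bytes. It covers only the splitting: `split_into_chunks` is a hypothetical helper, and the chunk addresses (which depend on Swarm's BMT hash) are deliberately not computed.

```python
CHUNK_SIZE = 4096  # maximum chunk payload in bytes (4KB)


def split_into_chunks(data: bytes, size: int = CHUNK_SIZE) -> list[bytes]:
    """Split data into consecutive pieces of at most `size` bytes."""
    return [data[i:i + size] for i in range(0, len(data), size)]


payload = bytes(10_000)  # 10,000 zero bytes standing in for file content
chunks = split_into_chunks(payload)
print(len(chunks), [len(c) for c in chunks])  # 3 [4096, 4096, 1808]
```

Only the final piece can be shorter than the limit, and concatenating the pieces in order restores the original data.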

Swarm's distributed immutable storage provides several benefits, including data redundancy, tamper-proofing, and fault tolerance. Because data is stored across multiple nodes in the network, it can be retrieved even if some nodes fail or go offline.
Swarm's distributed immutable storage provides several benefits, including data redundancy, tamper-proofing, and fault tolerance. Because data is stored across multiple nodes in the network, it can be retrieved even if some nodes fail or go offline.

Also built on top of the overlay network is an [incentives layer](/docs/concepts/incentives/overview) which guarantees that node operators who share their resources with the network are fairly rewarded for their services.

### 3. Data Access Layer

The third part of Swarm is a component that provides high-level data access and defines APIs for base-layer features. This layer is responsible for providing an easy-to-use interface for developers to interact with Swarm's underlying storage and communication infrastructure.

Swarm's high-level data access component provides APIs that allow developers to perform various operations on the network, including [uploading and downloading data](/docs/develop/access-the-swarm/upload-and-download) and searching for content. These APIs are designed to be simple and intuitive, making it easy for developers to build decentralised applications on top of Swarm.
Swarm's high-level data access component provides [APIs that allow developers to perform various operations](/api/) on the network, including [uploading and downloading data](/docs/develop/access-the-swarm/upload-and-download) and searching for content. These APIs are designed to be simple and intuitive, making it easy for developers to build decentralised applications on top of Swarm.

### 4. Application Layer

2 changes: 1 addition & 1 deletion docs/develop/access-the-swarm/upload-and-download.md
@@ -11,7 +11,7 @@ When you upload your files to the Swarm, they are split into 4kb
_chunks_ and then distributed to nodes in the network that are
responsible for storing and serving these parts of your content.
To learn more about how Swarm's decentralized storage solution works,
check out the ["Learn" section](/docs/concepts/what-is-swarm).
check out the ["Concepts" section](/docs/concepts/what-is-swarm).

In order for you to be able to upload any data to the network,
you must first purchase [postage stamps](/docs/concepts/incentives/postage-stamps)
2 changes: 1 addition & 1 deletion docs/develop/tools-and-features/access-control.md
@@ -4,7 +4,7 @@ id: act
---

:::info
This guide contains a detailed explanation of how to use the ACT feature, but does not cover its higher level concepts. To better understand how ACT works and why to use it, read [the ACT page in the "Learn" section](/docs/concepts/protocols/access-control).
This guide contains a detailed explanation of how to use the ACT feature, but does not cover its higher level concepts. To better understand how ACT works and why to use it, read [the ACT page in the "Concepts" section](/docs/concepts/access-control).
:::


9 changes: 6 additions & 3 deletions docusaurus.config.js
@@ -136,10 +136,13 @@ module.exports = {
label: 'Incentives',
},
{
to: '/docs/concepts/protocols/pss',
label: 'Protocols',
to: '/docs/concepts/pss',
label: 'PSS',
},
{
to: '/docs/concepts/access-control',
label: 'Access Control',
},


]
},
12 changes: 3 additions & 9 deletions sidebars.js
@@ -22,18 +22,12 @@ module.exports = {
'concepts/incentives/postage-stamps',
'concepts/incentives/bandwidth-incentives',
'concepts/incentives/price-oracle',

],
collapsed: false
},
{
type: 'category',
label: 'Protocols',
items: [
'concepts/protocols/pss',
'concepts/protocols/access-control',
],
collapsed: false
},
'concepts/pss',
'concepts/access-control',


],
12 changes: 12 additions & 0 deletions src/css/custom.css
@@ -300,3 +300,15 @@ body[data-rh] .redocusaurus {
display: block;
}

.responsive-image {
width: 90%;
display: block;
margin: auto;
}

/* Apply 60% width for screens 768px and wider (tablet and desktop) */
@media (min-width: 768px) {
.responsive-image {
width: 60%;
}
}
Binary file added static/img/bos_fig_1_1.jpg
Binary file added static/img/bos_fig_2_3.jpg
Binary file added static/img/bos_fig_2_7.jpg
