feat: Add re-registration on expiry for p2p node #685

Open, wants to merge 6 commits into base: develop. Changes from 4 commits.
5 changes: 5 additions & 0 deletions docs/src/storage-provider-cli/server.md
@@ -97,6 +97,10 @@ Rendezvous point address that the registration node connects to or the bootstrap

Peer ID of the rendezvous point that the registration node connects to. Only needed if running a registration P2P node.

### `--registration-ttl`

The TTL (time to live) of the p2p registration, in seconds. Defaults to `86400` (24 hours). After the node registration expires, the server automatically re-registers itself with the rendezvous point.
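
The bootstrap node in this PR passes `DEFAULT_REGISTRATION_TTL` (86400) to `with_max_ttl`, so a larger `--registration-ttl` would presumably be refused by the rendezvous server. A zero value is also a problem, because `tokio::time::interval` panics on a zero period. A minimal sketch of a client-side check, assuming a hypothetical helper `validate_registration_ttl` that is not part of the PR:

```rust
/// Upper bound the bootstrap node in this PR passes to `with_max_ttl` (24 hours).
const MAX_REGISTRATION_TTL: u64 = 86_400;

/// Hypothetical helper (not in the PR): rejects TTL values that the
/// re-registration loop or the bootstrap node could not work with.
fn validate_registration_ttl(ttl_secs: u64) -> Result<u64, String> {
    // A zero period would make `tokio::time::interval` panic.
    if ttl_secs == 0 {
        return Err("registration TTL must be greater than zero".to_string());
    }
    // Anything above the server-side maximum is assumed to be refused.
    if ttl_secs > MAX_REGISTRATION_TTL {
        return Err(format!(
            "registration TTL {}s exceeds the maximum of {}s",
            ttl_secs, MAX_REGISTRATION_TTL
        ));
    }
    Ok(ttl_secs)
}
```

Running such a check when the configuration is parsed would surface a bad value at startup rather than at the first registration attempt.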

### `--config`

Takes a path to a configuration file. Both JSON and TOML are supported (files _must_ have the right extension).
@@ -117,6 +121,7 @@ The supported configuration parameters are:
| `p2p_key` | NA |
| `rendezvous_point_address` | NA |
| `rendezvous_point` | `None` |
| `registration_ttl` | `86400` (24 hours) |

#### Bare bones configuration

12 changes: 12 additions & 0 deletions storage-provider/server/src/config.rs
@@ -15,6 +15,8 @@ use crate::{
DEFAULT_NODE_ADDRESS,
};

/// Default TTL of a p2p registration, in seconds (24 hours).
pub const DEFAULT_REGISTRATION_TTL: u64 = 86400;

/// Default address to bind the RPC server to.
const fn default_rpc_listen_address() -> SocketAddr {
SocketAddr::new(IpAddr::V4(Ipv4Addr::new(127, 0, 0, 1)), 8000)
@@ -31,6 +33,11 @@ const fn default_parallel_prove_commits() -> NonZero<usize> {
unsafe { NonZero::new_unchecked(2) }
}

/// Default registration TTL in seconds, i.e. how long the node stays registered.
const fn default_registration_ttl() -> u64 {
DEFAULT_REGISTRATION_TTL
}

fn default_node_address() -> Url {
Url::parse(DEFAULT_NODE_ADDRESS).expect("DEFAULT_NODE_ADDRESS must be a valid Url")
}
@@ -123,4 +130,9 @@ pub struct ConfigurationArgs {
#[serde(default, deserialize_with = "string_to_peer_id_option")]
#[arg(long)]
pub(crate) rendezvous_point: Option<PeerId>,

/// TTL of the p2p registration in seconds
#[serde(default = "default_registration_ttl")]
#[arg(long, default_value_t = DEFAULT_REGISTRATION_TTL)]
pub(crate) registration_ttl: u64,
Member

I think it would be better to use `Duration` here. That way the reader can immediately see how the duration is parsed.

Contributor Author

For the tick it makes sense, but we reuse this value for the `register` call in the swarm, which expects an `Option<u64>`.

Contributor

It's OK as is, but for reference, you can always just call `.as_secs()`.

}
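
The `Duration` suggestion from the review thread above can be sketched as follows. This is only an illustration under assumptions: `TtlConfig` and `ttl_secs` are hypothetical names, and the PR itself keeps the field as a `u64`.

```rust
use std::time::Duration;

/// Hypothetical variant of the config field, stored as a `Duration`
/// instead of raw seconds.
struct TtlConfig {
    registration_ttl: Duration,
}

impl TtlConfig {
    /// Converts back to whole seconds for an API that takes the TTL
    /// as `Option<u64>`, as the `register` call in the diff does.
    fn ttl_secs(&self) -> Option<u64> {
        Some(self.registration_ttl.as_secs())
    }
}
```

With this shape the tick interval could use `registration_ttl` directly, while the registration call would receive `ttl_secs()`.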
6 changes: 6 additions & 0 deletions storage-provider/server/src/main.rs
@@ -272,6 +272,9 @@ pub struct Server {
/// PeerID of the bootstrap node used by the registration node.
/// Optional because it is not used by the bootstrap node.
rendezvous_point: Option<PeerId>,

/// TTL of the p2p registration in seconds
registration_ttl: u64,
}

impl TryFrom<ServerCli> for Server {
@@ -353,6 +356,7 @@ impl TryFrom<ServerCli> for Server {
p2p_key: args.p2p_key,
rendezvous_point_address: args.rendezvous_point_address,
rendezvous_point: args.rendezvous_point,
registration_ttl: args.registration_ttl,
})
}
}
@@ -481,6 +485,7 @@ impl Server {
p2p_key: self.p2p_key,
rendezvous_point_address: self.rendezvous_point_address,
rendezvous_point: self.rendezvous_point,
registration_ttl: self.registration_ttl,
};

Ok(SetupOutput {
@@ -568,6 +573,7 @@ fn spawn_p2p_task(
p2p_state.p2p_key,
p2p_state.rendezvous_point_address,
rendezvous_point,
p2p_state.registration_ttl,
);
Ok(tokio::spawn(run_register_node(config, cancellation_token)))
}
21 changes: 14 additions & 7 deletions storage-provider/server/src/p2p/bootstrap.rs
@@ -10,6 +10,7 @@ use libp2p::{
};

use super::P2PError;
use crate::config::DEFAULT_REGISTRATION_TTL;

#[derive(NetworkBehaviour)]
pub struct BootstrapBehaviour {
@@ -39,7 +40,7 @@ impl BootstrapConfig {
.with_behaviour(|key| BootstrapBehaviour {
// Rendezvous server behaviour for serving new peers to connecting nodes.
rendezvous: rendezvous::server::Behaviour::new(
rendezvous::server::Config::default(),
rendezvous::server::Config::default().with_max_ttl(DEFAULT_REGISTRATION_TTL), // Max TTL of 24 hours
),
// The identify behaviour is used to share the external address and the public key with connecting clients.
identify: identify::Behaviour::new(identify::Config::new(
@@ -65,11 +66,8 @@ pub(crate) async fn bootstrap(
swarm.listen_on(addr)?;
while let Some(event) = swarm.next().await {
match event {
SwarmEvent::ConnectionEstablished { peer_id, .. } => {
tracing::info!("Connected to {}", peer_id);
}
SwarmEvent::ConnectionClosed { peer_id, .. } => {
tracing::info!("Disconnected from {}", peer_id);
SwarmEvent::NewListenAddr { address, .. } => {
tracing::info!("Listening on {}", address);
}
SwarmEvent::Behaviour(BootstrapBehaviourEvent::Rendezvous(
rendezvous::server::Event::PeerRegistered { peer, registration },
@@ -95,7 +93,16 @@ pub(crate) async fn bootstrap(
);
}
}
_other => {}
SwarmEvent::Behaviour(BootstrapBehaviourEvent::Rendezvous(
rendezvous::server::Event::RegistrationExpired(registration),
)) => {
tracing::info!(
"Registration for peer {} expired in namespace {}",
Contributor

What do we do when the registration expires? What does it mean for the node to be registered?

Contributor Author

When the registration expires, the registration node re-registers automatically.
A registered node is one that has identified itself to the bootstrap node and shared its Registration with it. This Registration holds information such as the node's Peer ID and multiaddr.

registration.record.peer_id(),
registration.namespace
);
}
other => tracing::debug!("Encountered event: {other:?}"),
Contributor

Won't this catch the previous ConnectionEstablished and ConnectionClosed events?

}
}
Ok(())
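
The review question above, whether the new catch-all arm picks up the connection events that lost their dedicated arms, can be illustrated with a simplified model. `Event` and `log_level` are hypothetical stand-ins, not the libp2p types used in the diff:

```rust
/// Hypothetical, simplified stand-in for the swarm events seen by the bootstrap loop.
#[derive(Debug)]
enum Event {
    ConnectionEstablished,
    ConnectionClosed,
    NewListenAddr,
    RegistrationExpired,
}

/// Returns the log level each event ends up at when only some
/// variants have a dedicated arm and the rest hit the catch-all.
fn log_level(event: &Event) -> &'static str {
    match event {
        Event::NewListenAddr | Event::RegistrationExpired => "info",
        // Every variant without its own arm lands here, including
        // ConnectionEstablished and ConnectionClosed.
        _other => "debug",
    }
}
```

In this model the connection events are still observed, but only at debug level, which matches the concern raised in the comment.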
13 changes: 10 additions & 3 deletions storage-provider/server/src/p2p/mod.rs
@@ -17,6 +17,7 @@ pub(crate) use register::RegisterConfig;
const P2P_NAMESPACE: &str = "polka-storage";

#[derive(Default, Debug, Clone, Copy, ValueEnum, Deserialize)]
#[serde(rename_all = "lowercase")]
pub enum NodeType {
#[default]
Bootstrap,
@@ -67,6 +68,9 @@ pub(crate) struct P2PState {
/// PeerID of the bootstrap node used by the registration node.
/// Optional because it is not used by the bootstrap node.
pub(crate) rendezvous_point: Option<PeerId>,

/// TTL of the p2p registration in seconds
pub(crate) registration_ttl: u64,
}

/// Deserializes a ED25519 private key into a Keypair.
@@ -141,14 +145,17 @@ pub async fn run_register_node(
) -> Result<(), P2PError> {
tracing::info!("Starting P2P register node");
let tracker = TaskTracker::new();
let (swarm, rendezvous_point_address, rendezvous_point) = config.create_swarm()?;
let rendezvous_point = config.rendezvous_point;
let rendezvous_point_address = config.rendezvous_point_address.clone();
let registration_ttl = config.registration_ttl;
let mut swarm = config.create_swarm()?;

tokio::select! {
res = register(
swarm,
&mut swarm,
rendezvous_point,
rendezvous_point_address,
None,
registration_ttl,
Namespace::from_static(P2P_NAMESPACE),
) => {
if let Err(e) = res {
143 changes: 80 additions & 63 deletions storage-provider/server/src/p2p/register.rs
@@ -1,5 +1,3 @@
use std::time::Duration;

use libp2p::{
futures::StreamExt,
identify,
@@ -9,6 +7,7 @@ use libp2p::{
swarm::{NetworkBehaviour, SwarmEvent},
tcp, yamux, Multiaddr, PeerId, Swarm, SwarmBuilder,
};
use tokio::time::Duration;

use super::P2PError;

@@ -20,24 +19,27 @@ pub struct RegisterBehaviour {

pub struct RegisterConfig {
keypair: Keypair,
rendezvous_point_address: Multiaddr,
rendezvous_point: PeerId,
pub(crate) rendezvous_point_address: Multiaddr,
pub(crate) rendezvous_point: PeerId,
pub(crate) registration_ttl: u64,
}

impl RegisterConfig {
pub fn new(
keypair: Keypair,
rendezvous_point_address: Multiaddr,
rendezvous_point: PeerId,
registration_ttl: u64,
) -> Self {
Self {
keypair,
rendezvous_point_address,
rendezvous_point,
registration_ttl,
}
}

pub fn create_swarm(self) -> Result<(Swarm<RegisterBehaviour>, Multiaddr, PeerId), P2PError> {
pub fn create_swarm(self) -> Result<Swarm<RegisterBehaviour>, P2PError> {
let swarm = SwarmBuilder::with_existing_identity(self.keypair)
.with_tokio()
.with_tcp(
@@ -59,83 +61,98 @@ impl RegisterConfig {
.with_swarm_config(|cfg| cfg.with_idle_connection_timeout(Duration::from_secs(10)))
.build();

Ok((swarm, self.rendezvous_point_address, self.rendezvous_point))
Ok(swarm)
}
}
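
Since `create_swarm(self)` now returns only the swarm, a caller has to copy out the fields it still needs before the config is consumed, as `run_register_node` does in the diff. A minimal sketch of that ownership pattern, with hypothetical stand-in types (`Config`, `Swarm`, `setup`) and a `String` in place of the libp2p address type:

```rust
/// Hypothetical stand-in for `RegisterConfig`.
struct Config {
    address: String,
    ttl: u64,
}

/// Hypothetical stand-in for the swarm.
struct Swarm;

impl Config {
    /// Consumes the config, like `create_swarm(self)` in the diff.
    fn create_swarm(self) -> Swarm {
        Swarm
    }
}

/// Copies out what the caller still needs, then hands the config over.
fn setup(config: Config) -> (Swarm, String, u64) {
    let address = config.address.clone(); // clone before the move
    let ttl = config.ttl; // u64 is Copy
    let swarm = config.create_swarm(); // `config` is moved here
    (swarm, address, ttl)
}
```

Reading the fields after `create_swarm` would not compile, because the value has already been moved.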

/// Register the peer with the rendezvous point.
/// The ttl is how long the peer will remain registered in seconds.
pub(crate) async fn register(
mut swarm: Swarm<RegisterBehaviour>,
swarm: &mut Swarm<RegisterBehaviour>,
rendezvous_point: PeerId,
rendezvous_point_address: Multiaddr,
ttl: Option<u64>,
ttl: u64,
namespace: Namespace,
) -> Result<(), P2PError> {
tracing::info!("Attempting to register with rendezvous point {rendezvous_point} at {rendezvous_point_address}");
let mut register_tick = tokio::time::interval(Duration::from_secs(ttl));

// Dial into bootstrap address
swarm.dial(rendezvous_point_address.clone())?;
// Get and add external address
let external_addr = get_external_address(swarm).await;
swarm.add_external_address(external_addr);

while let Some(event) = swarm.next().await {
match event {
SwarmEvent::NewListenAddr { address, .. } => {
tracing::info!("Listening on {}", address);
}
SwarmEvent::ConnectionClosed {
peer_id,
cause: Some(error),
..
} if peer_id == rendezvous_point => {
tracing::info!("Lost connection to rendezvous point {}", error);
}
// once `/identify` did its job, we know our external address and can register
SwarmEvent::Behaviour(RegisterBehaviourEvent::Identify(
identify::Event::Received { info, .. },
)) => {
// Register our external address.
tracing::info!("Registering external address {}", info.observed_addr);
swarm.add_external_address(info.observed_addr);
if let Err(error) = swarm.behaviour_mut().rendezvous.register(
namespace.clone(),
rendezvous_point,
ttl,
) {
loop {
tokio::select! {
// Poll tick every TTL to re-register.
// First tick completes immediately.
_ = register_tick.tick() => {
tracing::info!("Registering with p2p node");
// Dial to establish a connection.
Contributor

I think @cernicc knows a bit more than me on this, but I think these logs should be contained inside an `#[instrument]` or a span to keep context.

// Dial is needed because the connection is not kept alive using rendezvous protocol.
swarm.dial(rendezvous_point_address.clone())?;
// Register with bootstrap node.
if let Err(error) =
swarm
.behaviour_mut()
.rendezvous
.register(namespace.clone(), rendezvous_point, Some(ttl))
{
tracing::error!("Failed to register: {error}");
return Err(P2PError::RegistrationFailed(rendezvous_point));
} else {
tracing::info!("Registration with {rendezvous_point} successful");
}
}
SwarmEvent::Behaviour(RegisterBehaviourEvent::Rendezvous(
rendezvous::client::Event::Registered {
namespace,
ttl,
rendezvous_node,
},
)) => {
tracing::info!(
"Registered for namespace '{}' at rendezvous point {} for the next {} seconds",
namespace,
rendezvous_node,
ttl
);
return Ok(());
}
SwarmEvent::Behaviour(RegisterBehaviourEvent::Rendezvous(
rendezvous::client::Event::RegisterFailed {
rendezvous_node,
namespace,
error,
},
// Check incoming event.
event = swarm.select_next_some() => check_swarm_event(event)
}
}
}
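
The loop above re-registers on every tick of `tokio::time::interval`, whose first tick completes immediately. A minimal std-only sketch of the resulting schedule, where `registration_offsets` is a hypothetical helper that is not part of the PR:

```rust
use std::time::Duration;

/// Offsets from start-up at which the first `count` registration
/// attempts fire, assuming an interval whose first tick is immediate
/// and whose period equals the TTL.
fn registration_offsets(ttl_secs: u64, count: u32) -> Vec<Duration> {
    let period = Duration::from_secs(ttl_secs);
    (0..count).map(|i| period * i).collect()
}
```

With the default TTL of 86400 seconds, the first three attempts would fire at start-up, after 24 hours, and after 48 hours.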

fn check_swarm_event(event: SwarmEvent<RegisterBehaviourEvent>) {
match event {
SwarmEvent::Behaviour(RegisterBehaviourEvent::Rendezvous(
rendezvous::client::Event::Registered {
namespace,
ttl,
rendezvous_node,
},
)) => {
tracing::info!(
"Registered for namespace '{}' at rendezvous point {} for the next {} seconds",
namespace,
rendezvous_node,
ttl
);
}
SwarmEvent::Behaviour(RegisterBehaviourEvent::Rendezvous(
rendezvous::client::Event::RegisterFailed {
rendezvous_node,
namespace,
error,
},
)) => {
tracing::error!(%rendezvous_node, %namespace,
"Failed to register error = {error:?}"
);
}
other => tracing::debug!("Encountered event: {other:?}"),
}
}

Contributor

A docstring would be appreciated

async fn get_external_address(swarm: &mut Swarm<RegisterBehaviour>) -> Multiaddr {
loop {
match swarm.select_next_some().await {
// once `/identify` did its job, we know our external address and can return it
SwarmEvent::Behaviour(RegisterBehaviourEvent::Identify(
identify::Event::Received { info, .. },
)) => {
tracing::error!(
"Failed to register: rendezvous_node={}, namespace={}, error_code={:?}",
rendezvous_node,
namespace,
error
);
return Err(P2PError::RegistrationFailed(rendezvous_node));
tracing::info!("Identity information exchanged, external address received");
return info.observed_addr;
}
_other => {}
other => tracing::debug!("Encountered event: {other:?}"),
}
}

Ok(())
}