Node deployment issues on Bardock and Mainnet #1003

Open
shayansanjideh opened this issue Jan 16, 2025 · 2 comments
@shayansanjideh

On Bardock, follower nodes cannot be deployed at all. On Mainnet, follower nodes can be deployed but are unable to synchronize with the actual mainnet state.

Some gists from teams to help troubleshoot:

LayerZero, rev a209fa3: https://gist.github.com/lz-reery/52e567bfbe32dce86adb16a520e87ac0

The latest commit, CONTAINER_REV=4d7a8a6e56b15800f1584af47c7af05de8ca3b30, does not work:
[+] Running 2/2
 ✘ setup Error  manifest unknown  0.5s
 ✘ movement-full-node Error  context canceled  0.5s
Error response from daemon: manifest unknown

Bardock snapshot issue:

2025-01-09 13:46:20.651Z setup 2025-01-09T13:46:20.650951Z  INFO syncup: Running syncup with root "/.movement", glob {maptos,maptos-storage,movement-da-db}/**, and target S3("mtnet-l-sync-bucket-sync")
2025-01-09 13:46:20.651Z setup 2025-01-09T13:46:20.650973Z  INFO syncup: Creating pipelines for target S3("mtnet-l-sync-bucket-sync")
2025-01-09 13:46:20.670Z setup 2025-01-09T13:46:20.669970Z  INFO syncador::backend::s3::shared_bucket: Create client used region Some(Region("us-west-1"))
2025-01-09 13:46:20.670Z setup 2025-01-09T13:46:20.670305Z  INFO syncador::backend::s3::bucket_connection: Creating bucket mtnet-l-sync-bucket-sync
2025-01-09 13:46:20.853Z setup 2025-01-09T13:46:20.852972Z  INFO syncador::backend::s3::bucket_connection: Bucket exists: true
2025-01-09 13:46:20.853Z setup 2025-01-09T13:46:20.853233Z  INFO syncup: Created pipelines
2025-01-09 13:46:20.853Z setup 2025-01-09T13:46:20.853310Z  INFO syncup: Running pull pipeline
2025-01-09 13:46:20.853Z setup 2025-01-09T13:46:20.853317Z  INFO syncador::backend::pipeline::pull: Pulling from backend
2025-01-09 13:46:20.853Z setup 2025-01-09T13:46:20.853334Z  INFO syncador::backend::s3::shared_bucket::pull: S3 pulling package: Some(Package([]))
2025-01-09 13:46:20.853Z setup 2025-01-09T13:46:20.853362Z  INFO syncador::backend::s3::shared_bucket::pull: Finding candidates for package: Package([])
2025-01-09 13:46:20.889Z setup 2025-01-09T13:46:20.889846Z  INFO syncador::backend::s3::shared_bucket::pull: Public file paths: {}
2025-01-09 13:46:20.889Z setup 2025-01-09T13:46:20.889969Z  INFO syncador::backend::s3::shared_bucket::pull: Candidates: []
2025-01-09 13:46:20.890Z setup 2025-01-09T13:46:20.890103Z  INFO syncador::backend::pipeline::pull: Pulling from backend
2025-01-09 13:46:20.890Z setup 2025-01-09T13:46:20.890289Z  INFO syncador::backend::glob::file: Running glob push
2025-01-09 13:46:20.893Z setup 2025-01-09T13:46:20.893080Z  INFO syncador::backend::glob::file: Found 28 files matching the glob
2025-01-09 13:46:20.901Z setup 2025-01-09T13:46:20.901787Z  INFO syncador::backend::pipeline::pull: Pulling from backend
2025-01-09 13:46:20.901Z setup 2025-01-09T13:46:20.901805Z  INFO syncador::backend::archive::gzip::pull: Archive pulling package: None
2025-01-09 13:46:20.901Z setup 2025-01-09T13:46:20.901812Z  INFO syncup: No package pulled
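
The log above ends with an empty candidate list and "No package pulled", which reads as the snapshot restore doing nothing. For anyone triaging similar output, here is a minimal sketch that scans setup logs for those two messages. It assumes the log line shape shown above; the function name and the returned keys are made up for illustration and are not part of syncador.

```python
import re

# Matches "<LEVEL> <rust::module::path>: <message>" as seen in the setup output above.
LINE_RE = re.compile(r"\b(?:INFO|WARN|ERROR)\s+(?P<target>[\w:]+):\s+(?P<msg>.*)$")


def summarize_pull(log_text: str) -> dict:
    """Report whether the log shows an empty candidate list and a skipped pull.

    Hypothetical helper: the key names are illustrative only.
    """
    summary = {"candidates_empty": False, "no_package_pulled": False}
    for line in log_text.splitlines():
        match = LINE_RE.search(line)
        if match is None:
            continue  # not a structured log line
        msg = match.group("msg").strip()
        if msg == "Candidates: []":
            summary["candidates_empty"] = True
        elif msg == "No package pulled":
            summary["no_package_pulled"] = True
    return summary
```

If both flags come back true, the node most likely started from an empty state rather than from a snapshot, which would be worth ruling out before looking at the stream itself.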
@0xmovses 0xmovses assigned 0xmovses and musitdev and unassigned 0xmovses Jan 16, 2025
@l-monninger
Collaborator

l-monninger commented Jan 16, 2025

Follower Replay and Stream Issues

These are addressed, or are being addressed, in #996:

  • We identified error propagation in the stream that needed to be softened. These errors were more likely to be encountered when the replay distance was large.
  • We identified stream closures over http1, regardless of transaction volume, when the light node service is served behind nginx. We determined these to be unresolvable.
  • We identified stream closures over http2 caused by idle connections during low transaction volume when the light node service is used behind nginx with grpc_pass. We have looked at changes at the proxy and transport levels, but have not found a fix at either.
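
To illustrate why low transaction volume matters for the http2 case, here is a rough sketch that finds the gaps between stream messages that exceed a proxy idle timeout. The timeout value is an assumption supplied by the caller, not a measured setting from our nginx deployment, and the function name is made up for this example.

```python
from typing import Iterable, List, Tuple


def idle_gaps(message_times: Iterable[float], idle_timeout: float) -> List[Tuple[float, float]]:
    """Return (previous, next) message times that are further apart than idle_timeout.

    Each returned pair is a window in which a proxy enforcing that idle
    timeout could close the stream. Times are in seconds.
    """
    times = sorted(message_times)
    return [(a, b) for a, b in zip(times, times[1:]) if b - a > idle_timeout]
```

With an assumed 60 second timeout, `idle_gaps([0, 2, 4, 70, 72], 60)` returns `[(4, 70)]`, while a stream that carries some message every 2 seconds has no such gap.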


@musitdev I think the quickest fix is simply to send the Nolo certificates through to the client. We can update v1beta2 to include a heartbeat variant: https://github.com/movementlabsxyz/movement/blob/1913e58f91432bbd9aa9400309379c50e[…]0/proto/movementlabs/protocol_units/da/light_node/v1beta1.proto

Celestia will produce blobs every two seconds, so we can just forward these along as a heartbeat.
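
A rough sketch of the forwarding idea, assuming the stream gets a heartbeat variant alongside the existing blob response. `BlobResponse`, `Heartbeat`, and `stream_with_heartbeats` are hypothetical names for illustration and do not correspond to the current proto definitions.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, List, Tuple, Union


@dataclass(frozen=True)
class BlobResponse:
    """Hypothetical stand-in for the existing blob-carrying stream response."""
    height: int
    data: bytes


@dataclass(frozen=True)
class Heartbeat:
    """Hypothetical heartbeat variant: carries only the DA height."""
    height: int


StreamItem = Union[BlobResponse, Heartbeat]


def stream_with_heartbeats(
    heights: Iterable[Tuple[int, List[bytes]]],
) -> Iterator[StreamItem]:
    """Emit something for every DA height so the stream never sits idle.

    `heights` yields (height, payloads) pairs; payloads is empty when the
    height carried nothing for this rollup.
    """
    for height, payloads in heights:
        if not payloads:
            yield Heartbeat(height)  # nothing to replay, keep the stream warm
            continue
        for data in payloads:
            yield BlobResponse(height, data)


def blobs_only(items: Iterable[StreamItem]) -> List[BlobResponse]:
    """Client side: heartbeats are dropped, only blobs are processed."""
    return [item for item in items if isinstance(item, BlobResponse)]
```

The client only needs to ignore the heartbeat variant, so existing replay logic would stay unchanged under this assumption.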
