Failing to make async request to nodes #671

tedim52 · 2025-02-12T16:56:04Z

Describe the bug
I'm trying to create a local Etna devnet. I've been following the steps here to do so:
https://github.com/ava-labs/etna-devnet-resources/tree/main

All the previous steps are working:

generating genesis
creating chain
converting subnet to L1
deploying validator manager contract and initializing it

The only thing that's failing right now is the last step, initializing the validator set. I'm following the code here exactly:

https://github.com/ava-labs/etna-devnet-resources/blob/eba4c775127b66a466ba7429ff46ee8613bb9aa1/manual_etna_evm/cmd/01_12_initialize_validator_set.go#L60

To Reproduce

...
func initializeValidatorSet(
  signatureAggregator, err := interchain.NewSignatureAggregator(
    network,
    logging.Level(logging.Debug),
    subnetId,
    interchain.DefaultQuorumPercentage,
    true,
    peers,
  )
  if err != nil {
    return fmt.Errorf("failed to create signature aggregator: %w", err)
  }

  subnetConversionSignedMessage, err := signatureAggregator.Sign(subnetConversionUnsignedMessage, subnetId[:])
  if err != nil {
    return fmt.Errorf("failed to sign subnet conversion unsigned message: %w", err)
  }
...

The part that is failing is the call to signatureAggregator.Sign(subnetConversionUnsignedMessage, subnetId[:]). I've attached logs below. It seems like the aggregator is failing to make requests to the nodes. I am running the AvalancheGo node image avaplatform/avalanchego:v1.12.0-fuji.

Expected behavior
The signature aggregator is able to communicate with nodes and receive signatures.

Logs

{"level":"debug","timestamp":"2025-02-12T11:46:50.161-0500","logger":"init-aggregator","caller":"aggregator/aggregator.go:230","msg":"Aggregator collecting signatures from peers.","attempt":10,"sourceBlockchainID":"11111111111111111111111111111111LpoYY","signingSubnetID":"2cJveLDAymdrcLAt7nsLsHwA1J59bLUSBzoz3hom7yK5V443wT","validatorSetSize":2,"signatureMapSize":0,"responsesExpected":2}
{"level":"debug","timestamp":"2025-02-12T11:46:50.161-0500","logger":"init-aggregator","caller":"aggregator/aggregator.go:250","msg":"Added node ID to query.","nodeID":"NodeID-EYtLNPTJn5YhvYUmcDhG8CyDmUpKYZmzC","warpMessageID":"yyCAW17g4ekyyzCUJKSd4Z86yBVQnNjGoCw3LiPncK6kWjEtf","sourceBlockchainID":"11111111111111111111111111111111LpoYY"}
{"level":"debug","timestamp":"2025-02-12T11:46:50.161-0500","logger":"init-aggregator","caller":"aggregator/aggregator.go:250","msg":"Added node ID to query.","nodeID":"NodeID-ALEhUtjbpiZrgpdXUJCh7EJdX4LwLjCDe","warpMessageID":"yyCAW17g4ekyyzCUJKSd4Z86yBVQnNjGoCw3LiPncK6kWjEtf","sourceBlockchainID":"11111111111111111111111111111111LpoYY"}
{"level":"debug","timestamp":"2025-02-12T11:46:50.161-0500","logger":"p2p-network","caller":"peers/external_handler.go:132","msg":"Registering request ID","requestID":1231224323}
{"level":"debug","timestamp":"2025-02-12T11:46:50.161-0500","logger":"init-aggregator","caller":"aggregator/aggregator.go:270","msg":"Sent signature request to network","warpMessageID":"yyCAW17g4ekyyzCUJKSd4Z86yBVQnNjGoCw3LiPncK6kWjEtf","sentTo":[],"sourceBlockchainID":"11111111111111111111111111111111LpoYY","sourceSubnetID":"11111111111111111111111111111111LpoYY","signingSubnetID":"2cJveLDAymdrcLAt7nsLsHwA1J59bLUSBzoz3hom7yK5V443wT"}
{"level":"warn","timestamp":"2025-02-12T11:46:50.161-0500","logger":"init-aggregator","caller":"aggregator/aggregator.go:280","msg":"Failed to make async request to node","nodeID":"NodeID-ALEhUtjbpiZrgpdXUJCh7EJdX4LwLjCDe"}
{"level":"warn","timestamp":"2025-02-12T11:46:50.161-0500","logger":"init-aggregator","caller":"aggregator/aggregator.go:280","msg":"Failed to make async request to node","nodeID":"NodeID-EYtLNPTJn5YhvYUmcDhG8CyDmUpKYZmzC"}
{"level":"warn","timestamp":"2025-02-12T11:46:50.161-0500","logger":"init-aggregator","caller":"aggregator/aggregator.go:342","msg":"Failed to collect a threshold of signatures","attempts":10,"warpMessageID":"yyCAW17g4ekyyzCUJKSd4Z86yBVQnNjGoCw3LiPncK6kWjEtf","accumulatedWeight":0,"sourceBlockchainID":"11111111111111111111111111111111LpoYY"}
an error occurred while initializing validator set: failed to sign subnet conversion unsigned message: failed to collect a threshold of signatures
exit status 1
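
In the logs above, the "Sent signature request to network" entry has an empty sentTo list, and both nodes then produce "Failed to make async request to node" warnings, which suggests the request never reached any peer. As a rough triage aid, the sketch below scans JSON log lines for those two messages. The struct and function names are hypothetical, and it assumes only the field names visible in the log lines above.

```go
package main

import (
	"bufio"
	"encoding/json"
	"fmt"
	"strings"
)

// logEntry holds only the fields this sketch inspects. The JSON keys are
// copied from the log lines in this issue; everything else is ignored.
type logEntry struct {
	Msg    string            `json:"msg"`
	NodeID string            `json:"nodeID"`
	SentTo []json.RawMessage `json:"sentTo"`
}

// summarize counts signature requests that were sent to no peers and
// collects the node IDs the aggregator reported it could not reach.
func summarize(logs string) (emptySends int, unreachable []string) {
	scanner := bufio.NewScanner(strings.NewReader(logs))
	for scanner.Scan() {
		var e logEntry
		if err := json.Unmarshal(scanner.Bytes(), &e); err != nil {
			continue // skip non-JSON lines such as "exit status 1"
		}
		switch e.Msg {
		case "Sent signature request to network":
			if len(e.SentTo) == 0 {
				emptySends++
			}
		case "Failed to make async request to node":
			unreachable = append(unreachable, e.NodeID)
		}
	}
	return emptySends, unreachable
}

func main() {
	logs := `{"level":"debug","msg":"Sent signature request to network","sentTo":[]}
{"level":"warn","msg":"Failed to make async request to node","nodeID":"NodeID-ALEhUtjbpiZrgpdXUJCh7EJdX4LwLjCDe"}
exit status 1`
	emptySends, unreachable := summarize(logs)
	fmt.Printf("requests sent to no peers: %d\n", emptySends)
	fmt.Printf("unreachable nodes: %v\n", unreachable)
}
```

A non-zero count of empty sends points at peer connectivity or at the aggregator not recognising the message, rather than at the nodes rejecting a request they actually received.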

Operating System
arm64


meaghanfitzgerald · 2025-02-12T17:00:31Z

The Etna DevNet has been torn down now that testing of the Etna upgrade has been completed and the upgrade activated on both Fuji and Mainnet. This repository is archived for reference, but commands and tutorials in it are no longer maintained and will likely not work. For up-to-date documentation and tutorials on how to launch Avalanche L1s, see https://docs.avax.network/.

meaghanfitzgerald · 2025-02-12T17:04:33Z

I recommend this tutorial for creating an Avalanche L1 on Fuji Testnet using the CLI: https://build.avax.network/docs/tooling/create-deploy-avalanche-l1s/deploy-on-fuji-testnet.

tedim52 · 2025-02-12T17:05:44Z

Hey @meaghanfitzgerald! Thanks for the response. The error and logs I provided relate to the function signatureAggregator.Sign(subnetConversionUnsignedMessage, subnetId[:]), which ends up in this repository, and this repository isn't archived and seems to be maintained. Would it be possible to get some insights on that before closing the issue?

Specifically, I'm calling this in the avalanche-cli at https://github.com/ava-labs/avalanche-cli/blob/2c76e3f375fef92bdab110173cab7e6233d0ff13/sdk/interchain/signature-aggregator.go#L195, which I believe calls into the following code in this repository:

icm-services/signature-aggregator/aggregator/aggregator.go

Line 144 in f9f3fe3

func (s *SignatureAggregator) CreateSignedMessage(

meaghanfitzgerald · 2025-02-18T18:49:10Z

Hey @tedim52, I believe @containerman17 and @owenwahlgren might be able to help you understand why you're hitting this error.

iansuvak · 2025-02-18T21:16:23Z

One of the most common reasons the signature aggregator can't reach your L1 nodes is that the nodes fail NAT traversal and so can't automatically determine their own IP. Try restarting your validator nodes with the --public-ip setting configured and see if that works.

tedim52 · 2025-02-25T17:35:24Z

Hey! Thanks for the response @iansuvak. I am running a local Avalanche node network inside a Docker subnetwork, and the signature aggregator code runs in a service in that same Docker subnetwork. When I set --public-ip to the IP address of the node in that Docker subnetwork (e.g. 172.12.0.5), the signature aggregator fails with the same issue.

iansuvak · 2025-02-25T19:54:34Z

Hey @tedim52. It sounds like you are using the CLI to do the conversion. If you try the latest RC version of the CLI, it might work.

The problem is that the signature aggregator discovers how to reach each node through peer gossip on the primary network, and it does so on demand, at the time of the first request. So it is expected for the first request to take longer and for subsequent ones for the same subnet/L1 to be quicker. The newest RC tag of the CLI adds a step that waits for the network to report as healthy and connected to the subnet before attempting to aggregate signatures.

Please give this a try and report back if it doesn't work. We are likely going to add a config workaround that allows explicit manual tracking of the IPs associated with NodeIDs, intended for these private subnetwork use cases.

tedim52 · 2025-02-25T21:15:52Z

Hey, so it turns out the issue was that the validators list used to construct the subnetConversionUnsignedMessage (the message the signature aggregator tries to sign) needs to be in exactly the same order as the validators list supplied to IssueConvertSubnetToL1Tx. When I make sure the orders are the same, the signature aggregation works. Huge thanks to @owenwahlgren for reading through my code to find the fix!

convertSubnetToL1(){
	// The order of validators here needs to match the validators provided to subnetConversionData below
	avaGoBootstrapValidators, err := blockchaincmd.ConvertToAvalancheGoSubnetValidator(validators)
	if err != nil {
		return ids.Empty, fmt.Errorf("failed to convert to AvalancheGo subnet validator: %w", err)
	}
	for idx, validator := range avaGoBootstrapValidators {
		fmt.Printf("Ava Go Bootstrap Validators: %v %v\n", idx, validator.NodeID)
	}

	tx, err := w.p.P().IssueConvertSubnetToL1Tx(
		subnetId,
		chainId,
		managerAddress.Bytes(),
		avaGoBootstrapValidators,
		options...,
	)
	if err != nil {
		return ids.Empty, fmt.Errorf("an error occurred issuing convert subnet to l1 tx: %w", err)
	}
}

initValidatorSet(
	subnetConversionData := message.SubnetToL1ConversionData{
		SubnetID:       subnetId,
		ManagerChainID: blockchainId,
		ManagerAddress: managerAddress.Bytes(),
		Validators:     validators,
	}
	subnetConversionId, err := message.SubnetToL1ConversionID(subnetConversionData)
	if err != nil {
		return fmt.Errorf("failed to create subnet conversion ID: %w", err)
	}

	addressedCallPayload, err := message.NewSubnetToL1Conversion(subnetConversionId)
	if err != nil {
		return fmt.Errorf("failed to create addressed call payload: %w", err)
	}

	subnetConversionAddressedCall, err := payload.NewAddressedCall(
		nil,
		addressedCallPayload.Bytes(),
	)
	if err != nil {
		return fmt.Errorf("failed to create addressed call: %w", err)
	}

	network := models.NewNetwork(3, 1337, nodeRpcUri, "")

	subnetConversionUnsignedMessage, err := warp.NewUnsignedMessage(
		network.ID,
		avagoconstants.PlatformChainID,
		subnetConversionAddressedCall.Bytes(),
	)
	if err != nil {
		return fmt.Errorf("failed to create unsigned message: %w", err)
	}

	subnetConversionSignedMessage, err := signatureAggregator.Sign(subnetConversionUnsignedMessage, subnetId[:])
	if err != nil {
		return fmt.Errorf("failed to sign subnet conversion unsigned message: %w", err)
	}
)
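
Since the fix came down to list ordering, a cheap guard is to compare the two validator lists before building the unsigned message. The likely reason order matters is that the conversion ID is derived from the serialized conversion data, so a reordered list yields a different ID that the nodes will not sign; treat that explanation as an assumption rather than something confirmed in this thread. The sketch below uses plain strings as stand-ins for node IDs, and firstOrderMismatch is a hypothetical helper, not part of any library.

```go
package main

import "fmt"

// firstOrderMismatch returns the first index at which the two node ID lists
// differ (including one list being shorter), or -1 if they match in both
// length and order.
func firstOrderMismatch(issued, signed []string) int {
	for i := range issued {
		if i >= len(signed) || issued[i] != signed[i] {
			return i
		}
	}
	if len(signed) > len(issued) {
		return len(issued)
	}
	return -1
}

func main() {
	// Order passed to IssueConvertSubnetToL1Tx (node IDs taken from the logs above).
	issued := []string{
		"NodeID-EYtLNPTJn5YhvYUmcDhG8CyDmUpKYZmzC",
		"NodeID-ALEhUtjbpiZrgpdXUJCh7EJdX4LwLjCDe",
	}
	// Same validators, but in the opposite order, as might happen if the
	// list is rebuilt from a map or re-sorted before signing.
	signed := []string{issued[1], issued[0]}

	fmt.Println(firstOrderMismatch(issued, issued)) // -1
	fmt.Println(firstOrderMismatch(issued, signed)) // 0
}
```

Failing fast on a mismatch gives a much clearer error than the aggregator's "failed to collect a threshold of signatures".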

tedim52 added the bug Something isn't working label Feb 12, 2025

meaghanfitzgerald closed this as completed Feb 12, 2025

meaghanfitzgerald reopened this Feb 18, 2025

tedim52 closed this as completed Feb 25, 2025
