Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: sequencer check #22

Closed
wants to merge 1 commit into from
Closed

fix: sequencer check #22

wants to merge 1 commit into from

Conversation

bryanchriswhite
Copy link

@bryanchriswhite bryanchriswhite commented Jan 30, 2024

Description

Background

While attempting to migrate Pocket Network's rollkit-cosmos-sdk-based rollup, poktroll from rollkit/cosmos-sdk v0.47.3-rollkit-v0.10.6-no-fraud-proofs to v0.50.1-rollkit-v0.11.9-no-fraud-proofs, we encountered an issue that seems to be related to importing the genesis.json during chain initialization.

Since our appchain doesn't yet have any live networks running it, our simplified migration strategy (as opposed to applying the migration guide directly) is to re-scaffold substantial portions of the appchain (using ignite-cli), particularly with respect to app wiring and dependency injection. We plan to re-scaffold our modules next and then subsequently transplant our existing types, business logic, and tests into this fresh scaffold. However, before we even got to the module re-scaffolding step, we encountered this genesis import issue.

Initially, we encountered a slightly different issue that was coming from a github.com/rollkit/rollkit package and seems to be related to changes that were introduced between version:

rollkit/types.NewFromGenesisDoc():64

...
	if len(genDoc.Validators) != 1 {
		return State{}, fmt.Errorf("must have exactly 1 validator (the centralized sequencer)")
	}
...

The solution to this issue was to apply the same modification to the post- ignite chain build genesis.json (derived from our config.yml) as the rollkit init-local.sh script does (as per the rollkit docs):

# copy centralized sequencer address into genesis.json
# Note: validator and sequencer are used interchangeably here
ADDRESS=$(jq -r '.address' ~/.gm/config/priv_validator_key.json)
PUB_KEY=$(jq -r '.pub_key' ~/.gm/config/priv_validator_key.json)
jq --argjson pubKey "$PUB_KEY" '.consensus["validators"]=[{"address": "'$ADDRESS'", "pub_key": $pubKey, "power": "1000", "name": "Rollkit Sequencer"}]' ~/.gm/config/genesis.json > temp.json && mv temp.json ~/.gm/config/genesis.json

This inserts a validators property into the top-level consensus property (not to be confused with that in the staking top-level property nor a top-level validators property):

image

After making this modification to the genesis.json we were able to progress to the next issue (below), which we confirmed is downstream (in terms of execution / chain initialization) and seems to be rooted in an incompatibility between this modified genesis.json structure and how cosmos-sdk is trying to import and validate it.

Symptoms

Because the genesis now has a consensus.validators array, the if len(req.Validators) > 0 in baseapp/abci.go:111 is evaluating to true, which causes the !proto.Equal(...) assertion to be evaluated. This assertion is subsequently failing, seemingly for two distinct reasons.

image

In the context above req is an object that represents the parsed genesis.json (which importantly includes genutils "gentx"s). res represents an object that is the result of attempting to import the genesis (via a quite convoluted execution path) and ultimately apply validator updates to the validator set in the staking module. This mechanism provides an opportunity, apparently by design, to modify res.Validators[0] such that it may no longer match req.Validators[0], necessitating the need for this check and causing this issue.

This process can be extended by providing a custom module.InitChainer function in app.go:

	// A custom InitChainer can be set if extra pre-init-genesis logic is required.
	// By default, when using app wiring enabled module, this is not required.
	// For instance, the upgrade module will set automatically the module version map in its init genesis thanks to app wiring.
	// However, when registering a module manually (i.e. that does not support app wiring), the module version map
	// must be set manually as follow. The upgrade module will de-duplicate the module version map.
	//
	// app.SetInitChainer(func(ctx sdk.Context, req *abci.RequestInitChain) (*abci.ResponseInitChain, error) {
	// 	app.UpgradeKeeper.SetModuleVersionMap(ctx, app.ModuleManager.GetVersionMap())
	// 	return app.App.InitChainer(ctx, req)
	// })

1. Validator power mismatch:

Part of the x/staking/keeper.Keeper#ApplyAndReturnValidatorSetUpdates() method involves calculating the validator "power", as opposed to the number of staked tokens. It seems that the reduction of this number (via x/staking/types.Validator#ConsensusPower) is causing a mismatch between the values of the res and req validator slices:

image

2. Validator pubkey slice capacity mismatch:

Even after correcting for the power mismatch, we still see the same error. The only remaining difference between the req and res validator slices seems to be the capacity of the "internal" uint8 slices containing their public key bytes:

image

Versions:

Go: 1.21.6
Ignite-cli: 28.1.0 (same results with 28.1.1)
Cosmos-SDK: 0.47.3 => 0.50.1
Rollkit: 0.10.6 => 0.11.9

How to reproduce

Assuming go version 1.21.6 is the only go version in the $PATH:

# Install `kind` (skip if already installed)
go install sigs.k8s.io/[email protected]
# Create a `kind` cluster (skip if already exists)
kind create cluster

# Clone the `helm-charts` repo (as a sibling to `poktroll`) & checkout the `v0.50-migration` branch
$(cd ../ && git clone https://github.com/pokt-network/helm-charts.git && git checkout v0.50-migration)

git clone https://github.com/pokt-network/poktroll
# Use the local helm-charts
sed -i -E 's,enabled: false,enabled: true,' ./localnet_config.yaml
git checkout repro/sequencer-check-issue
make install_ci_deps
go get [github.com/bufbuild/buf@latest](http://github.com/bufbuild/buf@latest)  # may be redundant
make localnet_up

After the repo finishes building, tilt will start, at which point you can press "space" to open the web UI.

image

The selecting the "sequencer" service will present the relevant logs which should contain the error as outlined above.

NOTE: If you see errors regarding go binaries and/or packages being compiled with a different version of go than 1.21.6, consider wiping the module cache with go clean -modcache.

Not sure if ...

Based on the nature of the error, assuming we haven't overlooked some detail, I would expect that any rollkit-cosmos-sdk based project attempting a similar migration will also experience this same issue. This makes me question whether we did indeed overlook something as the only alternative I can imagine is that we may just be the first to try this migration using freshly scaffolded wiring and/or to get this far in the migration in general.

image

Fig 1. @h5law (left) and @bryanchriswhite (right) expressing our confusion and uncertainty when faced with the conclusions we've inevitably arrived at based on our observations.


Author Checklist

All items are required. Please add a note to the item if the item is not applicable and
please add links to any relevant follow up issues.

I have...

  • included the correct type prefix in the PR title
  • added ! to the type prefix if API or client breaking change
  • targeted the correct branch (see PR Targeting)
  • provided a link to the relevant issue or specification
  • followed the guidelines for building modules
  • included the necessary unit and integration tests
  • added a changelog entry to CHANGELOG.md
  • included comments for documenting Go code
  • updated the relevant documentation or specification
  • reviewed "Files changed" and left comments if necessary
  • run make lint and make test
  • confirmed all CI checks have passed

Reviewers Checklist

All items are required. Please add a note if the item is not applicable and please add
your handle next to the items reviewed if you only reviewed selected items.

I have...

  • confirmed the correct type prefix in the PR title
  • confirmed ! in the type prefix if API or client breaking change
  • confirmed all author checklist items have been addressed
  • reviewed state machine logic
  • reviewed API design and naming
  • reviewed documentation is accurate
  • reviewed tests and test coverage
  • manually tested (if applicable)

Co-autored-by: Harry Law <[email protected]>

(cherry picked from commit 1d27fde)
Copy link

coderabbitai bot commented Jan 30, 2024

Important

Auto Review Skipped

Auto reviews are disabled on base/target branches other than the default branch. Please add the base/target branch pattern to the list of additional branches to be reviewed in the settings.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository.

To trigger a single review, invoke the @coderabbitai review command.

Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

Share

Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>.
    • Generate unit-tests for this file.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit tests for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai generate interesting stats about this repository from git and render them as a table.
    • @coderabbitai show all the console.log statements in this repository.
    • @coderabbitai read src/utils.ts and generate unit tests.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (invoked as PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger a review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai help to get help.

Additionally, you can add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.

CodeRabbit Configration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • The JSON schema for the configuration file is available here.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/coderabbit-overrides.v2.json

CodeRabbit Discord Community

Join our Discord Community to get help, request features, and share feedback.

@bryanchriswhite bryanchriswhite changed the title fix: sequencer power calculation fix: sequencer check Jan 30, 2024
@bryanchriswhite bryanchriswhite marked this pull request as ready for review January 30, 2024 09:38
if !proto.Equal(&res.Validators[i], &req.Validators[i]) {
return nil, fmt.Errorf("genesisValidators[%d] != req.Validators[%d] ", i, i)
// Get the public key bytes for request and response validators at index i.
// NB: The *capacity* of tye PubKey.Sum uint8 byte slice is inconsistent
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// NB: The *capacity* of tye PubKey.Sum uint8 byte slice is inconsistent
// NB: The *capacity* of the PubKey.Sum uint8 byte slice is inconsistent

@bryanchriswhite
Copy link
Author

bryanchriswhite commented Feb 7, 2024

Closing as this turned out to be a false alarm. The issue resulted from a misunderstanding in the application of the init-local.sh changes that was never reconciled with the subsequent realization of the distinction between "stake" and "power.

So long as the "power" is hard-coded in the init-local.sh (or analogue thereof; Makefile, in our case), there's a risk that it gets out of sync with the "stake" specified in config.yml. The same concern applies to the "name" field.

Here's an updated version of the relevant section of init-local.sh:

# copy centralized sequencer address into genesis.json
# Note: validator and sequencer are used interchangeably here
ADDRESS=$(jq -r '.address' ~/.gm/config/priv_validator_key.json)
PUB_KEY=$(jq -r '.pub_key' ~/.gm/config/priv_validator_key.json)
# NB: Currently the stake => power calculation is constant; however, cosmos-sdk
# intends to make this parameterizable in the future.
POWER=$(yq ".validators[0].bonded" ./config.yml | sed 's,000000upokt,,')
NAME=$(yq ".validators[0].name" ./config.yml)
jq --argjson pubKey "$PUB_KEY" '.consensus["validators"]=[{"address": "'$ADDRESS'", "pub_key": $pubKey, "power": "'$POWER'", "name": "'$NAME'"}]' ~/.gm/config/genesis.json > temp.json && mv temp.json ~/.gm/config/genesis.json

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant