Leanpocket V0 #4

Open: wants to merge 79 commits into base: staging

nodiesBlade (Collaborator) commented May 15, 2022

For transparency's sake:

This linked PR records the comments and conversations from the period when LeanPokt was closed source. To read more about the reasons, please read the Core team's comments on why it was closed source.

Disclaimer:

This PR has only been peer reviewed and is not yet Q/A tested. It should not be used in a production fleet and should only be used for testing purposes until it is properly Q/A tested. If you do run it in production, know that you are assuming the risks of:

  1. Missing rewards
  2. Causing chain halt
  3. Disrupting servicing quality

Name: ThunderStake (Pierre Spiegel & Addison Spiegel) | BaaS Pools LLC

Design Specification
pokt-network/pocket-core#1437

Please explain your change in detail.
LeanPocket is a large optimization to the Pocket Core Client (PCC) that allows multiple nodes to share one full node. Servicers/Validators (the node set) can now share the same state cache and blockchain data, and no longer have to validate as many transactions in a block as the node set grows. This reduces the resources needed for n nodes from O(n) to O(1) for memory, disk space, I/O, and networking. Read more here

Functional Goals

  • Add support for multiple servicers under a PCC to handle relays
  • Add support for multiple validators under a PCC to participate in voting rounds and block proposing.
  • Add additional Prometheus metrics for the node set

Modifications made to achieve aforementioned goals

  • Introduction of the PocketNode struct to represent one node under a PCC, and a global map of nodes that maps an address to a PocketNode for each incoming relay
  • Relay logic changed to choose a specific node based on the incoming relay, using the PocketNode struct
  • Merged in Feather Tendermint code with slight modifications and bug fixes, found here
  • Prometheus metrics are now being set with a validator_address tag
  • Add backwards compatibility with legacy Tendermint files when lean pocket is disabled
  • Add metric to keep track of the average time to generate the necessary data to generate a claim and proof
  • Added a less expensive relay call by modifying tendermint to expose IsCatchingUp without overhead of loading all validators
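The node lookup described in the list above could look roughly like the following. This is a minimal sketch, not the PR's code: the `Address` field, `NodeSet`, `AddNode`, and `GetNode` are hypothetical names, and the real PocketNode carries a private key and evidence/session stores that are omitted here.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// PocketNode is a simplified stand-in for one node hosted by the client.
// Hypothetical: the real struct holds a private key and per-node stores.
type PocketNode struct {
	Address string // hex-encoded node address (assumed representation)
}

// NodeSet maps addresses to nodes. Lookups happen once per relay, so they
// take only the read lock; adding a node takes the write lock.
type NodeSet struct {
	mu    sync.RWMutex
	nodes map[string]*PocketNode
}

// ErrNodeNotFound is returned when no hosted node matches the address.
var ErrNodeNotFound = errors.New("node not found for address")

func NewNodeSet() *NodeSet {
	return &NodeSet{nodes: make(map[string]*PocketNode)}
}

// AddNode registers a node under its address.
func (s *NodeSet) AddNode(n *PocketNode) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.nodes[n.Address] = n
}

// GetNode returns the node that should handle a relay addressed to addr.
func (s *NodeSet) GetNode(addr string) (*PocketNode, error) {
	s.mu.RLock()
	defer s.mu.RUnlock()
	n, ok := s.nodes[addr]
	if !ok {
		return nil, ErrNodeNotFound
	}
	return n, nil
}

func main() {
	set := NewNodeSet()
	set.AddNode(&PocketNode{Address: "aa01"})
	if n, err := set.GetNode("aa01"); err == nil {
		fmt.Println("relay routed to", n.Address)
	}
}
```

A read/write mutex is used here because relays only read the map, so concurrent lookups do not block each other.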

Misc changes

  • Removed some potential lock contention by introducing a RWLock
  • Fixed #1428 by submitting the correct UPoktEarned value

Design decisions

  • This proposal aims to be backward compatible with the latest RC on mainnet, keeping exactly the same functionality if node runners choose not to enable it. Therefore, instead of modifying the relay and validator logic directly, we propose adding functionality to both validators and servicers, controlled by a feature flag in the Pocket config called lean_pocket (bool) or by running with the start argument useLean.
  • We kept a siloed in-memory session cache & evidence cache/DB for each node to prevent any lock contentions or observed behaviors
  • All additional functionality will be prefixed or suffixed with lean and can later be removed or integrated completely in future RCs.

Quick User Guide

  1. Create a file inside your Pocket data dir (ex: /home/user/.pocket) called lean_nodes_keys.json and format the file to look like a JSON array of private keys as follows:
    [ { "priv_key": "6f641136790803051ea5b944113434017560d10e10a1b319b6a62146bc83ee361b5d51ada3a55ed716a6e614c4a1459f19ef838cf8e3b20d7c90553d27c9fb61" } ]
  2. Set/Update your validators by running pocket accounts set-validators <path to lean_nodes_keys.json>
  3. Add "lean_pocket": true to your Pocket configuration JSON file, or launch pocket start with --useLean
  4. Start Pocket Core 🚀
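Before running set-validators, it can help to confirm that the key file has the shape shown in step 1. The following is a minimal sketch under stated assumptions: `leanKey` and `parseLeanKeys` are hypothetical helpers, not part of Pocket Core, and the check only covers JSON shape and hex encoding, not whether a key is usable.

```go
package main

import (
	"encoding/hex"
	"encoding/json"
	"fmt"
)

// leanKey mirrors one entry of lean_nodes_keys.json as shown in the guide.
type leanKey struct {
	PrivKey string `json:"priv_key"`
}

// parseLeanKeys decodes the JSON array and rejects entries that are empty
// or not valid hex. It does not verify that Pocket Core accepts the key.
func parseLeanKeys(data []byte) ([]leanKey, error) {
	var keys []leanKey
	if err := json.Unmarshal(data, &keys); err != nil {
		return nil, fmt.Errorf("not a JSON array of key objects: %w", err)
	}
	if len(keys) == 0 {
		return nil, fmt.Errorf("no keys found")
	}
	for i, k := range keys {
		if k.PrivKey == "" {
			return nil, fmt.Errorf("entry %d: missing priv_key", i)
		}
		if _, err := hex.DecodeString(k.PrivKey); err != nil {
			return nil, fmt.Errorf("entry %d: priv_key is not hex: %w", i, err)
		}
	}
	return keys, nil
}

func main() {
	// Placeholder value for illustration only; not a real private key.
	sample := []byte(`[{"priv_key": "0a1b2c"}]`)
	keys, err := parseLeanKeys(sample)
	fmt.Println(len(keys), err)
}
```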

Suggestions:

  1. Pass --forceSetValidators to pocket start to ensure that your validators are updated on each start.
  2. Do not pass --useCache to pocket start, as it can delay restarting your full node

Testing

  • Added modified cache unit tests
  • Added Lean Relay unit tests

Linked Issues
#1428

@nodiesBlade nodiesBlade changed the title Leanpocket v0 [DRAFT] Leanpocket V0 May 15, 2022
@nodiesBlade nodiesBlade force-pushed the leanpocket-v0 branch 2 times, most recently from 3724489 to 678e7af Compare June 23, 2022 17:50
@nodiesBlade nodiesBlade requested review from andrewnguyen22 and luyzdeleon and removed request for andrewnguyen22 June 23, 2022 23:02
@nodiesBlade nodiesBlade changed the title [DRAFT] Leanpocket V0 Leanpocket V0 Jun 23, 2022
nodiesBlade (Collaborator, Author) commented Jun 23, 2022

Hot reloading was moved out of this release goal, as we don't consider it release blocking and there may be additional code changes that we would need to account for to ensure that it is synchronized across PCC and TM. This can be a fast follow or a separate PR review.

TODO:

  • Rebase/Merge Staging for V0.8.3
  • Change Tendermint dependency to the correct origin once published

luyzdeleon (Contributor) left a comment

nodiesBlade (Collaborator, Author)

Unit test is failing: https://gist.github.com/luyzdeleon/c54fb0a449c5f5312521e28544bdd258

Good catch, I pushed a fix for the unit test

luyzdeleon (Contributor) left a comment

How to interpret this peer review

This peer review has 4 sections:

  • Comments: These are general comments, mostly around architecture, design, nomenclature and other items. They are not meant to be strictly enforced, but are suggestions to improve the quality of this submission. Any requested changes will be in the body of the PR itself, flagged at the line of the requested change.
  • Follow up action items: These are future improvements that can be implemented in a later revision.
  • Failed unit tests: This is an account of failed unit tests run by the reviewer, to be fixed. If there are failed unit tests, this PR will not be approved.
  • Failed functional tests: This is a smoke test, where a few relevant functional use cases that touch this functionality are performed to make sure it works. This does not replace a full regression test run; it's meant to ensure the reviewer fully understands the scope of the changes.

Comments

  • I think that the "Lean" prefix is a necessary nomenclature to differentiate, but it might cause confusion and duplication across the codebase, e.g. ValidateLean, isSealedLean and several other functions. Existing functionality could have been easily extended via the PocketConfig.LeanPocket global config flag. One thing to note is that I do agree with CLI functions such as get-validators-lean, because they allow the user to easily differentiate, however the codebase itself could have adapted without the code duplication.
  • I'm assuming forceSetValidatorsLean is meant to allow people to transform their current setups into "LeanPocket" without changing anything and just appending their existing validator to the list. If this is the case I like it and I think it will get people converted to the feature more easily.
  • A lot of the new functions do not have even a general comment about what they do at the signature of the function which could go a long way while reviewing/refactoring this code.
  • I saw your ? above the IsSealed and Seal functions in the Session struct so I want to provide some more context: remember that the CacheObject interface is shared with the Evidence struct, which is sealed for the potential relay leak problem, in this case Sessions don't need to be sealed, which means isSealed just needs to return false.

Tendermint comments

  • Several variables that were converted to slices still have singular names, such as LastSignState and Key; these could be made plural for better readability.
  • I would like to better understand why the Synced function was added at several places in the code.
  • In the http package, the Synced function always returns nil, is this expected behaviour?
  • I don't see unit tests in the changes to the tendermint side.

Follow up action items (for a later revision)

  • I don't know if I fully agree with having a separate SessionCache per LeanNode when you could have a global cache. I read your comment about lock contentions or observed behaviors, but a global cache could reduce complexity and save a bit of memory in cases where the siloed caches contain the same information.

Failed unit tests

Failed functional tests

  • Functional tests were not performed as part of this review due to timeline and tooling restrictions.

x/pocketcore/types/service.go (outdated, resolved)
x/pocketcore/keeper/claim.go (outdated, resolved)
x/pocketcore/keeper/proof.go (outdated, resolved)
x/pocketcore/keeper/service.go (outdated, resolved)
app/query.go (outdated, resolved)
x/pocketcore/keeper/service.go (outdated, resolved)
nodiesBlade (Collaborator, Author) commented Jun 28, 2022

@luyzdeleon

DRY Concerns:

I purposefully kept the code siloed because I was not sure how far modifications to the original logic would be allowed, and to preserve full backwards compatibility on the Pocket Core layer (this can still be achieved without the duplication; I just found it easier this way). I can reduce the duplication by modifying arguments and parameters and by integrating logic between the methods, but there is an added risk of modifying something I shouldn't have, which may not be caught in peer review.

Here are my suggestions:

  • The code, while partially violating DRY, is still very clearly siloed, which allows for easy consolidation of the methods in a future PR, which I pledge to do. This won't be my last PR contribution :) If the Core team considers it release blocking NOW, then of course I will consolidate. I will push back on the grounds that it is not release blocking, and that we should avoid making many more changes than peer-review bug fixes as long as we have a plan to remediate.

If consolidating here's what I suggest (which aligns with what you recommend):

  • HandleRelay, Cache methods, validating methods etc are all methods that accept validator address
  • Downstream methods will pass in the proper address by performing a config check if lean pocket is enabled and passing in the correct address

This should remove the majority of the duplicated code.
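The consolidation proposed above, where one shared handler accepts an address and callers pick that address through a config check, can be sketched as follows. All names here (`Config`, `resolveNodeAddress`, `handleRelay`) are hypothetical and only illustrate the shape of the change, not the PR's actual signatures.

```go
package main

import "fmt"

// Config is a hypothetical stand-in for the Pocket config.
// Only the lean pocket flag matters for this sketch.
type Config struct {
	LeanPocket bool
}

// resolveNodeAddress picks which node should handle a request.
// With lean pocket disabled there is a single node, so its address is used.
// With lean pocket enabled, the address carried by the request selects the node.
func resolveNodeAddress(cfg Config, singleNodeAddr, requestAddr string) string {
	if cfg.LeanPocket {
		return requestAddr
	}
	return singleNodeAddr
}

// handleRelay is the single shared code path. Because it takes the node
// address as a parameter, no duplicated "Lean" variant is needed.
func handleRelay(nodeAddr, payload string) string {
	return fmt.Sprintf("node %s served %q", nodeAddr, payload)
}

func main() {
	cfg := Config{LeanPocket: true}
	addr := resolveNodeAddress(cfg, "single01", "lean02")
	fmt.Println(handleRelay(addr, "eth_blockNumber"))
}
```

Keeping the flag check at the call site means the downstream methods stay identical for both modes, which is what makes the before/after test comparison meaningful.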

Tendermint Sync Comment

I never thought a Discord message would end up in a GitHub PR, but without further ado, here is my reasoning for adding the sync method: it allows for more efficient relays.

https://discord.com/channels/553741558869131266/564836328202567725/983072466157072464

I wasn't aware that we are using the HTTP client, so I left it as an empty stub for now. I was under the assumption that we only use the local client, but if needed, I can add the functionality.

Tendermint comments

  • will look into adding unit tests, renaming

Failed unit tests

  • will look into them

luyzdeleon (Contributor) commented Jun 28, 2022

I agree with the DRY consolidation strategy in that we need to consolidate now, it's what's best for the project long term, as long as we have the correct unit/functional testing in place that confirms both the current functionality and the new functionality is working as intended.

nodiesBlade (Collaborator, Author) commented Jul 5, 2022

Merged in #12 PR

Added a new struct called PocketNode

  • represents an entity in the network that is able to handle dispatches, servicing, challenges, and submit proofs/claims.
  • contains the private key, the respective evidence stores, and a singleton check for whether the cache has already been initialized.

DRY Fixes

  • Most duplicate logic removed, now has a very similar call tree for challenge and relay requests
  • HandleRelay and HandleChallenge now reference a PocketNode and pass downstream dependencies such as EvidenceStore and SessionStore as needed. (These dependencies also now accept CacheStorage pointers.)

CacheObject interface changes

  • Removed isSealed(), Seal()
  • Added isSealable() (session object returns false, evidence returns true)

CacheObject changes

  • struct includes a sync.Map to keep track of sealed objects
  • New function isSealed(CacheObject): uses the embedded seal map to check whether the object is sealed
  • Generic CacheStorage Seal/Store method now checks for seal using isSealed function then seals.
    • Seal method uses session header (hashed) as the key
    • Session does not seal, so returns not ok.
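The seal tracking described above can be sketched as follows. This is an illustration under assumptions, not the PR's implementation: the interface is reduced to two methods, `Key()` is a hypothetical stand-in for the hashed session header, and the exported method names differ from the ones in the PR.

```go
package main

import (
	"fmt"
	"sync"
)

// CacheObject is a reduced version of the interface described above.
type CacheObject interface {
	Key() string      // hypothetical: stands in for the hashed session header
	IsSealable() bool // sessions return false, evidence returns true
}

type Session struct{ Header string }

func (s Session) Key() string      { return s.Header }
func (s Session) IsSealable() bool { return false }

type Evidence struct{ Header string }

func (e Evidence) Key() string      { return e.Header }
func (e Evidence) IsSealable() bool { return true }

// CacheStorage tracks sealed keys in an embedded sync.Map, so sealing state
// lives in the storage rather than on each cached object.
type CacheStorage struct {
	sealed sync.Map // key -> struct{}
}

// IsSealed reports whether the object's key has been sealed.
func (c *CacheStorage) IsSealed(o CacheObject) bool {
	_, ok := c.sealed.Load(o.Key())
	return ok
}

// Seal marks a sealable object as sealed. It returns false when the object
// cannot be sealed (e.g. a session) or was already sealed.
func (c *CacheStorage) Seal(o CacheObject) bool {
	if !o.IsSealable() {
		return false
	}
	_, alreadySealed := c.sealed.LoadOrStore(o.Key(), struct{}{})
	return !alreadySealed
}

func main() {
	store := &CacheStorage{}
	ev := Evidence{Header: "hdr-1"}
	fmt.Println(store.Seal(ev), store.IsSealed(ev))
	fmt.Println(store.Seal(Session{Header: "hdr-2"}))
}
```

`LoadOrStore` makes the check-then-seal step atomic, so two concurrent callers cannot both believe they sealed the same evidence.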

Challenge Request Modifications

  • Challenges now search for a valid pocket node (if it was in the session) and grabs the node’s evidence store to store the challenge evidence

Relay Metric changes

  • If lean pocket is enabled, it will use a goroutine for metric sending (due to concern about lock contention at high load); otherwise it defaults to the pre-lean-pocket behavior

CLI Changes

  • Removal of lean suffix for get-validators
  • Added integration test for getting set validators

Tendermint Consensus Reactor changes

  • Renamed Synced to ConsensusReactorState, more accurately representing the call
  • Implemented HTTP-JSON RPC functionality for this method
  • Added unit test case for the new http call

Pocket Core Test changes

  • Fixed both test cases and ran a before (staging branch) and after comparison. Both versions pass/fail the same test cases.
  • Added unit test case for Pocket Nodes (adding / initing)
  • Modified all tests to use at least one PocketNode

Planned testing changes (TBD; there is a WIP branch, expand-testing, for this, but it shouldn’t block code completeness, and it is not the same as the suites our QA adds)

  • Expand integration tests to support multiple nodes
  • Add integration test for handling relays for multiple pocket nodes
    • Integration test will send relays and then check the evidence store to see if it was properly populated
  • Add integration test for submitting claims / proofs for multiple pocket nodes
  • Add integration test for challenge request for multiple pocket nodes
  • Add integration test for listening to new block production for multiple pocket nodes

Known Bugs

  • New sessions will not be tagged with the proper node address

nodiesBlade (Collaborator, Author) commented

I opened up a new PR for Integration tests: #13 that is based off this branch, @luyzdeleon

luyzdeleon (Contributor) left a comment

Mostly nomenclature contentions; we're on the right path with the codebase. My only major contention with this PR right now is the change of the Status function in the Tendermint node to ConsensusReactorStatus, which I couldn't find in the fork here: https://github.com/fiagdao/tendermint/commits/leanfeatherpocket.

privVal.Key.Address = pk.PubKey().Address()
pocketTypes.InitPVKeyFile(privVal.Key)
privVal := privval.GenFilePV(c.TmConfig.PrivValidatorKey, c.TmConfig.PrivValidatorState)
privVal.Key[0].PrivKey = pk
Contributor

Shouldn't this be privVal.Keys[0].PrivKey = pk?

Contributor

Resolved

app/tendermint.go (outdated, resolved)
@@ -74,6 +74,7 @@ require (
gopkg.in/yaml.v3 v3.0.0-20210107192922-496545a6307b // indirect
)

-replace github.com/tendermint/tendermint => github.com/pokt-network/tendermint v0.32.11-0.20220420160934-de1729fc7dba
+//replace github.com/tendermint/tendermint => github.com/pokt-network/tendermint v0.32.11-0.20220420160934-de1729fc7dba
+replace github.com/tendermint/tendermint => ../tendermint
Contributor

I understand that because this is referencing a private repository it will be complicated to add the proper dependency, however I want to surface that this file should be pointing to the correct version of our Tendermint fork after your PR there is approved.

types/config.go (outdated, resolved)
x/pocketcore/keeper/service.go (outdated, resolved)
x/pocketcore/keeper/service.go (outdated, resolved)
x/pocketcore/module.go (outdated, resolved)
x/pocketcore/types/leanNode.go (outdated, resolved)
app/tendermint.go (outdated, resolved)
app/tendermint.go (outdated, resolved)
app/tendermint.go (outdated, resolved)
x/pocketcore/types/config.go (outdated, resolved)
x/pocketcore/types/pocketNode.go (outdated, resolved)
app/tendermint.go (outdated, resolved)
luyzdeleon (Contributor) left a comment

Last batch of nomenclature changes: let's rename all of this Legacy nomenclature to Global.

nodiesBlade (Collaborator, Author) commented Jul 8, 2022

@luyzdeleon Just one additional call out in regards to your earlier comment about siloed / shared evidence storages.

  • I agree we should look into having a shared store for evidence and sessions. Sessions in particular can be shared, and siloed databases always carry additional overhead since we can't share database buffers, memory, etc. The major concern that pushed me towards a siloed DB is that whenever the evidence/session storage is accessed (via relays, challenges, etc.), it locks the entire database, which didn't seem ideal to pursue in the first iteration. I will definitely revisit and refactor this once we get this release out and observe the limitations.

nodiesBlade (Collaborator, Author) commented Jul 10, 2022

I added a new GET endpoint called /v1/localnodes (I'm open to moving it under the query paths if you think that's where it belongs, @luyzdeleon).

This endpoint exposes the address(es) of the node(s) under a Pocket Core client, and will help us understand network consolidation and usage. Like any other endpoint, it is not critical to servicing, so users can enable or disable it by whitelisting it.

No in-memory caching is applied to this response; we leave it up to the user to cache responses (perhaps at the reverse proxy level).

app/cmd/rpc/server.go (outdated, resolved)
app/cmd/rpc/query.go (outdated, resolved)
app/cmd/rpc/query.go (resolved)
x/pocketcore/types/pocketNode.go (outdated, resolved)
x/pocketcore/types/service.go (outdated, resolved)
app/cmd/rpc/server.go (outdated, resolved)
poktblade and others added 26 commits July 28, 2022 18:36
remove multiple goroutine spinup, add random sleep minimum for claim/proof
fix storing session
bashiru98 commented

@PoktBlade I ran into errors when trying to run this branch.
[Screenshot: Screen Shot 2022-08-07 at 05 57 18 (2)]

bashiru98 commented

Can you verify what the issue might be, @PoktBlade?

Development

Successfully merging this pull request may close these issues.

metrics for pocketcore_service_tokens_earned_for_ is not the amount of token earned
5 participants