Skip to content
This repository has been archived by the owner on Feb 1, 2024. It is now read-only.

{Deprecated. Newer at PR#20} PoET 2.0 Consensus #12

Closed
wants to merge 1 commit into from

Conversation

askmish
Copy link

@askmish askmish commented May 23, 2018

Signed-off-by: Ashish Kumar Mishra [email protected]

@aludvik
Copy link
Contributor

aludvik commented May 24, 2018

Quick general comment, I would be careful with your usage of the word "valid". I believe it is good to distinguish between a block's validity status and block's commit status. A block's validity should only depend on deterministic computation, things like signature verification, structure verification, batch validity relative to a state hash, etc. It should not be possible for a block's validity status to change over time. A block's committed status however can and should change over time based on the state of the network and other block's that have been considered.

To be more specific, I would not use "valid" in this sentence in the "Early Arriving blocks" section: If the block is invalid i.e. it is an early-arriving block. I would refer to block's that are dropped immediately because there is something intrinsically wrong with them (like a bad WaitCertificate) as invalid and refer to block's that do not have anything wrong with them but which we are not committing as something else, possibly "pending" or just "valid but not committed".

1. Generate an ECDSA key pair. This pair is held in SGX memory and NEVER committed to disk for reuse subsequently.
2. Create a table in SGX memory to map the Block Number to WaitCertificates. Also store the block number of the chain head in SGX.
3. Generate a Quote containing the ECDSA public key and perform Quote verification with the IAS.
4. Check if there was a previous ECDSA key registration. If there was, verify if ‘K’ blocks have been added to the chain since the key’s registration. If K blocks have not passed, wait until K blocks have been added.
Copy link
Contributor

@dcmiddle dcmiddle May 24, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Check if there was a previous registration by what? by the same EPID Pseudonym maybe?
If K means the same thing in PoET2 as PoET1 (maximum registration life) I don't think we want to wait k blocks, because if the process dies, that machine can't re-register for potentially a really long time.

2. Create a table in SGX memory to map the Block Number to WaitCertificates. Also store the block number of the chain head in SGX.
3. Generate a Quote containing the ECDSA public key and perform Quote verification with the IAS.
4. Check if there was a previous ECDSA key registration. If there was, verify if ‘K’ blocks have been added to the chain since the key’s registration. If K blocks have not passed, wait until K blocks have been added.
5. Register the ECDSA Public key and the IAS response with the ledger.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need a better name for the keys. Consider using the PoET1 terminology, e.g. PPK. The fact that it's an ECDSA key is ancillary.

Copy link

@annechenette annechenette left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice RFC! I have a few wording comments and consistency nitpicks.

This document details a new mechanism for the PoET algorithm that overcomes some of the challenges with the original algorithm.
The intended audience of this document is architects, developers and anyone who would like to get a better understanding of the details of the PoET 2.0 algorithm.

This RFC describes a new PoET 2.0 Consensus for supporting the SGX PSE-free machines.
Copy link

@annechenette annechenette May 24, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The phrase "SGX PSE-free machines" could be interpreted two different ways: 1) Machines without SGX and without PSE, or 2) Machines with SGX but without PSE. My first interpretation was 1, so I was confused until later in the paper.

The next section uses the term "Platform Services". Would that work here? If so, this phrase would be more clear if it was something like this:

 SGX machines without Platform Services

or

 machines without SGX Platform Services 

Otherwise, add "Enclave (PSE)" to the end of the phrase.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And a minor note: Best tech-writing practice is to use the full term "Intel® Software Guard Extensions (SGX)" here, because it's the first time the term is used. But if that makes the sentence confusing or too long, it's probably okay to use "SGX" here, because the full term is in the next section.

# Guide-level explanation
[guide-level-explanation]: #guide-level-explanation

In PoET 1.0, the PoET SGX enclave set a timer internally and would only release a WaitCertificate, once the timer elapsed. To be considered a candidate for consensus, a block (a.k.a Claim Block) would have to be accompanied by the corresponding WaitCertificate. This meant that peers receiving a Claim Block accompanied by the WaitCertificate, were assured that the block had not been forwarded prematurely, hence the phrase ‘Proof of Elapsed Time’.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nitpick: The term "claim block" is written inconsistently. At first, it's "Claim Block". The rest of the paper uses "claim block", but there's also two occurrences of "ClaimBlock" without the space.

Technical writers prefer lower case (claim block), because it's easier to read. Plus, fewer terms would have to change. But "Claim Block" is acceptable as long as it's consistent.

To handle clock drifts, it may become necessary to synchronize with a standard network clock. Details are out of scope for this version of the document.

## Block Creation
Once the algorithm is in a state to publish blocks (i.e. the C test passes), it starts the process of creating claim blocks with the aim of participating in the Consensus mechanism.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I found "the C test" to be confusing. Is it the same thing as Step 6 in the previous section? ("Wait for ‘C’ blocks to be added ..."). Could you clarify here?

## Wait Certificate Creation
Once a claim block is created, the PoET algorithm requests the PoET enclave to create the WaitCertificate.

The contents of the WaitCertificate for PoET 2.0 are:

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This information is hard to read in paragraph form. I suggest using a bulleted list. For example:

 The contents of the WaitCertificate for PoET 2.0 are:

 * Duration: A random 256 bit number generated by SGX
 * WaitTime: The number of seconds to wait, determined by the LocalMean
 .
 .
 .

If you don't like bullets, you could simply add a blank line between each item so that each renders as its own paragraph.

The third choice is an RST term-definition list, but I always forget the syntax for that one.

Upon receiving a claim block and a WaitCertificate, the receiving validator does the following:
### Sanity checks
1. Ensure the Block and WaitCertificate signatures are valid
2. The Block’s ID matches the WaitCert’s blockID

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another nitpick: I suggest spelling out "WaitCertificate" (here and elsewhere) instead of abbreviating it in a few places. Consistency is good. :-)

# Rationale and alternatives
[alternatives]: #alternatives

Poet 2.0 improves on PoET 1.0 adding support for PSE free devices. Current PoET 1.0 limits the availability of PoET consensus only to SGX PSE enabled devices.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As with my first comment above, I suggest using the term "(SGX) devices with/without Platform Services" (or something similar), instead of "PSE-free" or "PSE-enabled".

# Motivation
[motivation]: #motivation

PoET 1.0 offers a very unique and efficient way to achieve consensus based on secure timers set inside an Intel® Software Guard Extensions (SGX) enclave. The secure timer and the monotonic counters used in PoET 1.0 rely on _SGX Platform Services_ to access these services from the system hardware. _SGX Platform Services_, unfortunately, are not yet available on all SGX enabled machines. While plans are in place for introducing these services universally across the processor lineup, it will take some time to release the services and for them to become ubiquitous. PoET 2.0 address these challenges by making the PoET algorithm independent of the _SGX Platform Services_.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PoET 2.0 addresses these challenges by making the PoET algorithm independent of the SGX Platform Services.

10. If the block is invalid i.e. it is an early-arriving block, compare the duration with the duration of the claim block. If the block duration is smaller, drop the claim block. Wait for the WC to catch up with the CC. If the claim block duration is less than the incoming block duration, discard the incoming block.
11. Proceed to step 6 once the block becomes valid.
### Fork Resolution
12. If the new block claims an earlier block as a parent, check for any cached blocks claiming the new block as a parent, retrieve them. Compute the new CC and check if the chain is valid (CC + E < WC)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should not this be a recursive call until a terminal child is reached within the cache? There could be a fork comprising of multiple (>1 block) in the cache.

Copy link
Contributor

@vaporos vaporos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The text should be wrapped at <80 columns like the existing RFCs.

I think generally this RFC is currently light on detail given the extremely complex nature of the topic. It assumes a working understanding of current PoET, but should probably fill those knowledge gaps as well so it provides a complete picture. Otherwise, a lot of assumptions need to be made by the reader.

When the PoET 2.0 module is initialized, it performs the following actions:
1. Generate an ECDSA key pair. This pair is held in SGX memory and NEVER committed to disk for reuse subsequently.
2. Create a table in SGX memory to map the Block Number to WaitCertificates. Also store the block number of the chain head in SGX.
3. Generate a Quote containing the ECDSA public key and perform Quote verification with the IAS.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't this a regression, in that PoET 1.0 was specifically design to avoid talking to IAS frequently? In this sequence, a conversation with IAS is necessary between consensus engine restarts because the ECDSA key pair is not stored.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The signup has to happen every time the process starts/system reboots based on the K-test, similar to what we have in poet 1.0.

The Wall Clock (WC) measures the time elapsed since the arrival of the ‘Sync Block’, i.e. the first block on the chain.
The Chain Clock (CC) measures the cumulative wait duration of all the blocks in the chain (each fork will have its own CC).
Upon receiving the Sync Block, WC and CC are initialized to 0. In practice, the WC may be calculated by recording the system time at the moment of the arrival of the Sync Block and subsequently subtracting this timestamp from the current time.
To handle clock drifts, it may become necessary to synchronize with a standard network clock. Details are out of scope for this version of the document.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this RFC requires all nodes to sync time across the network to get correct behavior, this needs to be addressed here in detail, since it is both non-trivial and probably not realistic. No such assumptions are made in the current version of PoET.

ValidatorID: The ID of the current Validator
Sign: The signature of the WaitCertificate computed over all the fields using the ECDSA private key

Once the WaitCertificate is created, the enclave stores the Last Block # and the WaitCertificate in the table created previously. The validator will only create a single wait certificate per block. This implies that if a fork containing fewer blocks becomes active, the validator will NOT be able to publish any more blocks until the new fork has added enough blocks to catch up to the previous chain.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What impact will this have on attempting to keep a PoET network running? Many of the current PoET tests are problematic in testing situations, or situations with low node counts; this would seem to fall into that category, and presumably would increase these problems. The existing PoET tests (checks would be a better word here) can essentially be disabled as necessary for testing purposes, but this sounds more fundamental.

Signed-off-by: Ashish Kumar Mishra <[email protected]>
@askmish askmish force-pushed the poet2 branch 2 times, most recently from 8403e7c to 4fc6c81 Compare June 14, 2018 10:20
11. Proceed to step 6 once the block becomes valid.
### Fork Resolution
12. If the new block claims an earlier block as a parent, check for any cached blocks claiming the new block as a parent, retrieve them. Compute the new CC and check if the chain is valid (CC + E < WC)
13. If valid, compare existing chain length with the new chain length. If existing chain length is more, discard new chain. If the new chain is longer, switch to the new chain (commit the new blocks)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a question about this fork resolution process. Can someone explain why we only need to compare chain lengths (rather than check the total number of "work" done on the chain--i.e., incorporate the population estimate)?

Here's my thought process: suppose I controlled a small group of nodes. We could could fork off everything to a side chain and then wait for the "difficulty" to go down. In the steady state, due to the population adjustment, we would be adding blocks at the same rate as the main chain. We could let this go on for a while--maybe we would get lucky and accumulate blocks at a slightly (sublinear) rate faster than the main chain. Then, at some point, we could add in a bunch of new members and use these to add much more blocks than expected and try to catch up to the main chain. It's possible there's some mathematical reasoning that says this is impossible, but it's not immediate (and needs to be written up if this is becomes the spec).

Does anyone have an explanation for why this doesn't work? Am I misunderstanding something here? Thanks!

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@hartm The newer spec for poet 2.0 is at #20

@askmish askmish changed the title PoET 2.0 Consensus {Deprecated. Newer at PR#20] PoET 2.0 Consensus Sep 4, 2018
@askmish askmish changed the title {Deprecated. Newer at PR#20] PoET 2.0 Consensus {Deprecated. Newer at PR#20} PoET 2.0 Consensus Sep 4, 2018
@askmish askmish closed this Sep 4, 2018
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Development

Successfully merging this pull request may close these issues.

7 participants