Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Block Synchronization Kill Switch, integration in gas meter #15533

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

gelash
Copy link
Contributor

@gelash gelash commented Dec 6, 2024

Request interrupt of speculatively executing transactions ASAP if block execution is halted.
interrupt_requested currently called from gas meter - should guarantee decent latency / unobservable overhead.
Added a new block executor test

@bchocho confirmed the performance impact on real blocks.

Copy link

trunk-io bot commented Dec 6, 2024

⏱️ 26m total CI duration on this PR
Job Cumulative Duration Recent Runs
rust-move-tests 12m 🟩
check-dynamic-deps 5m 🟩🟩
rust-cargo-deny 4m 🟩🟩
rust-move-tests 3m 🟥
general-lints 53s 🟩🟩
semgrep/ci 43s 🟩🟩
file_change_determinator 23s 🟩🟩
permission-check 6s 🟩🟩
permission-check 5s 🟩🟩

🚨 1 job on the last run was significantly faster/slower than expected

Job Duration vs 7d avg Delta
check-dynamic-deps 4m 1m +227%

settingsfeedbackdocs ⋅ learn more about trunk.io

@gelash gelash force-pushed the gelash/blocksynctrait branch from b421b79 to e73c084 Compare December 7, 2024 03:31
Copy link
Contributor

@ziaptos ziaptos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like how it's implemented in VM agnostic way, like no changes to the actual VM.

However some comments about naming and design in the code

/// For example, if the block execution has halted due to gas limit (committing only a prefix),
/// ongoing speculative executions can return early. Similarly, if a read-write dependency is
/// preficted between txns, it can allow the dependent txn to wait and avoid costly re-executions.
pub trait BlockSynchronizationView {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the PR over-engineers something that's really simple functionality. We just need to provide an abort (killswitch), so If going with a trait, call the trait KillSwichInterface (it's not really a view).

This BlockSynchornization name is very application specific, and doesn't really describe the desired functionality, but rather how it will be used.

Also early_abort_execution -> was_signaled_for_abort or something like that.

The comment above is more more confusing - like "Gives a global (from an executing txn perspective) view into the synchronization aspects of the encompassing block execution" seems like a lot of unnecessary info to just say provides an access to a boolean swich (or signal).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where is the over-engineering though, other than the name I don't see a shorter edit distance to getting functionality e2e?
If we do it this way however, I feel BlockSynchronizationView can have its merits: I highly doubt this will be the only similar functionality and we can have trait per desired method, or we could have a logical umbrella that provides the transaction a view into per-block context.
As for the name, I currently like @igor-aptos 's heartbeat most.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Heartbeat sounds good, but I've seen interruption/termination point and killswich naming multiple times - and I think it better describes the intention (we are terminating something, and termination checks for that), whereas heartbeat is more like notifying others that we are still running. But I'd be OK with either.

My comment about overengineering was exactly about BlockSycnrhonizationView -> it only does one thing, so no reason to have a specific name, nor trying to encapsulate future functionality. If we end up with some set of traits that can be encapsulated as BlockSyncView, then we'll just define it as union of such traits in the future.

Having something that just a plain killswhich/heartbeat and implemented/documented as such is just much more straight forward.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But yep, simple name/comment change would address my concerns (overengineering is in the comments/naming - attempting to be future proof)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Killswitch defines the implementation we currently want. But we might have a different functionality there later.

Heartbeat defines VM responsibility in calling this function without opinion on its implementation.

So I suggest calling the function heartbeat, and having the implementation of the trait potentially be BlockSynchronizationKillSwitch or something specific like that

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we want this natively supported in move VM, which is probably not a bad idea, I would make sure that the MoveVM implementation is agnostic to block STM, or other stuff. I would have a generic argument (which can be part of some other ones) that carries information on whether an interrupt was requested.

We can then define the granularity inside moveVM on which VM would check for a requested_interrupt and would act accordingly. I'd want that to be done generically so that in the case we don't want interrupts the compiler can compile out all the function calls.

I'd also have the API to be BlockSTM -> MoveVM calls request_interrupt or provides an implementation of it. MoveVM calls interrupt_requesteed at certain points (we can have multiple classes of points as well) and aborts with INTERRUPTED status.

Copy link
Contributor

@ziaptos ziaptos Dec 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also I like BlockSynchronizationKillSwich. but I don't think heartbeat defines what VM should be doing, it's more what BlockSTM is doing. VM reacts to a heartbeat, so has_heartbeat might be a better name. But again, a more generic interrupt_requested is much more descripting.

@@ -165,6 +170,16 @@ impl GasAlgebra for StandardGasAlgebra {
&mut self,
abstract_amount: impl GasExpression<VMGasParameters, Unit = InternalGasUnit> + Debug,
) -> PartialVMResult<()> {
if self
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This adds a conditional, a vtable lookup and a call - just mentioning :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should be able to eliminate the conditional (to compile-time) - didn't do it for a draft.
function calls will happen, but calling via the dyn trait I don't know how to easily avoid.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep - that's why I said "i was just commenting" the full GasMeter design should be dynamically dispached much earlier, with different compile time implementations (like create a meter for specific settings and version, and avoid other dynamic check for a version that happen literally every time we charge gas :) )

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, you can provide a type parameter which implements BlockSynchronizationKillSwitch? This way we can have a NoopBlockSynchronizationKillSwitch and actual implementation. Also more clear than passing None?

@gelash gelash force-pushed the gelash/blocksynctrait branch from e73c084 to 10c8281 Compare January 17, 2025 03:21
@gelash gelash changed the title prototype integration in gas meter Block Synchronization Kill Switch, integration in gas meter Jan 17, 2025
@gelash gelash force-pushed the gelash/blocksynctrait branch from 10c8281 to b34f314 Compare January 17, 2025 03:36
@gelash gelash marked this pull request as ready for review January 17, 2025 03:36
@gelash gelash requested a review from ziaptos January 17, 2025 03:38
@@ -40,15 +41,18 @@ pub struct StandardGasAlgebra {

num_dependencies: NumModules,
total_dependency_size: NumBytes,

maybe_block_synchronization_view: Option<&'a dyn BlockSynchronizationKillSwitch>,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: maybe call the field maybe_block_synchronization_kill_switch like the type name? Also a comment why its here would be great for the future (otherwise not obvious why this is in gas algebra😄)?

@@ -165,6 +170,16 @@ impl GasAlgebra for StandardGasAlgebra {
&mut self,
abstract_amount: impl GasExpression<VMGasParameters, Unit = InternalGasUnit> + Debug,
) -> PartialVMResult<()> {
if self
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, you can provide a type parameter which implements BlockSynchronizationKillSwitch? This way we can have a NoopBlockSynchronizationKillSwitch and actual implementation. Also more clear than passing None?

@@ -187,6 +202,16 @@ impl GasAlgebra for StandardGasAlgebra {
&mut self,
abstract_amount: impl GasExpression<VMGasParameters, Unit = InternalGasUnit>,
) -> PartialVMResult<()> {
if self
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: duplicate with fragment above. With trait solution you could just do

block_synchronization_view.interrupt_requested()?;

or similar

@@ -51,6 +52,9 @@ pub trait ResourceGroupResolver {

pub trait AsExecutorView {
fn as_executor_view(&self) -> &dyn ExecutorView;

// TODO: remove once trait upcasting coercion is stabilized (rust issue #65991).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

question: maybe a github link to rust issues is better?

@@ -652,6 +652,10 @@ impl Scheduler {

!self.has_halted.swap(true, Ordering::SeqCst)
}

pub(crate) fn has_halted(&self) -> bool {
self.has_halted.load(Ordering::Relaxed)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't relaxed too weak here? Although synchronizing on every gas meter call is not great

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants