Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Apply val_cnt and pop_cnt to pop_label() #710

Merged
merged 7 commits into from
Dec 27, 2024

Conversation

zhouwfang
Copy link
Member

@zhouwfang zhouwfang commented Dec 25, 2024

  • Apply val_cnt and pop_cnt.
  • Remove Label by introducing Frame::labels_cnt.

There is not much performance improvement on Linux due to this PR, which replaces

let values_cnt: usize = frame.labels[i ..].iter().map(|label| label.values_cnt).sum();

in pop_label() by pop_cnt + val_cnt.

#46

@zhouwfang zhouwfang requested a review from ia0 as a code owner December 25, 2024 19:07
@zhouwfang zhouwfang changed the title Apply val_cnt and pop_cnt in side table Apply val_cnt and pop_cnt to pop_label() Dec 25, 2024
side_table: &'m [SideTableEntry],
/// Total length of the value stack in the thread prior to this frame.
prev_stack: usize,
labels_cnt: usize,
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ia0 Do you expect this to be removed eventually? Only from the performance perspective, I'm not sure how much gain there would be by removing it. WDYT?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I expect this to be eventually removed (as well as push_label()). This is not only cleaner, but also useful for when we'll write a compiler to our own bytecode (or an ISA) because it means we don't need to compile instructions like Block or Loop that don't do anything except indicate a scope.

But this can be done in a separate PR.

ia0
ia0 previously approved these changes Dec 27, 2024
Copy link
Member

@ia0 ia0 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! That's a really nice cleanup. I'll push a small review commit and merge.

crates/interpreter/src/exec.rs Outdated Show resolved Hide resolved
side_table: &'m [SideTableEntry],
/// Total length of the value stack in the thread prior to this frame.
prev_stack: usize,
labels_cnt: usize,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I expect this to be eventually removed (as well as push_label()). This is not only cleaner, but also useful for when we'll write a compiler to our own bytecode (or an ISA) because it means we don't need to compile instructions like Block or Loop that don't do anything except indicate a scope.

But this can be done in a separate PR.

@ia0 ia0 merged commit a877f75 into google:dev/fast-interp Dec 27, 2024
20 checks passed
@ia0
Copy link
Member

ia0 commented Dec 27, 2024

Before change:

Peak RAM usage: 6692
CoreMark result: 0.13914786 (in 144.034s)

After change:

Peak RAM usage: 6244
CoreMark result: 0.1541212 (in 130.039s)

@zhouwfang
Copy link
Member Author

zhouwfang commented Dec 27, 2024

@ia0 Could you also get the current code size? Thanks!

@ia0
Copy link
Member

ia0 commented Dec 28, 2024

@ia0 Could you also get the current code size? Thanks!

You just need to run rust-size ../../target/thumbv7em-none-eabi/release/wasm-bench (you might need cargo install cargo-binutils first) after ./run.sh nordic base. For this PR, it went from 144KiB to 141KiB.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants