Memory profiler #1201

Draft · wants to merge 10 commits into base: main

Conversation

@alv-around (Contributor) commented on Jan 10, 2025

Here is a minimal implementation of a memory profiler for the SDK (#1029) using the official prometheus-client (see also here).

Currently the PR does the following:

  • The profiler runs under the memory_profiler feature and provides a text breakdown of the memory consumption of each method once the SDK is dropped; a rough sketch of how this could be wired up is included after this list. Below is sample output from running cargo run --example sdk_app --features memory_profiler:
cargo run --example sdk_app --features memory_profiler
   Compiling sysinfo v0.33.1
   Compiling openvm-sdk v0.2.0-alpha (/home/alv/code/openvm/crates/sdk)
    Finished `dev` profile [optimized + debuginfo] target(s) in 1m 02s
     Running `/home/alv/code/openvm/target/debug/examples/sdk_app`
cargo:rerun-if-env-changed=OPENVM_SKIP_BUILD
Using rustc: /home/alv/.rustup/toolchains/nightly-2024-10-30-x86_64-unknown-linux-gnu/bin/rustc
Building guest package: cargo +nightly-2024-10-30 build --target riscv32im-risc0-zkvm-elf -Z build-std=alloc,core,proc_macro,panic_abort,std -Z build-std-features=compiler-builtins-mem
cargo command: cargo +nightly-2024-10-30 build --target riscv32im-risc0-zkvm-elf -Z build-std=alloc,core,proc_macro,panic_abort,std -Z build-std-features=compiler-builtins-mem --manifest-path /home/alv/code/openvm/crates/sdk/guest/Cargo.toml --target-dir /home/alv/code/openvm/crates/sdk/guest/target --profile release
openvm-sdk-example-test: Starting build for riscv32im-risc0-zkvm-elf
openvm-sdk-example-test:     Finished `release` profile [optimized] target(s) in 0.61s
public values output: [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
# HELP memory usage memory used in process.
# TYPE memory usage gauge
memory usage{method="Build"} 6440038400
memory usage{method="Transpile"} 6436044800
memory usage{method="Commit"} 6471786496
memory usage{method="KeyGen"} 6456745984
memory usage{method="AggVerify"} 7193219072
memory usage{method="Execute"} 6435389440
memory usage{method="Proof"} 6766968832
# EOF
  • For the moment, I put the profiler behind the memory_profiler feature so I don't need to touch existing code. I wanted to avoid making any changes to the SDK interface before getting more feedback.
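For reference, here is a minimal sketch (an assumption, not the PR's exact code) of how such a per-method gauge could be wired up with prometheus-client and sysinfo, the two dependencies visible in the build output above. The Method variants mirror the labels in the sample output; the metric name, struct layout, and refresh strategy are placeholders.

// A minimal sketch (assumption, not the PR's exact code) of a per-method
// memory gauge built on `prometheus-client` and `sysinfo`.
use prometheus_client::encoding::text::encode;
use prometheus_client::encoding::{EncodeLabelSet, EncodeLabelValue};
use prometheus_client::metrics::{family::Family, gauge::Gauge};
use prometheus_client::registry::Registry;
use sysinfo::System;

// SDK methods that get a separate memory reading.
#[derive(Clone, Copy, Debug, Hash, PartialEq, Eq, EncodeLabelValue)]
enum Method {
    Build,
    Transpile,
    Commit,
    KeyGen,
    Execute,
    Proof,
    AggVerify,
}

#[derive(Clone, Debug, Hash, PartialEq, Eq, EncodeLabelSet)]
struct MethodLabel {
    method: Method,
}

struct MemoryProfiler {
    registry: Registry,
    usage: Family<MethodLabel, Gauge>,
    system: System,
}

impl MemoryProfiler {
    fn new() -> Self {
        let mut registry = Registry::default();
        let usage = Family::<MethodLabel, Gauge>::default();
        registry.register("memory_usage", "memory used in process", usage.clone());
        Self { registry, usage, system: System::new_all() }
    }

    /// Record the current resident memory of this process under `method`.
    fn update_memory_usage(&mut self, method: Method) {
        self.system.refresh_all();
        let pid = sysinfo::get_current_pid().expect("current pid");
        if let Some(process) = self.system.process(pid) {
            let bytes = process.memory() as i64; // bytes on recent sysinfo versions
            self.usage.get_or_create(&MethodLabel { method }).set(bytes);
        }
    }
}

// Print the text exposition once the profiler (and with it the SDK) is dropped.
impl Drop for MemoryProfiler {
    fn drop(&mut self) {
        let mut buf = String::new();
        encode(&mut buf, &self.registry).expect("encode metrics");
        println!("{buf}");
    }
}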

Possible adjustments to the code:

  1. Create a macro to take the measurements instead of calling the profiler in the middle of each method (a rough sketch is included below)
  2. Extend the interface of the Sdk struct
  3. Other ways to display the information to the user
  4. Make the registry thread-safe (??)
    ...

These are some ideas that I had, but I wanted to share them here before jumping into the implementation. Any other ideas not covered here are more than welcome.
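A rough sketch of what idea (1) could look like, so the feature-gated call from the diff collapses into one line at each call site (the macro name is hypothetical):

// Hypothetical convenience macro: expands to nothing unless the
// `memory_profiler` feature is enabled.
macro_rules! profile_memory {
    ($profiler:expr, $method:expr) => {
        #[cfg(feature = "memory_profiler")]
        $profiler.update_memory_usage($method);
    };
}

// Usage inside an Sdk method:
// profile_memory!(self.profiler, Method::Proof);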

EDIT:
Ok, I just saw this tip; I may come back and swap prometheus-client for metrics-process 🫠

@jonathanpwang (Contributor) commented:

@alv-around very exciting! I'm not sure about prometheus-client vs metrics-process. I suggested the latter just because that's what I saw reth use.

I haven't looked into the PR yet, but I guess there are two different approaches:

  1. either you get profile info at specific points in time
  2. you have a separate async loop that periodically gets profile snapshots, and then you hook that up to a metrics recorder to either display or aggregate the info

It sounds like you currently went with (1)? It'd be nice if we could also support (2), or maybe feature flag between them somehow. I'll have more concrete suggestions after I look at the code.

also tagging @luffykai to take a look
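For concreteness, here is a rough sketch of what approach (2) could look like with the metrics-process crate mentioned above: a plain background thread samples process metrics on an interval and publishes them through whatever metrics recorder is installed. The interval and function name are assumptions.

use std::{thread, time::Duration};

// Spawn a background sampler that periodically emits process metrics
// (resident memory, CPU, etc.) to the globally installed `metrics` recorder.
fn spawn_memory_sampler(interval: Duration) -> thread::JoinHandle<()> {
    thread::spawn(move || {
        let collector = metrics_process::Collector::default();
        // Register metric descriptions once.
        collector.describe();
        loop {
            collector.collect();
            thread::sleep(interval);
        }
    })
}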

@@ -148,6 +178,8 @@ impl Sdk {
         VC::Periphery: Chip<SC>,
     {
         let app_prover = AppProver::new(app_pk.app_vm_pk.clone(), app_committed_exe);
+        #[cfg(feature = "memory_profiler")]
+        self.profiler.update_memory_usage(Method::Proof);
Inline review comment (Contributor):

The issue is this only gets the memory usage at this point in time, but I think it isn't able to capture peak memory usage during the overall function span?
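One hypothetical way to capture the peak rather than a single reading: run the method inside a helper that samples resident memory on a background thread and keeps the maximum. The sampling interval and the use of sysinfo are assumptions.

use std::sync::atomic::{AtomicBool, AtomicU64, Ordering};
use std::sync::Arc;
use std::{thread, time::Duration};
use sysinfo::System;

// Run `f` while a background thread tracks the maximum resident memory (bytes)
// observed for this process; returns the result of `f` and the observed peak.
fn with_peak_memory<T>(f: impl FnOnce() -> T) -> (T, u64) {
    let stop = Arc::new(AtomicBool::new(false));
    let peak = Arc::new(AtomicU64::new(0));
    let (stop2, peak2) = (stop.clone(), peak.clone());
    let sampler = thread::spawn(move || {
        let mut sys = System::new_all();
        let pid = sysinfo::get_current_pid().expect("current pid");
        while !stop2.load(Ordering::Relaxed) {
            sys.refresh_all();
            if let Some(process) = sys.process(pid) {
                peak2.fetch_max(process.memory(), Ordering::Relaxed);
            }
            thread::sleep(Duration::from_millis(50));
        }
    });
    let out = f();
    stop.store(true, Ordering::Relaxed);
    sampler.join().expect("sampler thread panicked");
    (out, peak.load(Ordering::Relaxed))
}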

@jonathanpwang (Contributor) left a comment:

Amazing progress so far!

Adding to my other comments after taking a skim: if possible, I'd like something where the memory profiling is a separate loop/thread, we collect metrics via a recorder, and the different sdk functions are scoped by labels (see https://github.com/openvm-org/openvm/blob/main/docs/crates/metrics.md); then in some separate process/post-process in openvm-prof (crates/prof) we can set up an analyzer or Grafana dashboard that lets us put more concrete numbers to different workloads.
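As a sketch of what the label-scoped recording could look like with the `metrics` facade already used in the repo (the metric name and label scheme below are placeholders; the real ones would follow docs/crates/metrics.md):

// Record the current resident memory under a label identifying the sdk phase,
// so a separate sampler loop or openvm-prof post-processing can group by it.
fn record_memory(phase: &'static str, resident_bytes: u64) {
    metrics::gauge!("memory_usage_bytes", "phase" => phase).set(resident_bytes as f64);
}

// e.g. record_memory("app_proof", bytes) inside the proving span.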

@alv-around (Contributor, Author) commented:

> @alv-around very exciting! I'm not sure about prometheus-client vs metrics-process. I suggested the latter just because that's what I saw reth use.
>
> I haven't looked into the PR yet, but I guess there are two different approaches:
>
> 1. either you get profile info at specific points in time
> 2. you have a separate async loop that periodically gets profile snapshots, and then you hook that up to a metrics recorder to either display or aggregate the info
>
> It sounds like you currently went with (1)? It'd be nice if we could also support (2), or maybe feature flag between them somehow. I'll have more concrete suggestions after I look at the code.
>
> also tagging @luffykai to take a look

@jonathanpwang yes! I completely agree with your feedback. The last time I checked the issue was last year, and I was completely unaware of metrics-process. The metrics-process approach and your proposal make much more sense :)

Just a final note: I would go for threads over async loops, as proving will most likely block the metrics measurement.
