Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

For performance experiments, add flags to disable context pruning #943

Draft
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

Chris-Hawblitzel
Copy link
Collaborator

Each of the three flags -V disable-prune -V disable-prune-primitives -V disable-prune-tuples disable some aspect of context pruning (or, from another perspective, generate context declarations eagerly rather than on demand). The three flags can be used individually or in combination.

Copy link
Collaborator

@parno parno left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the quick parameter additions! It will be very interesting to measure how much pruning affects stability.

@yizhou7
Copy link
Contributor

yizhou7 commented Dec 15, 2023

Thanks for prompt update! I will try out some experiments this week.
However, I am wondering if it would it be fair to be un-pruning even more aggressively?
For example, it seems like the bodies of many functions are still invisible:

if is_vis && is_revealed && is_spec {

Please correct me if I am wrong:
disable-prune expands the reachable set of modules within the crate.
But then functions that are not within the current module remain bodyless.
So disable-prune might additionally introduce:

  • type checking axioms of these functions
  • pre/post conditions axioms for some (?)
  • body definition axioms for none (?)

I guess alternatively the question is, would it be fair to also include more of the latter two?
Again, thank you for such a fast update.

@Chris-Hawblitzel
Copy link
Collaborator Author

I updated the is_vis && is_revealed && is_spec logic so that disable_prune will also reach through opaque, non-revealed functions. It won't reach through functions that aren't visible to the current module (e.g. because their bodies are private), since these also wouldn't be visible in comparable module-based systems like F*. Even when the body is hidden, there should still be type checking axioms and/or pre/post-conditions.

@yizhou7
Copy link
Contributor

yizhou7 commented Dec 17, 2023

I updated the is_vis && is_revealed && is_spec logic so that disable_prune will also reach through opaque, non-revealed functions. It won't reach through functions that aren't visible to the current module (e.g. because their bodies are private), since these also wouldn't be visible in comparable module-based systems like F*. Even when the body is hidden, there should still be type checking axioms and/or pre/post-conditions.

I think after the change to is_vis && is_revealed && is_spec (the previous un-pruning seems to work fine), I ran into some issues. More specifically, I am trying this out on verus-nr (commit hash eab31882bdb9185565fd8f04cb6ab741d5ca7e26). Here is my limited understanding (quite possibly wrong):

  • the following exception arises when verifying exec::replica, which uses exec::rwlock.
  • it is attempting to call poly_function on FunX { path: PathX { krate: Some("lib"), segments: ["exec", "rwlock", "impl&%0", "view"] } }, which I am guessing is the view function of RwLockReadGuard
  • RwLockReadGuard is defined in exec::rwlock, but not exposed to exec::replica. So its variant is not found, while its view function is now made available.
verus-systems-code/nr # RUST_BACKTRACE=1 ~/verus/source/target-verus/release/verus --crate-type lib --log smt verus-nr/src/lib.rs -V disable-prune --verify-module exec::replica
note: verifying module exec::replica

thread '<unnamed>' panicked at vir/src/ast_util.rs:474:17:
internal error: missing variant RwLockReadGuard
stack backtrace:
   0: rust_begin_unwind
             at /rustc/cc66ad468955717ab92600c770da8c1601a4ff33/library/std/src/panicking.rs:595:5
   1: core::panicking::panic_fmt
             at /rustc/cc66ad468955717ab92600c770da8c1601a4ff33/library/core/src/panicking.rs:67:14
   2: vir::ast_util::<impl vir::ast::DatatypeX>::get_variant
   3: vir::poly::poly_expr
   4: vir::poly::poly_expr
   5: vir::poly::poly_expr
   6: vir::poly::poly_expr
   7: vir::poly::poly_expr
   8: vir::poly::poly_expr
   9: vir::poly::poly_expr
  10: vir::poly::poly_expr
  11: vir::poly::poly_function
  12: <alloc::vec::Vec<T> as alloc::vec::spec_from_iter::SpecFromIter<T,I>>::from_iter
  13: vir::poly::poly_krate_for_module
  14: rust_verify::verifier::Verifier::verify_bucket_outer
  15: std::panicking::try
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
thread '<unnamed>' panicked at rust_verify/src/verifier.rs:1844:29:
worker thread panicked
stack backtrace:
   0: std::panicking::begin_panic
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.

Following that train of thought, I made all visible datatypes include their bodies (verus, prune.rs) when disable_prune:

        if is_vis {
            if is_transparent || disable_prune{
                datatypes.push(d.clone());
            } else {
                let mut datatype = d.x.clone();
                datatype.variants = Arc::new(vec![]);
                datatypes.push(Spanned::new(d.span.clone(), datatype));
            }
        }

It seems like that allows me to get past poly_krate_for_module, but then I run into the following:

thread '<unnamed>' panicked at vir/src/sst_to_air.rs:129:29:
abstract datatype should be boxed
stack backtrace:
   0: std::panicking::begin_panic
   1: vir::sst_to_air::typ_to_air
   2: vir::func_to_air::func_bind_trig
   3: vir::datatype_to_air::datatype_or_fun_to_air_commands
   4: vir::datatype_to_air::datatypes_and_primitives_to_air
   5: rust_verify::verifier::Verifier::verify_bucket
   6: rust_verify::verifier::Verifier::verify_bucket_outer
   7: std::panicking::try
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
thread '<unnamed>' panicked at rust_verify/src/verifier.rs:1844:29:

The error is now in the typ_to_air conversion. I debug printed the failing typ, and it seems to be Datatype(PathX { krate: Some("vstd"), segments: ["atomic_ghost", "AtomicPredU64"] }, [TypParam("Pred")], []). I am not familiar with the state machine marco, should these atomic_type reach here at all?

This is what I have been able to figure out so far. For now I will run some experiments on top of the less aggressive changes and report back, thank you!

@yizhou7
Copy link
Contributor

yizhou7 commented Dec 22, 2023

Hello @Chris-Hawblitzel . Sorry for the delay. I was performing a major refactoring that blocked some experiments.
I collected the verus queries from 5 projects in the paper repo, and ran mariposa on verus generated queries.
the original version refers to queries generated without any disable-prune options on.
The ''bloat' version refers to the queries generated with the three disable-prune options,
but before the more aggressive change: #943 (comment)

High level summary of the experiments result...
with prune disabled:

  • there are a few more queries for each project (due to additional well-from checks I think)
  • queries get more (SMT level) assertions (53% ~ 178%)
  • queries take a little longer to run (29% ~ 60%), but still very fast
  • queries get a little more unstable

More detailed report attached. time all in seconds.

project         original assert        bloat assert    assert count    original verification    bloat verification    verification time
               counts (geomean)    counts (geomean)    increase (%)           time (geomean)        time (geomean)         increase (%)
-----------  ------------------  ------------------  --------------  -----------------------  --------------------  -------------------
v_ironfleet                 595                 936           57.16                     0.05                  0.08                45.43
v_mimalloc                  931                1455           56.29                     0.10                  0.12                29.17
v_noderep                   678                1040           53.29                     0.06                  0.08                36.12
v_pagetable                 514                 874           69.90                     0.06                  0.08                37.55
v_pmemlog                   272                 757          178.27                     0.04                  0.07                60.00

Mariposa results, using a 60 seconds timeout:
stability status original vs. bloat v_ironfleet

category original bloat
unstable 2 (0.55%) 1 (0.27%)
stable 361 (99.45%) 363 (99.73%)
unsolvable 0 (0.0%) 0 (0.0%)
total 363 364

stability status original vs. bloat v_mimalloc

category original bloat
unstable 12 (1.61%) 15 (2.02%)
stable 732 (98.39%) 729 (97.98%)
unsolvable 0 (0.0%) 0 (0.0%)
total 744 744

stability status original vs. bloat v_noderep

category original bloat
unstable 2 (0.78%) 4 (1.57%)
stable 253 (99.22%) 251 (98.43%)
unsolvable 0 (0.0%) 0 (0.0%)
total 255 255

stability status original vs. bloat v_pagetable

category original bloat
unstable 4 (1.21%) 6 (1.81%)
stable 325 (98.48%) 325 (97.89%)
unsolvable 1 (0.3%) 1 (0.3%)
total 330 332

stability status original vs. bloat v_pmemlog

category original bloat
unstable 0 (0.0%) 1 (0.83%)
stable 121 (100.0%) 120 (99.17%)
unsolvable 0 (0.0%) 0 (0.0%)
total 121 121

SMT assert count cdfs

verification time cdfs

ccing @parno

@yizhou7
Copy link
Contributor

yizhou7 commented Dec 22, 2023

Mariposa results, using a 3 seconds timeout:

stability status original vs. bloat v_ironfleet

category original bloat
unstable 9 (2.48%) 6 (1.65%)
stable 354 (97.52%) 358 (98.35%)
unsolvable 0 (0.0%) 0 (0.0%)
total 363 364

stability status original vs. bloat v_mimalloc

category original bloat
unstable 35 (4.7%) 39 (5.24%)
stable 709 (95.3%) 705 (94.76%)
unsolvable 0 (0.0%) 0 (0.0%)
total 744 744

stability status original vs. bloat v_noderep

category original bloat
unstable 3 (1.18%) 4 (1.57%)
stable 252 (98.82%) 251 (98.43%)
unsolvable 0 (0.0%) 0 (0.0%)
total 255 255

stability status original vs. bloat v_pagetable

category original bloat
unstable 9 (2.73%) 13 (3.92%)
stable 320 (96.97%) 318 (95.78%)
unsolvable 1 (0.3%) 1 (0.3%)
total 330 332

stability status original vs. bloat v_pmemlog

category original bloat
unstable 1 (0.83%) 1 (0.83%)
stable 120 (99.17%) 120 (99.17%)
unsolvable 0 (0.0%) 0 (0.0%)
total 121 121

@parno
Copy link
Collaborator

parno commented Dec 22, 2023

I would expect the 3 second timeout results to more accurately capture the "typical" Verus experience. In those results, I'm surprised to see that IronFleet becomes slightly more stable in bloated mode.

@utaal-b
Copy link
Contributor

utaal-b commented Feb 17, 2024

[triaging] What's the intent with the PR specifically? Is it waiting on some review? Or is it not intended to be merged?

@utaal-b utaal-b marked this pull request as draft March 10, 2024 08:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants