For performance experiments, add flags to disable context pruning #943

Chris-Hawblitzel · 2023-12-14T23:27:22Z

Each of the three flags -V disable-prune -V disable-prune-primitives -V disable-prune-tuples disable some aspect of context pruning (or, from another perspective, generate context declarations eagerly rather than on demand). The three flags can be used individually or in combination.

parno

Thanks for the quick parameter additions! It will be very interesting to measure how much pruning affects stability.

yizhou7 · 2023-12-15T02:56:51Z

Thanks for prompt update! I will try out some experiments this week.
However, I am wondering if it would it be fair to be un-pruning even more aggressively?
For example, it seems like the bodies of many functions are still invisible:

verus/source/vir/src/prune.rs

Line 460 in 2b607f7

if is_vis && is_revealed && is_spec {

Please correct me if I am wrong:
disable-prune expands the reachable set of modules within the crate.
But then functions that are not within the current module remain bodyless.
So disable-prune might additionally introduce:

type checking axioms of these functions
pre/post conditions axioms for some (?)
body definition axioms for none (?)

I guess alternatively the question is, would it be fair to also include more of the latter two?
Again, thank you for such a fast update.

Chris-Hawblitzel · 2023-12-15T21:17:04Z

I updated the is_vis && is_revealed && is_spec logic so that disable_prune will also reach through opaque, non-revealed functions. It won't reach through functions that aren't visible to the current module (e.g. because their bodies are private), since these also wouldn't be visible in comparable module-based systems like F*. Even when the body is hidden, there should still be type checking axioms and/or pre/post-conditions.

yizhou7 · 2023-12-17T17:06:17Z

I updated the is_vis && is_revealed && is_spec logic so that disable_prune will also reach through opaque, non-revealed functions. It won't reach through functions that aren't visible to the current module (e.g. because their bodies are private), since these also wouldn't be visible in comparable module-based systems like F*. Even when the body is hidden, there should still be type checking axioms and/or pre/post-conditions.

I think after the change to is_vis && is_revealed && is_spec (the previous un-pruning seems to work fine), I ran into some issues. More specifically, I am trying this out on verus-nr (commit hash eab31882bdb9185565fd8f04cb6ab741d5ca7e26). Here is my limited understanding (quite possibly wrong):

the following exception arises when verifying exec::replica, which uses exec::rwlock.
it is attempting to call poly_function on FunX { path: PathX { krate: Some("lib"), segments: ["exec", "rwlock", "impl&%0", "view"] } }, which I am guessing is the view function of RwLockReadGuard
RwLockReadGuard is defined in exec::rwlock, but not exposed to exec::replica. So its variant is not found, while its view function is now made available.

verus-systems-code/nr # RUST_BACKTRACE=1 ~/verus/source/target-verus/release/verus --crate-type lib --log smt verus-nr/src/lib.rs -V disable-prune --verify-module exec::replica
note: verifying module exec::replica

thread '<unnamed>' panicked at vir/src/ast_util.rs:474:17:
internal error: missing variant RwLockReadGuard
stack backtrace:
   0: rust_begin_unwind
             at /rustc/cc66ad468955717ab92600c770da8c1601a4ff33/library/std/src/panicking.rs:595:5
   1: core::panicking::panic_fmt
             at /rustc/cc66ad468955717ab92600c770da8c1601a4ff33/library/core/src/panicking.rs:67:14
   2: vir::ast_util::<impl vir::ast::DatatypeX>::get_variant
   3: vir::poly::poly_expr
   4: vir::poly::poly_expr
   5: vir::poly::poly_expr
   6: vir::poly::poly_expr
   7: vir::poly::poly_expr
   8: vir::poly::poly_expr
   9: vir::poly::poly_expr
  10: vir::poly::poly_expr
  11: vir::poly::poly_function
  12: <alloc::vec::Vec<T> as alloc::vec::spec_from_iter::SpecFromIter<T,I>>::from_iter
  13: vir::poly::poly_krate_for_module
  14: rust_verify::verifier::Verifier::verify_bucket_outer
  15: std::panicking::try
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
thread '<unnamed>' panicked at rust_verify/src/verifier.rs:1844:29:
worker thread panicked
stack backtrace:
   0: std::panicking::begin_panic
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.

Following that train of thought, I made all visible datatypes include their bodies (verus, prune.rs) when disable_prune:

        if is_vis {
            if is_transparent || disable_prune{
                datatypes.push(d.clone());
            } else {
                let mut datatype = d.x.clone();
                datatype.variants = Arc::new(vec![]);
                datatypes.push(Spanned::new(d.span.clone(), datatype));
            }
        }

It seems like that allows me to get past poly_krate_for_module, but then I run into the following:

thread '<unnamed>' panicked at vir/src/sst_to_air.rs:129:29:
abstract datatype should be boxed
stack backtrace:
   0: std::panicking::begin_panic
   1: vir::sst_to_air::typ_to_air
   2: vir::func_to_air::func_bind_trig
   3: vir::datatype_to_air::datatype_or_fun_to_air_commands
   4: vir::datatype_to_air::datatypes_and_primitives_to_air
   5: rust_verify::verifier::Verifier::verify_bucket
   6: rust_verify::verifier::Verifier::verify_bucket_outer
   7: std::panicking::try
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
thread '<unnamed>' panicked at rust_verify/src/verifier.rs:1844:29:

The error is now in the typ_to_air conversion. I debug printed the failing typ, and it seems to be Datatype(PathX { krate: Some("vstd"), segments: ["atomic_ghost", "AtomicPredU64"] }, [TypParam("Pred")], []). I am not familiar with the state machine marco, should these atomic_type reach here at all?

This is what I have been able to figure out so far. For now I will run some experiments on top of the less aggressive changes and report back, thank you!

yizhou7 · 2023-12-22T00:01:12Z

Hello @Chris-Hawblitzel . Sorry for the delay. I was performing a major refactoring that blocked some experiments.
I collected the verus queries from 5 projects in the paper repo, and ran mariposa on verus generated queries.
the original version refers to queries generated without any disable-prune options on.
The ''bloat' version refers to the queries generated with the three disable-prune options,
but before the more aggressive change: #943 (comment)

High level summary of the experiments result...
with prune disabled:

there are a few more queries for each project (due to additional well-from checks I think)
queries get more (SMT level) assertions (53% ~ 178%)
queries take a little longer to run (29% ~ 60%), but still very fast
queries get a little more unstable

More detailed report attached. time all in seconds.

project         original assert        bloat assert    assert count    original verification    bloat verification    verification time
               counts (geomean)    counts (geomean)    increase (%)           time (geomean)        time (geomean)         increase (%)
-----------  ------------------  ------------------  --------------  -----------------------  --------------------  -------------------
v_ironfleet                 595                 936           57.16                     0.05                  0.08                45.43
v_mimalloc                  931                1455           56.29                     0.10                  0.12                29.17
v_noderep                   678                1040           53.29                     0.06                  0.08                36.12
v_pagetable                 514                 874           69.90                     0.06                  0.08                37.55
v_pmemlog                   272                 757          178.27                     0.04                  0.07                60.00

Mariposa results, using a 60 seconds timeout:
stability status original vs. bloat v_ironfleet

category	original	bloat
unstable	2 (0.55%)	1 (0.27%)
stable	361 (99.45%)	363 (99.73%)
unsolvable	0 (0.0%)	0 (0.0%)
total	363	364

stability status original vs. bloat v_mimalloc

category	original	bloat
unstable	12 (1.61%)	15 (2.02%)
stable	732 (98.39%)	729 (97.98%)
unsolvable	0 (0.0%)	0 (0.0%)
total	744	744

stability status original vs. bloat v_noderep

category	original	bloat
unstable	2 (0.78%)	4 (1.57%)
stable	253 (99.22%)	251 (98.43%)
unsolvable	0 (0.0%)	0 (0.0%)
total	255	255

stability status original vs. bloat v_pagetable

category	original	bloat
unstable	4 (1.21%)	6 (1.81%)
stable	325 (98.48%)	325 (97.89%)
unsolvable	1 (0.3%)	1 (0.3%)
total	330	332

stability status original vs. bloat v_pmemlog

category	original	bloat
unstable	0 (0.0%)	1 (0.83%)
stable	121 (100.0%)	120 (99.17%)
unsolvable	0 (0.0%)	0 (0.0%)
total	121	121

SMT assert count cdfs

verification time cdfs

ccing @parno

yizhou7 · 2023-12-22T00:22:33Z

Mariposa results, using a 3 seconds timeout:

stability status original vs. bloat v_ironfleet

category	original	bloat
unstable	9 (2.48%)	6 (1.65%)
stable	354 (97.52%)	358 (98.35%)
unsolvable	0 (0.0%)	0 (0.0%)
total	363	364

stability status original vs. bloat v_mimalloc

category	original	bloat
unstable	35 (4.7%)	39 (5.24%)
stable	709 (95.3%)	705 (94.76%)
unsolvable	0 (0.0%)	0 (0.0%)
total	744	744

stability status original vs. bloat v_noderep

category	original	bloat
unstable	3 (1.18%)	4 (1.57%)
stable	252 (98.82%)	251 (98.43%)
unsolvable	0 (0.0%)	0 (0.0%)
total	255	255

stability status original vs. bloat v_pagetable

category	original	bloat
unstable	9 (2.73%)	13 (3.92%)
stable	320 (96.97%)	318 (95.78%)
unsolvable	1 (0.3%)	1 (0.3%)
total	330	332

stability status original vs. bloat v_pmemlog

category	original	bloat
unstable	1 (0.83%)	1 (0.83%)
stable	120 (99.17%)	120 (99.17%)
unsolvable	0 (0.0%)	0 (0.0%)
total	121	121

parno · 2023-12-22T02:15:33Z

I would expect the 3 second timeout results to more accurately capture the "typical" Verus experience. In those results, I'm surprised to see that IronFleet becomes slightly more stable in bloated mode.

utaal-b · 2024-02-17T21:17:18Z

[triaging] What's the intent with the PR specifically? Is it waiting on some review? Or is it not intended to be merged?

For performance experiments, add flags to disable context pruning

3538f08

Chris-Hawblitzel requested review from parno and yizhou7 December 14, 2023 23:27

parno approved these changes Dec 15, 2023

View reviewed changes

Add disable_prune to is_revealed

fb2c096

utaal-b marked this pull request as draft March 10, 2024 08:20

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

For performance experiments, add flags to disable context pruning #943

For performance experiments, add flags to disable context pruning #943

Chris-Hawblitzel commented Dec 14, 2023

parno left a comment

yizhou7 commented Dec 15, 2023

Chris-Hawblitzel commented Dec 15, 2023

yizhou7 commented Dec 17, 2023

yizhou7 commented Dec 22, 2023 •

edited by utaal-b

Loading

yizhou7 commented Dec 22, 2023 •

edited

Loading

parno commented Dec 22, 2023

utaal-b commented Feb 17, 2024

For performance experiments, add flags to disable context pruning #943

Are you sure you want to change the base?

For performance experiments, add flags to disable context pruning #943

Conversation

Chris-Hawblitzel commented Dec 14, 2023

parno left a comment

Choose a reason for hiding this comment

yizhou7 commented Dec 15, 2023

Chris-Hawblitzel commented Dec 15, 2023

yizhou7 commented Dec 17, 2023

yizhou7 commented Dec 22, 2023 • edited by utaal-b Loading

yizhou7 commented Dec 22, 2023 • edited Loading

parno commented Dec 22, 2023

utaal-b commented Feb 17, 2024

yizhou7 commented Dec 22, 2023 •

edited by utaal-b

Loading

yizhou7 commented Dec 22, 2023 •

edited

Loading