
Thv/degree lowering hardening #325

Merged (37 commits) on Sep 25, 2024

Conversation

Sword-Smith
Collaborator

Add tests that the degree lowering and the code generator for constraint evaluation are deterministic and do not change unintentionally.

Comment on lines 769 to 771
chosen_node_id: usize,
main_constraints_count: usize,
aux_constraints_count: usize,
Member


Passing three things of the same type is bad design. It is very easy to mess up the order of the arguments, which will lead to runtime failures or – worse – erroneous behavior, where instead the compiler could highlight incorrect use of the function. Introduce a new type instead (or use an existing one should one fit).

Collaborator

@aszepieniec commented Sep 21, 2024


66b8225 introduces type aliases EvolvingMainConstraintsNumber and EvolvingAuxConstraintsNumber for the latter two counts.

Personally, I think this change degrades code quality. Having the commit as part of the PR is the fastest way forward, because it either earns your checkmark directly or else contributes to convincing you to reject it.

My position, in case you are interested:

  • Types used once are mis-used.
  • The function is called in one place only, not counting the test. It is conceptually part of that same function; the only reason why it is factored out is to test it in isolation.
  • The counts serve to derive indices of new columns and are thus inherently usizes. Declaring another type for these variables suggests a different context applies, but I don't think it does. The reader's attention is diverted towards finding this missing context.
  • Messing up the order of the arguments is still possible. (If due to too-weak introduced types, please show by example how it ought to be done.)
  • Why not declare a new type for every argument in every function interface? I do think that's the logical conclusion of your train of thought; if not I wonder where it stops.
  • All of the above being said, I do think that NodeId could be a useful type but a) that doesn't resolve the problem completely; and b) that refactor is out of the scope of this PR.
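To illustrate the fourth bullet, here is a minimal sketch (with hypothetical names, not code from this PR): even with dedicated wrapper types, the compiler cannot catch wrapping the wrong variable at the call site.

```rust
// Hypothetical wrapper types around usize, in the spirit of the discussion.
struct MainCount(usize);
struct AuxCount(usize);

// Hypothetical function standing in for the one under review.
fn lower_degree(main: MainCount, aux: AuxCount) -> (usize, usize) {
    (main.0, aux.0)
}

fn main() {
    let num_main = 5;
    let num_aux = 3;

    // The compiler rejects `lower_degree(AuxCount(..), MainCount(..))`,
    // but it cannot catch wrapping the wrong variable:
    let (m, a) = lower_degree(MainCount(num_aux), AuxCount(num_main)); // oops
    assert_eq!((m, a), (3, 5)); // silently swapped
    println!("ok");
}
```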

Member


Type aliases don't do what you suggest they do in 66b8225 – that's newtypes.
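The distinction, as a minimal sketch (illustrative names, not from the PR): a type alias is fully interchangeable with its underlying type, while a newtype is a distinct type the compiler enforces.

```rust
// A type alias is just another name for usize; it offers no protection.
type MainCountAlias = usize;

// A newtype is a distinct type; passing a bare usize where a
// MainCount is expected is a compile-time error.
struct MainCount(usize);

fn takes_alias(n: MainCountAlias) -> usize {
    n
}

fn takes_newtype(n: MainCount) -> usize {
    n.0
}

fn main() {
    let plain: usize = 7;
    assert_eq!(takes_alias(plain), 7); // bare usize accepted: alias is only a name
    // takes_newtype(plain);           // would not compile
    assert_eq!(takes_newtype(MainCount(plain)), 7);
    println!("ok");
}
```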

Taking position:

Types used once are mis-used.

I see help from the type system to write correct programs as beneficial in all cases.1

The function is called in one place only […].

I don't see how this justifies bad design.

[…] the […] reason why it is factored out is to test it in isolation.

Which is laudable and a decision I fully support.

The counts […] are […] inherently usizes.

Consider:

let _ = PrintConfig::new(true, true, false);
let _ = PrintConfig::new(Sided::Single, Colored::Colored, PrintSpeed::Slow);

Even though (for argument's sake) all these enums have only two variants, and are thus inherently bools, I strongly prefer the second version.2 In my opinion, the argument extends to numerical types.

Why not declare a new type for every argument in every function interface?

The concrete contract of a function as well as the number of arguments it takes make a huge difference. In my_vec.remove(i), there's no room for screwing up the argument order.3 The same holds for my_vec.swap(i, j), and so having two arguments of the same type does not influence the method's usability. I don't have a hard-and-fast rule at hand as to where it is beneficial to introduce new types.

I do think that NodeId could be a useful type

Sounds like a good idea. I agree with it being out of scope for this PR.


All that said, I think testability completely justifies factoring out the function. I also think that you have inherited design decisions (and, worse, accidental design) that would require a ton of work to improve.4 We can write off the argument sublist usize, usize, usize as part of an overall improvable design, the improvement of which is left as future work. One might argue that this constitutes building up technical debt,5 but I think an equally valid argument is that factoring out the method only highlights existing technical debt.

In my opinion, introducing a type like NumConstraints slightly mitigates the issues I have with the same-type arguments. Fully resolving all pain points I see would probably require a more fundamental redesign.

Footnotes

  1. Yes, I'm looking forward to your counter argument after having made such an absolute claim.

  2. Designs other than a function new() taking many arguments should be considered were PrintConfig actually to be added. The types' names would need a rework, too.

  3. Technically, it's possible: Vec::remove(i, &mut my_vec). However, (a) the type checker will stop you, and (b) why would you write that, apart from trying to prove a point, which, fair enough.

  4. One strong indicator of this is that the two arguments in question are essentially internal tracking information. The function should not have to rely on externally supplied information for this.

  5. I'm sure you won't be surprised to hear that I'm not a fan of building up technical debt, and even less of doing it consciously.

Collaborator


I modified 66b8225 (now 14a0572) to use newtypes instead of type aliases. Thanks for the reference. Let's proceed with this.

I see help from the type system to write correct programs as beneficial in all cases.1

You already read a few general counter-arguments in my bullet point list. In response to this particular phrasing: in the eyes of one person, "help from the type system to write correct programs" is "having to fight the type system to write anything at all in exchange for a marginal assurance of correctness" in the eyes of another. There are programming languages whose type systems are powerful enough to produce a formal computer-verifiable proof of programs' correctness. If Rust's type system had this feature, would you recommend using it everywhere and always? The alternative would be to find an optimum on the tradeoff curve interpolating between (a) getting stuff done and (b) correctness.

The function is called in one place only […].

I don't see how this justifies bad design.

To the extent that the partially elided sentence justifies deviating from your standards (I would obviously not use the B-word), the point is not so much that the function is used in a small number of places and is therefore unimpactful; instead, the point made by the elided part is that the function is conceptually and semantically an inseparable part of the function that calls it. This relation makes the maintenance of a succinct and misuse-resistant interface moot at best and counter-productive at worst, because if the algorithm changes then the "interface" will have to adapt.

Consider:

let _ = PrintConfig::new(true, true, false);
let _ = PrintConfig::new(Sided::Single, Colored::Colored, PrintSpeed::Slow);

Even though (for argument's sake) all these enums have only two variants, and are thus inherently bools, I strongly prefer the second version.2 In my opinion, the argument extends to numerical types.

There is no natural mapping from Single-Sided/Double-Sided or Colored/Monochrome or Slow/Fast to true/false. You can define a mapping and call it canonical but it would still be an arbitrary choice that preserves no structure. So I disagree with the position that these enums "are inherently bools" and would even argue that the second line is preferable to the first.

To bolster my position that the counts in question are inherently usizes, consider the necessary boilerplate following the definition of EvolvingMainConstraintsNumber:

struct EvolvingMainConstraintsNumber(usize);
impl From<EvolvingMainConstraintsNumber> for usize {
    fn from(value: EvolvingMainConstraintsNumber) -> Self {
        value.0
    }
}

In fact, the counts are not used except through this From conversion. The type is not used in any algebraic sense.

Maybe that's the point and the essence of our difference. I want types to semantically reflect the underlying maths; whereas you want types to constrain control flow and interfaces.

In my_vec.remove(i), there's no room for screwing up the argument order.3 The same holds for my_vec.swap(i, j), and so having two arguments of the same type does not influence the method's usability.

There is no room for screwing up the argument order (footnote notwithstanding) but there is room for making the wrong selection from the variables of type usize that live in scope. In this case, binding the interface through custom defined types gives the false impression of correctness, which is worse than no impression at all.

@aszepieniec aszepieniec marked this pull request as ready for review September 18, 2024 15:29
@aszepieniec
Collaborator

Something funky is happening with the test table::master_table::tests::air_constraints_evaluators_have_not_changed which I can't explain. CI is failing with

expected: (03756754445351585926·x² + 06769459618545093804·x + 09140558386905394900)
observed: (14714205515170095309·x² + 13661591558076455036·x + 14115119063754412580)

whereas on my desktop computer I get

expected: (03756754445351585926·x² + 06769459618545093804·x + 09140558386905394900)
observed: (03346962278739234229·x² + 12572006219868624302·x + 10286558549229477187)

and both are obviously different from my laptop which produced the expected value. Any ideas as to what might cause this?

@Sword-Smith
Collaborator Author

Sword-Smith commented Sep 18, 2024

On my desktop at home, I get the same as the CI machine observed:

expected: (03756754445351585926·x² + 06769459618545093804·x + 09140558386905394900)
observed: (14714205515170095309·x² + 13661591558076455036·x + 14115119063754412580)

@Sword-Smith
Collaborator Author

Would it make sense to also add a proptest like node_substitution_is_complete_and_sound for a DualRowIndicator type?

@aszepieniec
Collaborator

aszepieniec commented Sep 19, 2024

Correction: on my laptop I get this failure:

expected: (03756754445351585926·x² + 06769459618545093804·x + 09140558386905394900)
observed: (03346962278739234229·x² + 12572006219868624302·x + 10286558549229477187)

which agrees with my desktop.

After rebasing in @Sword-Smith's faster compilation changes, I get this failure

expected: (03756754445351585926·x² + 06769459618545093804·x + 09140558386905394900)
observed: (14714205515170095309·x² + 13661591558076455036·x + 14115119063754412580)

which agrees with CI and @Sword-Smith's machine.

@aszepieniec
Collaborator

For the record, applying just the first commit (9e5db866063a6527366be62616611d74db1ef416) of setting opt-level=3 generates this failure:

expected: (03756754445351585926·x² + 06769459618545093804·x + 09140558386905394900)
observed: (03346962278739234229·x² + 12572006219868624302·x + 10286558549229477187)

which agrees with the failure before setting the opt-level. So this compiler optimization does not seem to change anything.

Sword-Smith and others added 7 commits September 19, 2024 13:20
This commit puts the bulk of the loop body of the degree-lowering
loop into a separate function, anticipating tests that need it to
exist in isolation.

Co-authored-by: Alan Szepieniec <[email protected]>
Also, factor out default degree lowering info into function.

Co-authored-by: Alan Szepieniec <[email protected]>
This test catches the nondeterminism allowed when not sorting
nodes by their id.

This test would have caught the nondeterminism that was present prior to
f230ba7.

Co-authored-by: Alan Szepieniec <[email protected]>
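As an aside, a minimal sketch (hypothetical, not the PR's actual code) of the kind of nondeterminism that sorting nodes by id eliminates: iterating a HashMap directly yields an unspecified order, so anything derived from that iteration can differ between runs, whereas visiting sorted ids is deterministic.

```rust
use std::collections::HashMap;

fn main() {
    // Hypothetical node table: id -> payload.
    let nodes: HashMap<usize, &str> = HashMap::from([(2, "b"), (0, "a"), (1, "c")]);

    // Deterministic traversal: collect the ids and sort before visiting.
    let mut ids: Vec<usize> = nodes.keys().copied().collect();
    ids.sort_unstable();

    let visited: Vec<&str> = ids.iter().map(|id| nodes[id]).collect();
    assert_eq!(visited, ["a", "c", "b"]); // same order on every run
    println!("ok");
}
```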
This commit adds a test to verify that the automatically generated code
for evaluating the constraints has not changed between (a) writing or
updating the test; and (b) running the test. This test spans a large
scope and therefore can catch a large class of errors, ranging from
non-determinism to unintentional changes in the constraint circuit.

Co-authored-by: Alan Szepieniec <[email protected]>
Function `substitute` --> `redirect_pointers` does little more than
redirecting pointers. It is one step among many in applying a
substitution. With this naming, the disclaimer in the docstring,
about not being able to substitute root nodes, falls into place.
Function `substitute_node_and_produce_constraint` --> `apply_substitution`
is responsible for three steps, all of which fall jointly under the
concept "substitution".
aszepieniec and others added 9 commits September 19, 2024 15:44
Tests both completeness and soundness.

Note that if an `unreachable!` macro is reached, then all hell breaks
loose. In this case the proptest framework starts shrinking the input.
We observed that shrinking generates *more complex* multicircuits, not
simpler ones.

Co-authored-by: Thorkil Værge <[email protected]>
To achieve this, nextest is run separately for every crate. On crate
`triton-constraint-builder` the flag `--all-targets` is dropped.

Also, the code coverage collector drops the flag `--all-targets`. Now
benchmarks do not count towards code covering tests. And in
particular, the degree-lowering benchmark is not invoked in this
process.
Dependencies are declared for the entire repository in the repo-wide
`Cargo.toml` and then activated for each crate individually in the
crate-specific `Cargo.toml` files. This architecture ensures that
whenever two crates depend on the same dependency, they will agree on
that dependency's version.
Also: reword some comments.
The boilerplate was in fact declaring the wrong type, resulting in the
out-of-domain evaluator never being called.
The number of constants does not have to match the number of requested
constants in the arbitrary implementation because of constant folding.
So when an expression is formed from two constants, it results in a new
constant (as it should) and not a BinOp expression.
This is achieved by splitting the repository-wide `nextest` command
into one `nextest` command for each crate. For crate
`triton-constraint-builder` the `--all-targets` flag is dropped.

Also, the flag `--all-targets` is dropped from the coverage data
collector. As a consequence, benchmarks are not treated as tests
for the purpose of computing code coverage. And in particular, the
degree lowering benchmark is skipped also.
Specifically, in `air_constraints_evaluators_have_not_changed` the
hardcoded expected value was wrong.

Also: add a comment explaining why PR-CI might fail through no fault
of the requester.
@aszepieniec
Collaborator

@jan-ferdinand: All checks pass. Any further requested changes?

@aszepieniec
Collaborator

BTW: I can confirm that all tests pass on my desktop machine as well.

Member

@jan-ferdinand left a comment


The test's top-level comment is very helpful to understand why the test is doing what it is doing, and why it is meaningful to have the test. The test itself is a very good addition.

Function `apply_substitution` used to take three `usize`s (among other arguments)
for different purposes. This is considered bad design because it is easy to mess
up the order of the arguments leading to runtime failures or erroneous behavior.
Declaring and using the type aliases leverages the compiler's type system to
combat misuse.
aszepieniec and others added 6 commits September 23, 2024 13:39
Communicates expectation better.

Co-authored-by: Ferdinand Sauer <[email protected]>
 - Use unicode characters in ASCII diagram
 - Fix typo in comment
 - Use double (not triple) slashes for comment about test mechanics
 - Drop implied comments

Co-authored-by: Ferdinand Sauer <[email protected]>
Better indentation. More readability.

Co-authored-by: Ferdinand Sauer <[email protected]>
@aszepieniec
Collaborator

All comments are resolved now (I think).

@jan-ferdinand
Member

jan-ferdinand commented Sep 23, 2024

All comments are resolved now (I think).

I'm still wondering about the following points:

  • Can the code in the various num_ functions be de-duplicated? (review comment)
  • Is there a copy-paste error in the docstring for num_nodes? (review comment)
  • Is the method how the various num_ things are computed relevant enough to be mentioned in the docstring? (review comment)
  • Can the hidden relationship between node types in the multicircuit strategy be removed by changing which node types are supplied? (review comment)

@aszepieniec
Collaborator

I think I've answered all your questions. Any further remarks / requests?

- Favor early return over nesting
- Favor `borrow()` over `as_ref().borrow()`
- Put one-line statements on one line
@jan-ferdinand jan-ferdinand merged commit 667c910 into master Sep 25, 2024
5 checks passed
@jan-ferdinand jan-ferdinand deleted the thv/degree-lowering-hardening branch September 25, 2024 17:13
@jan-ferdinand
Member

No further requests. Happy to have this test in. 👍
