-
Notifications
You must be signed in to change notification settings - Fork 31
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Side-effectful Nontermination #772
base: development
Are you sure you want to change the base?
Conversation
Updated expected values in LivenessTest for existing tests that now show liveness violations (from UNKNOWN to FAIL)
The existing liveness tests seem to be correct, and the change from UNKNOWN to FAIL was not caused by side-effectful nontermination but by point (2) I mentioned above. The relaxation of some barriers caused an assertion failure, which then caused some thread to abort and not release its lock, which was then treated as liveness issue. EDIT: Fixed the issue and reverted expected outcome of some tests. |
Fixed Nontermination verdict if nontermination was caused due to assertion failure. Updated expected outcomes of LivenessTest caused by the above change.
Updated expected values in LivenessTest for existing tests that now show liveness violations (from UNKNOWN to FAIL)
Fixed Nontermination verdict if nontermination was caused due to assertion failure. Updated expected outcomes of LivenessTest caused by the above change.
2150e68
to
5200c69
Compare
Can't we add a configurable (similar to several of our providers) that allows us to define a more fine-grained skip predicate and define the predicate accordingly based on the progress model? |
We could have skip lists per progress model, but can't we instead just comment out the tests in the expected files? Those files are already defined on a per-progress-model basis. |
Having skip list per model is not maintainable. If at some point we notice that e.g., the arch can also influence these things we need to do yet another level of splitting.
As long as we do not lose the information we currently have (e.g., removing the lines is a "no go"), it should be fine. |
Using providers to skip the tests would effectively also just be a skiplist per model, but just written in code (unless you find a way to skip tests based on something other than their name). |
Little update: this branch can give wrong verdicts on some simple cases where SCCP eliminates loop counter variables and the loop termination condition entirely.
The above program is wrongly considered non-terminating if unrolled insufficiently. EDIT: This is fixed now |
I suspect that the problem is not SCCP, but rather |
It's both actually. SCCP makes the assignments dead and even if they were not deleted I would like my code to not care about variables that are not live inside the loop (I have not implemented this yet, but I will eventually). |
…detection: this removes a wrong verdict related to SCCP removing the whole loop body. Added new test related to above issue. Minor updates to related code.
How necessary is it to support non-termination due to blocked |
The best would be to have support for non-termination for a given encoding of |
Then I will copy over whatever we are doing right now for blocked control barriers into the new non-termination detection. |
Yes, cause I didn't implement control barriers at all, i.e., there was not a single line of code handling them. Pure control-barrier nontermination actually requires nothing of the new theory: it is a standard reachability problem as no thread does anything anymore and thus is witnessed by standard finite executions. The most interesting type of non-termination would involve both barriers and loops: some threads get stuck in barriers while others are looping (possibly even with barriers inside the loop). |
Added consideration of ControlBarriers in non-termination (WIP)
I actually got the order of cmds wrong, the optimization should be done on the binary
|
…old version) - Deleted old LivenessEncoder code
… (asymmetric) non-termination. Cleaned up NonTerminationEncoder and updated comments.
…tation for better error reporting
I have implemented a (more or less) complete detection of loop non-termination. It can detect all liveness violations in the Vulkan litmus tests, no matter how intricate they are. The detection relies on a new instrumentation pass called Future work:
Issues:
EDIT: The Btw. this superimposed unrolling also gives a nice monotonicity property which ensures that the executions of a |
The second issue is to be expected since I re-enabled all tests in 40624e9. I can take care of this. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I still have not checked the main class NonTerminationEncoder
(probably wont have time before the end of the week), but here are already some small comments.
dartagnan/src/main/java/com/dat3m/dartagnan/configuration/OptionNames.java
Outdated
Show resolved
Hide resolved
dartagnan/src/main/java/com/dat3m/dartagnan/encoding/EncodingContext.java
Outdated
Show resolved
Hide resolved
return new LivenessEncoder().encodeLivenessBugs(); | ||
private TrackableFormula encodeNontermination() { | ||
final BooleanFormula hasNontermination = new NonTerminationEncoder(context.getTask(), context).encodeNontermination(); | ||
return new TrackableFormula(context.getBooleanFormulaManager().not(LIVENESS.getSMTVariable(context)), hasNontermination); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we use NON_TERMINATION
instead of LIVENESS
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We specify all properties in the positive (e.g. program_spec
rather than spec_violation
), so I think it should be LIVENESS
. However, the way we check this is by encoding the negative.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was not referring to the positive vs negative part, but rather about the term "liveness" ... at some point we even had some "deadlock/no-deadlock" terms and I thought that "termination" was maybe a more general one.
However, we use liveness for the property and the nice stdout msg when we find a violation, so for consistency we should probably keep it as it is.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, we could use TERMINATION
as the property. It would be more precise than liveness, since the latter can be understood in many ways.
If I rename it, dependent tools like Vsyncer
might need updates.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can take care of updating vsyncer once we release a new dartagnan and I change the used version there.
dartagnan/src/main/java/com/dat3m/dartagnan/program/processing/DynamicSpinLoopDetection.java
Outdated
Show resolved
Hide resolved
dartagnan/src/main/java/com/dat3m/dartagnan/program/processing/NonterminationDetection.java
Outdated
Show resolved
Hide resolved
final LoopAnalysis.LoopIterationInfo lastIter = loop.iterations().get(loop.iterations().size() - 1); | ||
final Event lastEvent = lastIter.getIterationEnd(); | ||
|
||
final Event bound = lastEvent.getSuccessor().getSuccessor(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does the unrolling guarantee that the bound even is add 2+ after the loop end? If so, document this assumption here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we guarantee nothing really. While unrolling itself guarantees this, every other pass may break it.
I can add a comment saying that this is a "naive" check. But even if this fails, the code remains correct.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was wrong. The code can accidentally treat a not-fully unrolled loop as fully unrolled and skip its instrumentation. This can result in wrong verdicts for non-termination.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- Why are we filtering loops before the instrumentation? Is it purely for performance reasons (i.e. not to instrument loops unnecessarily) or the instrumentation relies on loops "passing" this filtering?
- Can we instead search for the next bound event in the program and check if it belongs to the same loop?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- The filtering is performance-only. No need to check bounded loops for non-termination.
- We can search for the next bound event, but there is no way to 100% know if it belongs to the same loop without tagging the bound events somehow
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Firs pass up to method encodeInfixSuffixDecomposition()
. Will continue during the week
dartagnan/src/main/java/com/dat3m/dartagnan/encoding/NonTerminationEncoder.java
Outdated
Show resolved
Hide resolved
- We assume that if the suffix is consistent with the infix and co/fr-fair, then it must be "strongly" | ||
consistent. This may not be true and we could possibly report a liveness violation that is not consistently |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What does "strongly consistent" mean?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a notion I have in my upcoming paper. I'm not sure how far I should explain the concepts (I cannot write the paper inside the comments :P).
Anyways, a suffix (extension) is strongly consistent if it is consistent with every possible prefix (modulo some properties known about the prefix). I can add that as a short explanation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems I already added an explanation in the comments describing the class.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where? "(3) Suffix must be strongly fair/consistent:"? The word "strongly" is there, but I still do not understand what this is supposed to mean
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, the explanation of what this means is given exactly by the points after :).
For the most part it boils down to the fact that the suffix cannot really interact with the prefix, even if it was consistent(!).
That is why we require "strong" consistency which guarantees that not only the finite execution we currently see is consistent but that it also stays consistent when we append the suffix repeatedly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
guarantees that not only the finite execution we currently see is consistent but that it also stays consistent when we append the suffix repeatedly.
I find this much informative that the current We assume that if the suffix is consistent with the infix and co/fr-fair, then it must be "strongly" consistent
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure it is. But at some point I'm going to write a whole paper in the comments :).
I will add one more line of explanation.
repeatable. Fixing this would require additional checks relative to the memory model, possibly requiring | ||
encoding of the memory model or native handling in CAAT. | ||
|
||
TODO: We have not considered uninit reads, i.e., reads without rf-edges in the implementation. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What are the consequences of this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I haven't thought this through, but my gut-feeling is that a suffix could possibly read the initial value although it is not co-maximal (because there are no co-edges that witness non-maximality). This would result in a fr-edge from suffix into prefix and therefore violating strong consistency/fairness.
EDIT: Just to be clear about the consequence: We could report FAIL
although the non-terminating execution is actually unfair.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could report FAIL although the non-terminating execution is actually unfair.
Please at this as the last sentence of the TODO
dartagnan/src/main/java/com/dat3m/dartagnan/encoding/NonTerminationEncoder.java
Outdated
Show resolved
Hide resolved
dartagnan/src/main/java/com/dat3m/dartagnan/encoding/NonTerminationEncoder.java
Outdated
Show resolved
Hide resolved
dartagnan/src/main/java/com/dat3m/dartagnan/encoding/NonTerminationEncoder.java
Outdated
Show resolved
Hide resolved
dartagnan/src/main/java/com/dat3m/dartagnan/encoding/NonTerminationEncoder.java
Show resolved
Hide resolved
dartagnan/src/main/java/com/dat3m/dartagnan/encoding/NonTerminationEncoder.java
Outdated
Show resolved
Hide resolved
|
||
} | ||
|
||
// TODO: Check other types of events as well. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missing to detect equivalence of other methods would just make this analysis less precise, right? I.e. we might miss to detect some non-termination
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, currently it is the opposite: without the check it considers events equivalent rather than not equivalent.
For example, it will consider all fences equivalent, even if their tagging mismatches. The same is true for all other kinds of events.
dartagnan/src/main/java/com/dat3m/dartagnan/encoding/NonTerminationEncoder.java
Outdated
Show resolved
Hide resolved
@ThomasHaas please rebase and fix the conflicts |
# Conflicts: # dartagnan/src/main/java/com/dat3m/dartagnan/program/processing/CoreCodeVerification.java # dartagnan/src/test/resources/VULKAN-Liveness-Fair-expected.csv
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another pass up to "Strong suffix extension encoding"
|
||
void *thread(void *unused) | ||
{ | ||
__VERIFIER_loop_bound(5); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you are missing an #include dat3m.h
. Similar in other benchmarks
@@ -49,6 +49,7 @@ public class OptionNames { | |||
public static final String DYNAMIC_SPINLOOP_DETECTION = "program.processing.spinloops"; | |||
public static final String PROPAGATE_COPY_ASSIGNMENTS = "program.processing.propagateCopyAssignments"; | |||
public static final String REMOVE_ASSERTION_OF_TYPE = "program.processing.skipAssertionsOfType"; | |||
public static final String NONTERMINATION_INSTRUMENTATION = "program.processing.nonTerm"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We are already very verbose on the option names; why not program.processing.nonTermination
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure.
return bmgr.and(exitReached, bmgr.not(encodeBoundEventExec())); | ||
final BooleanFormula terminated = program.getThreadEventsWithAllTags(Tag.NONTERMINATION).stream() | ||
.map(CondJump.class::cast) | ||
.map(jump -> bmgr.not(bmgr.and(context.jumpCondition(jump), context.execution(jump)))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bound events are also tagged as NONTERMINATION
and as their condition is trivially true (they are gotos), the previous encoding is supersed by this one, correct?
What about other events tagged by NONTERMINATION
? Is it always guaranteed that if they jumped, then they did not terminate?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bound events are also tagged as NONTERMINATION and as their condition is trivially true (they are gotos), the previous encoding is supersed by this one, correct?
Yes.
What about other events tagged by NONTERMINATION? Is it always guaranteed that if they jumped, then they did not terminate?
Tag.NONTERMINATION
should designate - as its name says - executions that did not terminate (yet). If this is not the case, then the events are incorrectly marked. Currently, this is the case for loop bound events, spin loop events, and the events placed by the new pass.
That being said, we exclusively use it for loop-induced nontermination and not for barrier-induced one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My question is: in a jump that is tagged as non-terminated, non-termination requires (1) only execution (as with bound events), (2) execution + taking the jump, or (3) execution + not taking the jump?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The jump needs to be taken (which is always true for unconditional jumps that are executed).
return new LivenessEncoder().encodeLivenessBugs(); | ||
private TrackableFormula encodeNontermination() { | ||
final BooleanFormula hasNontermination = new NonTerminationEncoder(context.getTask(), context).encodeNontermination(); | ||
return new TrackableFormula(context.getBooleanFormulaManager().not(LIVENESS.getSMTVariable(context)), hasNontermination); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can take care of updating vsyncer once we release a new dartagnan and I change the used version there.
final LoopAnalysis.LoopIterationInfo lastIter = loop.iterations().get(loop.iterations().size() - 1); | ||
final Event lastEvent = lastIter.getIterationEnd(); | ||
|
||
final Event bound = lastEvent.getSuccessor().getSuccessor(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- Why are we filtering loops before the instrumentation? Is it purely for performance reasons (i.e. not to instrument loops unnecessarily) or the instrumentation relies on loops "passing" this filtering?
- Can we instead search for the next bound event in the program and check if it belongs to the same loop?
- Prefix stores must be co-before suffix stores | ||
- Only suffix reads can read from suffix stores, prefix/infix reads can not. | ||
- Suffix reads can only read from infix/suffix or from co-maximal stores in the prefix, | ||
if they access an address not stored to by infix/suffix. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
isn't this line guaranteed by the fact that if you read from prefix, it is co-maximal?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It does depend on how you read the sentence. I meant that the stores are co-maximal within the prefix, which by itself does not exclude the possibility that there are stores in the infix/suffix that are co-before/after the prefix one.
With some detailed analysis of the cases, it is possible to flatout require that the suffix only reads from globally co-maximal prefix stores.
I exploit this fact in the encoding.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have already a rather "fix/standard" meaning of what "co-maximal" means. I would rather say Suffix reads can only read from infix/suffix or from the last store in the prefix to the corresponding address
- We assume that if the suffix is consistent with the infix and co/fr-fair, then it must be "strongly" | ||
consistent. This may not be true and we could possibly report a liveness violation that is not consistently |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where? "(3) Suffix must be strongly fair/consistent:"? The word "strongly" is there, but I still do not understand what this is supposed to mean
repeatable. Fixing this would require additional checks relative to the memory model, possibly requiring | ||
encoding of the memory model or native handling in CAAT. | ||
|
||
TODO: We have not considered uninit reads, i.e., reads without rf-edges in the implementation. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could report FAIL although the non-terminating execution is actually unfair.
Please at this as the last sentence of the TODO
// FIXME: This is likely going to cause problems (wrong result) with executions where some thread is stuck in a loop | ||
// and another in a barrier. We could replace the code to check if ALL threads are stuck at a control-barrier | ||
// to avoid such wrong results. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Checking that ALL threads are stuck sounds wrong to me. If we have two sets of threads, each group synchronizing on a different barrier, if one group properly sync and the other not, we still want to report a violation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I mean that we check if all non-terminated threads are stuck in a barrier. Then if one group properly synchronizes, it will terminate or get stuck in the next barrier.
The idea of checking that all non-terminated threads are stuck in a barrier is that we do explicitly avoid these problematic cases of mixed non-termination at the cost of completeness.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, now it is clearer, but it would still change to We could replace the code to check if ALL stuck threads are stuck at a control-barrier to avoid such wrong results.
|
||
// Encodes the basic properties of the infix-suffix matching relation. | ||
// The semantics of what a matching actually implies is done by the above methods. | ||
// NOTE: We do |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
broken sentence
// Internal classes | ||
|
||
/* | ||
- A (possibly nonterminating) "Loop" contains a list of "NonterminationCase"s. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How can more than one iteration be non-terminating? Or is it that you consider any iteration after the non terminating also non terminating?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not more than one iteration, but there can be multiple ways in which a loop does not terminate. There are different iterations that can trigger a non-teminating situation and even multiple possiblities within a single iteration.
For example, if some loop/iteration is conditionally side-effect-free, then it may fail to terminate due to side-effect-free spinning (spinloop-tagged jump event) or due to side-effect-ful non-progress (other, nontermination-tagged jump).
Implementation-wise, the idea is that a single loop has special jump events tagged by Tag.NONTERMINATION
to designate where non-termination could occur. Every such event is a "nontermination case" of the loop.
This PR adds a new, improved encoding that can detect some cases of side-effectful non-termination.
EDIT: The below description is not fully accurate anymore.
I added 5 new benchmarks to specifically test this, and updated the verdicts of the existing liveness tests that are now correctly identified as FAIL rather than UNKNOWN.
The encoding is functional and replaces the old encoding for deadlocks/liveness issues.
So feel free to test this branch on some real code.
Important notes:
The reasons that this PR is marked as DRAFT are:
The new encoding currently does also detect liveness issue that are caused by assertion failures, which the old one did not. This is easy to fix but we also have no test cases for this yet.The new encoding does not reason about non-termination due to control barriers at all. We have no tests involving non-termination due to CBs.LitmusVulkanFairLivenessTest
will fail. We would need to be able to enable tests only for certain progress models and skip them for others.EDIT: Fairness of spuriously failing events likeStoreExclusive
is not considered so far. We also do not have tests for this scenario.Because of points (1) and (4), I have not yet deleted the old code (it is unused though).