Support alloc-free dev #794

CapZTr · 2025-01-21T09:13:48Z

No description provided.

Signed-off-by: Tianrui Zheng <[email protected]>

ThomasHaas

On a more general note about handling Free/Alloc: I'm not convinced that changing aliasing is the right concept for Alloc/Free. I think it is a misleading road.
What you really care about is not the individual addresses that the Alloc/Free targets, but only the memory objects as a whole. Similarly, you care about the objects a memory access can address if you want to talk about bugs like use-after-free.
So a canAccessSameObject(x, y) method would be more appropriate and would also alleviate the issue of requiring allocations of known size (the size does not matter!).
If you then introduce a corresponding sameObj relation in .cat, you should be able to treat Free and Alloc more-or-less as single-address events (i.e., simple memory events).
A use-after-free (or racy free) would then simply be ~empty ([Free];sameObj;[M] \ hb).
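
As a minimal sketch of the object-level check proposed above: the event and object types below are hypothetical stand-ins, not Dartagnan's classes, and the per-event object sets are assumed to come from some points-to analysis. The point is only that the check never looks at offsets or sizes, so an allocation of unknown size poses no problem.

```java
import java.util.Collections;
import java.util.Set;

// Hypothetical, simplified model; these are not Dartagnan's actual classes.
final class SameObjectSketch {

    // A memory object identified by name; a negative size stands for "unknown size".
    record MemObject(String name, long size) {}

    // An event (Alloc, Free, or memory access) together with the objects
    // an analysis says it may touch.
    record Ev(String label, Set<MemObject> mayTouch) {}

    // Object-level check: compares object sets only, never offsets or sizes.
    static boolean canAccessSameObject(Ev x, Ev y) {
        return !Collections.disjoint(x.mayTouch(), y.mayTouch());
    }

    public static void main(String[] args) {
        MemObject buf = new MemObject("buf", -1);   // e.g. malloc(n) with n unknown
        MemObject other = new MemObject("other", 8);
        Ev free = new Ev("free(p)", Set.of(buf));
        Ev load = new Ev("load p[i]", Set.of(buf, other));
        Ev store = new Ev("store q", Set.of(other));
        System.out.println(canAccessSameObject(free, load));  // both may touch buf
        System.out.println(canAccessSameObject(free, store)); // no common object
    }
}
```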

ThomasHaas · 2025-01-21T10:12:44Z

dartagnan/src/main/java/com/dat3m/dartagnan/program/analysis/alias/AliasAnalysis.java

+    boolean mustAlias(Alloc a, MemoryCoreEvent e);
+
+    boolean mayAlias(Alloc a, MemoryCoreEvent e);
+
+    boolean mustAlias(Alloc a, MemFree f);
+
+    boolean mayAlias(Alloc a, MemFree f);
+
+    boolean mustAlias(MemFree a, MemFree b);
+
+    boolean mayAlias(MemFree a, MemFree b);


I know this is a draft, but you don't really want to go with so many overloads, do you?

natgavrilenko · 2025-01-21T10:41:44Z

On a more general note about handling Free/Alloc: I'm not convinced that changing aliasing is the right concept for Alloc/Free. I think it is a misleading road. What you really care about is not the individual addresses that the Alloc/Free targets, but only the memory objects as a whole. Similarly, you care about the objects a memory access can address if you want to talk about bugs like use-after-free. So a canAccessSameObject(x, y) method would be more appropriate and would also alleviate the issue of requiring allocations of known size (the size does not matter!). If you then introduce a corresponding sameObj relation in .cat, you should be able to treat Free and Alloc more-or-less as single-address events (i.e., simple memory events). A use-after-free (or racy free) would then simply be ~empty ([Free];sameObj;[M] \ hb).

I think you didn't really check what the code is doing. We need both, individual addresses (i.e. the pointer returned by alloc) and the full allocated memory region to compare with addresses of memory accesses.

ThomasHaas · 2025-01-21T10:50:23Z

Why do you need the individual addresses in the full region if you instead had an accessesSameObject(x, y) method? This would hold true for an Alloc that is considered accessing the memory object it allocates and a memory access to anywhere inside that object.
Btw. the most compact representation of the memory region of a memory object is just the memory object itself, rather than the individual addresses inside. This makes the approach viable even for objects of unknown/unbounded size.

natgavrilenko · 2025-01-21T10:55:50Z

Why do you need the individual addresses in the full region if you instead had an accessesSameObject(x, y) method? This would hold true for an Alloc that is considered accessing the memory object it allocates and a memory access to anywhere inside that object. Btw. the most compact representation of the memory region of a memory object is just the memory object itself, rather than the individual addresses inside. This makes the approach viable even for objects of unknown/unbounded size.

please read the code first

ThomasHaas · 2025-01-21T10:59:29Z

Huh, I read the code?! I can see all the checks about bounded-size allocations, and mayAlias(Alloc x, MemoryCoreEvent y) checking that y accesses something inside x. What else should I read in this PR?

hernanponcedeleon · 2025-01-21T19:37:20Z

I'm not convinced that changing aliasing is the right concept for Alloc/Free. I think it is a misleading road ... So a canAccessSameObject(x, y) method would be more appropriate

Isn't the latter some kind of alias (maybe not as fine grained as per address, but at least per object)?

and would also alleviate the issue of requiring allocations of known size (the size does not matter!).

This is assuming no OOB, right?

If you then introduce a corresponding sameObj relation in .cat, you should be able to treat Free and Alloc more-or-less as single-address events (i.e., simple memory events). A use-after-free (or racy free) would then simply be ~empty ([Free];sameObj;[M] \ hb).

Isn't this kind of what the two new relations allocptr and allocmem are doing? Those relations are not visible in the diff of this PR (which targets the initial draft from Natalia), but maybe this is the source of the misunderstanding.

ThomasHaas · 2025-01-21T21:33:29Z

I'm not convinced that changing aliasing is the right concept for Alloc/Free. I think it is a misleading road ... So a canAccessSameObject(x, y) method would be more appropriate

Isn't the latter some kind of alias (maybe not as fine grained as per address, but at least per object)?

Yes, it is a kind of aliasing. But I think it is worth differentiating between the concepts of "same address", "overlapping address" and "same object". Covering them all under the term "aliasing" is bound to cause confusion.
Importantly, sameObject is actually easier to compute/reason about because we only ever have finitely many objects but possibly unboundedly many addresses.
For example, our newest alias analysis was not updated in this PR which I think is only partly because of its difficulty, but also partly because it does not compute explicit addresses like the other alias analyses do. However, the analysis does have information about which memory objects a memory event may access, and this is sufficient to implement the desired feature.
That being said, the implementation of canAccessSameObject for the other alias analyses would pretty much coincide with what this PR does.
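
To make the three notions concrete, here is a small sketch. The Access record (base object, byte offset, access width) is a hypothetical descriptor introduced only for illustration, and every access is assumed to have one exactly known address. It shows that the predicates get strictly coarser from "same address" to "same object".

```java
// Hypothetical access descriptor; not a Dartagnan type.
final class AliasNotionsSketch {

    record Access(String object, long offset, long width) {}

    // Same base object and same start offset.
    static boolean sameAddress(Access a, Access b) {
        return a.object().equals(b.object()) && a.offset() == b.offset();
    }

    // Same base object and intersecting byte ranges [offset, offset + width).
    static boolean overlappingAddress(Access a, Access b) {
        return a.object().equals(b.object())
                && a.offset() < b.offset() + b.width()
                && b.offset() < a.offset() + a.width();
    }

    // Same base object, regardless of offsets.
    static boolean sameObject(Access a, Access b) {
        return a.object().equals(b.object());
    }

    public static void main(String[] args) {
        Access wide = new Access("x", 0, 8);
        Access inner = new Access("x", 4, 4);
        Access far = new Access("x", 16, 4);
        System.out.println(sameAddress(wide, inner));        // different start offsets
        System.out.println(overlappingAddress(wide, inner)); // bytes 4..7 are shared
        System.out.println(sameObject(wide, far));           // both inside object x
    }
}
```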

and would also alleviate the issue of requiring allocations of known size (the size does not matter!).

This is assuming no OOB, right?

All alias analyses assume no OOB anyways. So nothing changes there.

If you then introduce a corresponding sameObj relation in .cat, you should be able to treat Free and Alloc more-or-less as single-address events (i.e., simple memory events). A use-after-free (or racy free) would then simply be ~empty ([Free];sameObj;[M] \ hb).

Isn't this kind of what the two new relations allocptr and allocmem are doing? Those relations are not visible in the diff of this PR (which targets the initial draft from Natalia), but maybe this is the source of the misunderstanding.

Yes, I know about those relations. I'm saying that you can likely replace them by a general sameObj relation. At least concept-wise, this is worth considering even if the multiple new base relations are preferred for some other reason. Either way, the underlying idea of those relations is still based more on object-based reasoning rather than address-based reasoning.
Lastly, encoding sameObj can be easier than sameAddress, especially if we eventually use a provenance-based pointer model like the one mentioned in #793.

natgavrilenko · 2025-01-23T16:35:21Z

dartagnan/src/main/java/com/dat3m/dartagnan/program/analysis/alias/AliasAnalysis.java

+        public boolean mayAlias(MemFree a, MemFree b) {
+            return a1.mayAlias(a, b) && a2.mayAlias(a, b);
+        }
+
        @Override
        public Graphviz getGraphVisualization() {


We will also need to add the new events to the graph

natgavrilenko · 2025-01-31T15:07:42Z

dartagnan/src/main/java/com/dat3m/dartagnan/wmm/analysis/NativeRelationAnalysis.java

@@ -774,14 +774,24 @@ public MutableKnowledge visitAllocPtr(AllocPtr aref) {
            MutableEventGraph must = new MapEventGraph();


The TODOs can be removed now from both relations.

Could you also rename parameter aref -> allocPtr and aloc -> allocMem to match the other names in the class? I forgot to rename them after changing relation names.

natgavrilenko · 2025-01-31T15:07:50Z

dartagnan/src/main/java/com/dat3m/dartagnan/wmm/analysis/NativeRelationAnalysis.java

@@ -774,14 +774,24 @@ public MutableKnowledge visitAllocPtr(AllocPtr aref) {
            MutableEventGraph must = new MapEventGraph();
            for (Alloc e1 : program.getThreadEvents(Alloc.class)) {
                if (e1.isHeapAllocation()) {


We can merge these two loops into one, something like this:
List<Alloc> allocEvents = program.getThreadEvents(Alloc.class).stream().filter(a -> a.isHeapAllocation()).toList();
List<MemFree> freeEvents = program.getThreadEvents(MemFree.class);
Stream.concat(allocEvents.stream(), freeEvents.stream()).forEach(e1 -> freeEvents.forEach(e2 -> {
    ...
}));
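
A self-contained sketch of that merged iteration, for reference. Event, Alloc and MemFree here are hypothetical stand-ins for the real event classes, and the loop body simply records which (e1, e2) pairs are visited instead of filling the may/must graphs.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Stream;

final class MergedLoopSketch {

    // Hypothetical stand-ins for the event classes named in the comment.
    interface Event {}
    record Alloc(String name, boolean heap) implements Event {
        boolean isHeapAllocation() { return heap; }
    }
    record MemFree(String name) implements Event {}

    // Collects every (e1, e2) pair the merged loop visits: e1 ranges over heap
    // allocations and then frees, e2 ranges over frees.
    static List<List<Event>> pairs(List<Alloc> allocs, List<MemFree> frees) {
        List<Alloc> heapAllocs = allocs.stream().filter(Alloc::isHeapAllocation).toList();
        List<List<Event>> visited = new ArrayList<>();
        Stream.<Event>concat(heapAllocs.stream(), frees.stream())
                .forEach(e1 -> frees.forEach(e2 -> visited.add(List.of(e1, e2))));
        return visited;
    }

    public static void main(String[] args) {
        List<Alloc> allocs = List.of(new Alloc("a1", true), new Alloc("a2", false));
        List<MemFree> frees = List.of(new MemFree("f1"), new MemFree("f2"));
        pairs(allocs, frees).forEach(System.out::println);
    }
}
```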

natgavrilenko · 2025-01-31T16:03:40Z

dartagnan/src/main/java/com/dat3m/dartagnan/wmm/analysis/NativeRelationAnalysis.java

@@ -795,8 +805,13 @@ public MutableKnowledge visitAllocMem(AllocMem aloc) {
            MutableEventGraph must = new MapEventGraph();
            for (Alloc e1 : program.getThreadEvents(Alloc.class)) {
                if (e1.isHeapAllocation()) {
-                    for (Event e2 : program.getThreadEvents(MemoryEvent.class)) {
-                        may.add(e1, e2);
+                    for (MemoryCoreEvent e2 : program.getThreadEvents(MemoryCoreEvent.class)) {


Could you create a list of MemoryCoreEvent before the first loop, so that we don't need to call getThreadEvents each time?

natgavrilenko · 2025-01-31T16:05:01Z

dartagnan/src/main/java/com/dat3m/dartagnan/wmm/analysis/NativeRelationAnalysis.java

@@ -795,8 +805,13 @@ public MutableKnowledge visitAllocMem(AllocMem aloc) {
            MutableEventGraph must = new MapEventGraph();
            for (Alloc e1 : program.getThreadEvents(Alloc.class)) {
                if (e1.isHeapAllocation()) {


I think it should be easy to extend this approach to stack events, but I guess you want to get the heap part ready first.

natgavrilenko · 2025-01-31T16:21:42Z

dartagnan/src/main/java/com/dat3m/dartagnan/program/analysis/alias/AndersenAliasAnalysis.java

-            Set<Location> target = targets.get(address);
-            addresses = target != null ? target : getAddresses(address);
+        if (addrExpr instanceof Register register) {
+            Set<Location> target = targets.get(register);


This can be the same addrExpr (without the cast).

natgavrilenko · 2025-01-31T16:42:34Z

dartagnan/src/main/java/com/dat3m/dartagnan/program/analysis/alias/AndersenAliasAnalysis.java

-        return getMaxAddressSet(e).stream().anyMatch(
-                l -> l.base.equals(a.getAllocatedObject()) && l.offset < getAllocatedSize(a)
-        );
+    public boolean mayAlias(Event a, Event b) {


I think this typecheck is a bit of overkill. We have two cases:

A pair of alloc and memory event -> we need to check if memory event may/must access any address from the allocated region.

Everything else, including alloc-free and free-free pairs -> we need to check the "normal" may/must alias for a single pointer.

How about something like this?

@Override
public boolean mayAlias(Event a, Event b) {
    if (a instanceof Alloc alloc && b instanceof MemoryCoreEvent mem) {
        return mayAccessAllocatedBy(alloc, mem);
    }
    if (b instanceof Alloc alloc && a instanceof MemoryCoreEvent mem) {
        // This case shouldn't be called because the alloc->mem relation always starts at alloc; we can keep both to be on the safe side.
        return mayAccessAllocatedBy(alloc, mem);
    }
    return mayAccessSameAddress(a, b);
}

And the same for must sets and the other analysis classes.
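
For illustration, the two-case dispatch can be exercised in a toy setting. The types below are hypothetical simplifications: every event carries one exactly known pointer (base object plus offset), so "may" and "must" coincide here, whereas a real analysis works with sets of possible addresses.

```java
final class AliasDispatchSketch {

    // Hypothetical event model: each event exposes the single pointer it uses.
    interface Event { String base(); long offset(); }
    interface MemoryCoreEvent extends Event {}
    record Alloc(String base, long size) implements Event {
        public long offset() { return 0; } // the returned pointer is the region start
    }
    record MemFree(String base, long offset) implements Event {}
    record Load(String base, long offset) implements MemoryCoreEvent {}

    static boolean mayAlias(Event a, Event b) {
        if (a instanceof Alloc alloc && b instanceof MemoryCoreEvent mem) {
            return mayAccessAllocatedBy(alloc, mem);
        }
        if (b instanceof Alloc alloc && a instanceof MemoryCoreEvent mem) {
            return mayAccessAllocatedBy(alloc, mem);
        }
        // Alloc-free, free-free and ordinary access pairs compare single pointers.
        return mayAccessSameAddress(a, b);
    }

    // Case 1: does the access fall anywhere inside the allocated region?
    static boolean mayAccessAllocatedBy(Alloc alloc, MemoryCoreEvent mem) {
        return alloc.base().equals(mem.base())
                && mem.offset() >= 0 && mem.offset() < alloc.size();
    }

    // Case 2: do both events use the same pointer?
    static boolean mayAccessSameAddress(Event a, Event b) {
        return a.base().equals(b.base()) && a.offset() == b.offset();
    }

    public static void main(String[] args) {
        Alloc alloc = new Alloc("obj", 8);
        System.out.println(mayAlias(alloc, new Load("obj", 4)));    // inside the region
        System.out.println(mayAlias(alloc, new MemFree("obj", 4))); // not the alloc pointer
    }
}
```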

natgavrilenko · 2025-01-31T16:44:54Z

dartagnan/src/main/java/com/dat3m/dartagnan/program/analysis/alias/AndersenAliasAnalysis.java

-    @Override
-    public boolean mayAlias(MemFree a, MemFree b) {
-        return !Sets.intersection(getFreedAddresses(a), getFreedAddresses(b)).isEmpty();
+    private boolean mayAccessAllocatedBy(Alloc a, Event e) {


I would keep only the first part and handle the alloc-free pair in may/mustAccessSameAddress.

natgavrilenko · 2025-01-31T16:48:50Z

dartagnan/src/main/java/com/dat3m/dartagnan/program/analysis/alias/AndersenAliasAnalysis.java

        }
    }

    private void processAllocs(Alloc a) {
-        if (!a.isHeapAllocation()) {
-            return;
-        }
        Register r = a.getResultRegister();
        if (a.getAllocationSize() instanceof IntLiteral i) {


We can add alloc to eventAddressSpaceMap and then check alloc-free pair via getMaxAddressSet:

eventAddressSpaceMap.put(a, Set.of(new Location(a.getAllocatedObject(), 0)));
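
A rough sketch of how that could play out, under simplifying assumptions: events are identified by plain string labels, Location is a hypothetical (base, offset) record, and an event missing from the map is treated as having an empty address set, which may differ from what the real getMaxAddressSet does.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

final class AllocFreePointerSketch {

    record Location(String base, int offset) {}

    // Hypothetical analogue of eventAddressSpaceMap, keyed by an event label.
    final Map<String, Set<Location>> eventAddressSpaceMap = new HashMap<>();

    // An allocation's pointer is the first address of the allocated object.
    void processAlloc(String allocEvent, String allocatedObject) {
        eventAddressSpaceMap.put(allocEvent, Set.of(new Location(allocatedObject, 0)));
    }

    // A free is registered with the set of pointers it may be called on.
    void processFree(String freeEvent, Set<Location> mayFree) {
        eventAddressSpaceMap.put(freeEvent, mayFree);
    }

    Set<Location> getMaxAddressSet(String event) {
        return eventAddressSpaceMap.getOrDefault(event, Set.of());
    }

    // Alloc-free pairs are now handled like any other pair of pointers.
    boolean mayAlias(String a, String b) {
        return !Collections.disjoint(getMaxAddressSet(a), getMaxAddressSet(b));
    }

    public static void main(String[] args) {
        AllocFreePointerSketch s = new AllocFreePointerSketch();
        s.processAlloc("alloc", "obj");
        s.processFree("free(p)", Set.of(new Location("obj", 0)));
        s.processFree("free(p+4)", Set.of(new Location("obj", 4)));
        System.out.println(s.mayAlias("alloc", "free(p)"));
        System.out.println(s.mayAlias("alloc", "free(p+4)"));
    }
}
```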

natgavrilenko · 2025-01-31T16:57:53Z

dartagnan/src/main/java/com/dat3m/dartagnan/program/analysis/alias/EqualityAliasAnalysis.java

+        throw new IllegalArgumentException("Unsupported event types for EqualityAliasAnalysis");
+    }
+
+    private boolean mustAccessSameAddress(MemoryCoreEvent a, MemoryCoreEvent b) {


I think we don't need all this complex reasoning in mayAlias/mustAlias methods. In the old comment, I meant that we can implement mustAccessAllocatedBy(Alloc a, MemoryCoreEvent e) following the same logic as for mustAccessSameAddress, i.e. must access is true if 1) the memory event uses the register of the alloc event and 2) the register value has not been overwritten.

We can also try reasoning about register + index accesses, but it will be a bit more complex, because we also need to consider sizes of the member elements and offsets.
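
The two conditions can be sketched on a toy straight-line program. This is only an illustration under strong assumptions: a single thread without branches, a hypothetical Step record standing in for events, and no handling of register + index accesses.

```java
import java.util.List;

final class MustAccessSketch {

    // Hypothetical step of a straight-line program: the register it writes
    // (or null) and the register it uses as a memory address (or null).
    record Step(String writes, String addressReg) {}

    // True if the step at index `access` addresses memory through the register
    // written by the allocation at index `alloc`, and no step in between
    // overwrites that register.
    static boolean mustAccessAllocatedBy(List<Step> program, int alloc, int access) {
        String reg = program.get(alloc).writes();
        if (reg == null || alloc >= access
                || !reg.equals(program.get(access).addressReg())) {
            return false; // condition 1 fails: the access does not use the alloc register
        }
        for (int i = alloc + 1; i < access; i++) {
            if (reg.equals(program.get(i).writes())) {
                return false; // condition 2 fails: register overwritten in between
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<Step> program = List.of(
                new Step("r1", null),  // r1 <- alloc
                new Step("r2", null),  // unrelated write
                new Step(null, "r1")); // access through r1
        System.out.println(mustAccessAllocatedBy(program, 0, 2));
    }
}
```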

natgavrilenko · 2025-01-31T17:05:43Z

.../src/main/java/com/dat3m/dartagnan/program/analysis/alias/InclusionBasedPointerAnalysis.java

        final DerivedVariable vx = addressVariables.get(x);
        final DerivedVariable vy = addressVariables.get(y);
        return vx != null && vy != null && vx.base == vy.base && vx.modifier.offset == vy.modifier.offset &&
                isConstant(vx.modifier) && isConstant(vy.modifier);
    }

+    private boolean mayAccessAllocatedBy(Alloc a, Event e) {


I will carefully check it and add comments a bit later.

xeren · 2025-02-05T13:45:12Z

You may add the following methods to AliasAnalysis and use them instead:

boolean mayObjectAlias(Event a, Event b);
boolean mustObjectAlias(Event a, Event b);

Implementing it in InclusionBasedPointerAnalysis:

@Override
public boolean mayObjectAlias(Event a, Event b) {
    DerivedVariable addressA = addressVariables.get(a);
    DerivedVariable addressB = addressVariables.get(b);
    return addressA == null || addressB == null ||
            !Collections.disjoint(getAccessibleObjects(addressA), getAccessibleObjects(addressB));
}
@Override
public boolean mustObjectAlias(Event a, Event b) {
    DerivedVariable addressA = addressVariables.get(a);
    DerivedVariable addressB = addressVariables.get(b);
    if (addressA == null || addressB == null) {
        return false;
    }
    if (addressA.base == addressB.base) {
        return true;
    }
    Set<MemoryObject> objectsA = getAccessibleObjects(addressA);
    return objectsA.size() == 1 && objectsA.equals(getAccessibleObjects(addressB));
}
private Set<MemoryObject> getAccessibleObjects(DerivedVariable address) {
    var objects = new HashSet<MemoryObject>();
    objects.add(address.base.object);
    for (IncludeEdge edge : address.base.includes) {
        objects.add(edge.source.object);
    }
    objects.remove(null);
    return objects;
}
private void run(Program program, AliasAnalysis.Config configuration) {
    ...
    for (Alloc alloc : program.getThreadEvents(Alloc.class)) {
        addressVariables.put(alloc, derive(objectVariables.get(alloc.getAllocatedObject())));
    }
    ...
}

Implementing it in EqualityAliasAnalysis:

@Override
public boolean mayObjectAlias(MemoryCoreEvent a, MemoryCoreEvent b) {
    return true;
}
@Override
public boolean mustObjectAlias(MemoryCoreEvent a, MemoryCoreEvent b) {
    return mustAlias(a, b);
}

Implementation for CombinedAliasAnalysis:

@Override
public boolean mayObjectAlias(MemoryCoreEvent a, MemoryCoreEvent b) {
    return a1.mayObjectAlias(a, b) && a2.mayObjectAlias(a, b);
}
@Override
public boolean mustObjectAlias(MemoryCoreEvent a, MemoryCoreEvent b) {
    return a1.mustObjectAlias(a, b) || a2.mustObjectAlias(a, b);
}

Implementing it in AndersenAliasAnalysis and FieldSensitiveAndersen:

@Override
public boolean mayObjectAlias(MemoryCoreEvent a, MemoryCoreEvent b) {
    return !Collections.disjoint(getAccessibleObjects(a), getAccessibleObjects(b));
}
@Override
public boolean mustObjectAlias(MemoryCoreEvent a, MemoryCoreEvent b) {
    Set<MemoryObject> objects = getAccessibleObjects(a);
    return objects.size() == 1 && objects.containsAll(getAccessibleObjects(b));
}
private Set<MemoryObject> getAccessibleObjects(MemoryCoreEvent event) {
    var objects = new HashSet<MemoryObject>();
    for (Location location : getMaxAddressSet(event)) {
        objects.add(location.base);
    }
    return objects;
}
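
The combination rule used for CombinedAliasAnalysis above (AND for may, OR for must) can be tried out in isolation. In this sketch the ObjectAlias interface and both toy analyses are hypothetical, and events are plain strings. The rule rests on each analysis being sound on its own: may-results over-approximate, so intersecting them stays sound, and must-results under-approximate, so taking their union stays sound.

```java
final class CombinedObjectAliasSketch {

    // Hypothetical two-method interface mirroring the proposed pair of queries.
    interface ObjectAlias {
        boolean mayObjectAlias(String a, String b);
        boolean mustObjectAlias(String a, String b);
    }

    // Knows nothing: any two events may share an object, none must.
    static final ObjectAlias COARSE = new ObjectAlias() {
        public boolean mayObjectAlias(String a, String b) { return true; }
        public boolean mustObjectAlias(String a, String b) { return false; }
    };

    // Toy precise analysis: the event label names the single object it touches.
    static final ObjectAlias BY_NAME = new ObjectAlias() {
        public boolean mayObjectAlias(String a, String b) { return a.equals(b); }
        public boolean mustObjectAlias(String a, String b) { return a.equals(b); }
    };

    static ObjectAlias combine(ObjectAlias a1, ObjectAlias a2) {
        return new ObjectAlias() {
            public boolean mayObjectAlias(String a, String b) {
                return a1.mayObjectAlias(a, b) && a2.mayObjectAlias(a, b);
            }
            public boolean mustObjectAlias(String a, String b) {
                return a1.mustObjectAlias(a, b) || a2.mustObjectAlias(a, b);
            }
        };
    }

    public static void main(String[] args) {
        ObjectAlias combined = combine(COARSE, BY_NAME);
        System.out.println(combined.mayObjectAlias("x", "y"));
        System.out.println(combined.mustObjectAlias("x", "x"));
    }
}
```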

CapZTr · 2025-02-05T14:30:47Z

Are you suggesting that we only care about the base object and don't consider the offset at all? Won't this lose precision in the analysis when the object's size is a known integer?

ThomasHaas · 2025-02-05T14:33:08Z

Not really. You would only get precision in the presence of out-of-bounds accesses (UB), which we don't handle correctly either way.
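
A small sketch of why the offset comparison adds nothing for in-bounds accesses. Location is a hypothetical (base, offset) record and the size is assumed to be a known constant. The region check and the base-only check can disagree only when the location lies outside the object.

```java
final class InBoundsPrecisionSketch {

    record Location(String base, long offset) {}

    // Region-level check: same base and offset within [0, size).
    static boolean regionCheck(String object, long size, Location l) {
        return l.base().equals(object) && l.offset() >= 0 && l.offset() < size;
    }

    // Object-level check: same base only.
    static boolean objectCheck(String object, Location l) {
        return l.base().equals(object);
    }

    // True iff the two checks give different answers for this location.
    static boolean differ(String object, long size, Location l) {
        return regionCheck(object, size, l) != objectCheck(object, l);
    }

    public static void main(String[] args) {
        for (long offset = 0; offset <= 8; offset++) {
            Location l = new Location("obj", offset);
            System.out.println(offset + ": " + differ("obj", 8, l));
        }
    }
}
```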

natgavrilenko · 2025-02-05T18:23:31Z

dartagnan/src/main/java/com/dat3m/dartagnan/wmm/analysis/NativeRelationAnalysis.java

-                    for (Event e2 : program.getThreadEvents(MemoryEvent.class)) {
-                        may.add(e1, e2);
+                    for (MemoryCoreEvent e2 : program.getThreadEvents(MemoryCoreEvent.class)) {
+                        if (alias.mayAlias(e1, e2)) {


We can skip e2 if it is an instance of Init
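
One way to do this, sketched with hypothetical stand-in types (it assumes Init is a kind of MemoryCoreEvent, which is not shown in this thread), is to build the candidate list once before the loops and drop Init events at that point:

```java
import java.util.List;

final class SkipInitSketch {

    // Hypothetical stand-ins for the event hierarchy.
    interface MemoryCoreEvent {}
    record Init(String name) implements MemoryCoreEvent {}
    record Load(String name) implements MemoryCoreEvent {}

    // Built once, before iterating over allocations, with Init events removed.
    static List<MemoryCoreEvent> candidates(List<MemoryCoreEvent> all) {
        return all.stream().filter(e -> !(e instanceof Init)).toList();
    }

    public static void main(String[] args) {
        List<MemoryCoreEvent> all = List.of(new Init("i0"), new Load("l0"), new Load("l1"));
        System.out.println(candidates(all));
    }
}
```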

natgavrilenko · 2025-02-05T18:58:05Z

.../src/main/java/com/dat3m/dartagnan/program/analysis/alias/InclusionBasedPointerAnalysis.java

@@ -202,6 +332,9 @@ private void run(Program program, AliasAnalysis.Config configuration) {
        for (final MemoryCoreEvent memoryEvent : program.getThreadEvents(MemoryCoreEvent.class)) {
            processMemoryEvent(memoryEvent);
        }
+        for (final MemFree free : program.getThreadEvents(MemFree.class)) {


Similar to the other analyses, Alloc -> MemFree should check for an exact pointer match, not the whole allocated region. So, in order to reuse the same algorithm, you need to add allocs to addressVariables.

natgavrilenko · 2025-02-05T19:34:14Z

.../src/main/java/com/dat3m/dartagnan/program/analysis/alias/InclusionBasedPointerAnalysis.java

+                    continue;
+                }
+                final Modifier m = compose(i.modifier, v.modifier);
+                final boolean may = isConstant(m) ? m.offset < size && m.offset >= 0


I would use Rene's suggestion and compare only base objects when checking the whole memory region.

Tianrui Zheng and others added 12 commits January 9, 2025 10:34

init alias analysis for alloc (805ce89)
add MemFree events to the test (9763bc9)
add first test for alias analysis (43a1e04)
add second test for alias analysis (0e71a8e)
improve field insensitive for alloc (fcbd440)
add third test for alias analysis (bbbce63)
add alias analysis for free events (c90ec8e)
use alias analysis for alloc and free in relation analysis (55850ee)
change native relation analysis (4c84cb6)
change logic of mustAlias (f3acd26)
Using stack pointer in alloc tests (7ff6354)
change exception type (8481445)

ThomasHaas reviewed Jan 21, 2025
natgavrilenko reviewed Jan 23, 2025

Add full alias analysis (ab9bc56)

natgavrilenko reviewed Jan 31, 2025
natgavrilenko reviewed Feb 5, 2025

Add object alias for alloc -> memory event (1168dfa)
