Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for OpenCL SPIR-V #792

Open
wants to merge 1 commit into
base: development
Choose a base branch
from

Conversation

tonghaining
Copy link
Contributor

No description provided.

@ThomasHaas
Copy link
Collaborator

Oh, this is a big one, but I immediately have two questions:

(1) What is the purpose of ScopedPointerVariable? It seems to me that in the whole processing pipeline and the final encoding, it is always treated like a simple address/memory object. If so, a processing pass that just eliminates them altogether would be a lot better.

(2) Why does FunctionCall need to remember expressions transformers that are used during Inlining? Is this somehow related to the scoped pointers?

@tonghaining
Copy link
Contributor Author

(1) Thanks for pointing this out! Yes ScopedPointerVariable visitors are not necessary. I think I initially added them while working on accessing function parameters but forgot to remove them afterward. I’ve now removed all of them.

(2) It was not relevant to the scoped pointers. We now have new cases where one test has multiple functions.

@ThomasHaas
Copy link
Collaborator

(2) It was not relevant to the scoped pointers. We now have new cases where one test has multiple functions

I don't understand this point. C-code also has multiple functions, yet we do not need special transformers when inlining.
What is different for SPIRV/OpenCL code? Do you need to translate pointers to thread_local memory objects or something?

@tonghaining
Copy link
Contributor Author

Yes for pointers inside a function call, we sometimes need to translate them into thread-local/workgroup-local memory objects. Does C-code has similar cases?

@ThomasHaas
Copy link
Collaborator

Ok, I guessed so. But then you are facing a different issue.
In C-code, we first do inlining and then create the threads, so the thread creation pass actually resolves/translates thread-local memory accesses.
In your case, I believe the parser already creates threads during parsing time, i.e., before inlining.

I think there are two canonical solutions to this:
(1) Do not create threads during parsing time. Instead, we could mark a function as "entry point" (in C-code this is just main, for litmus-like code it would be each process) and let the thread creation pass generate the threads. In this case you would actually need to attach GPU-specific scope information already to functions instead of threads.
(2) Let Inlining do its usual job and have an extra pass that resolves/translates thread-local accesses if possible. This translation should not make use of custom expression transformers attached to function calls. If anything, I think the necessary information should be attached to the threads (or even better: be computable from the threads).

Copy link
Collaborator

@natgavrilenko natgavrilenko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First part of the review, will continue tomorrow.

Copy link
Owner

@hernanponcedeleon hernanponcedeleon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just saw Natalia stated her review too, so some comments might be repeated

public List<Event> visitSpirvLoad(SpirvLoad e) {
String mo = moToOpenCLTag(Tag.Spirv.getMoTag(e.getTags()));
Load load = newLoadWithMo(e.getResultRegister(), e.getAddress(), mo);
load.setFunction(e.getFunction());

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do spirv events need to propagate the function from the original event?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should definitely be avoided because it causes inconsistent state: the event thinks it belongs to a function, but it is not inserted into the function yet. Basically this function should never be called directly (maybe we can make it private, not sure though).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For single event they are not useful, but for RMW events which are translated to C11 event sequences, functions are needed for creating dummy registers.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

VisitorSpirvVulkan is translated to Vulkan event sequences and those also need to create dummy registers for RMW. Why is this not a problem in that case?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Override
public List<Event> visitControlBarrier(ControlBarrier e) {
Event entryFence = EventFactory.newControlBarrier(e.getName() + "_entry", e.getId());
entryFence.addTags(Tag.OpenCL.ENTRY_FENCE, Tag.C11.MO_RELEASE);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we discussed about getting rid of this fence tags and treat barriers in OpenCL the same we do for the other languages (IIRC this was only needed for barrier divergence; we can figure out later how to handle that property)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with Hernan that we should treat control barriers in the same way for all models, unless there is a really good reason to do otherwise. But I suggest to merge this commit first and change barriers after that in a separate commit.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I am not wrong, the Tag.OpenCL.ENTRY_FENCE tag is not used anymore, so we can simply remove it from the Tag class and remove the axiom from the cat file (all this is dead code ATM)

@tonghaining tonghaining force-pushed the opencl branch 2 times, most recently from 55e9306 to f813b99 Compare January 27, 2025 10:48
@tonghaining tonghaining force-pushed the opencl branch 2 times, most recently from adec207 to cd3219c Compare January 30, 2025 16:54
@tonghaining tonghaining force-pushed the opencl branch 2 times, most recently from 5ddbf02 to 6b10bf4 Compare February 12, 2025 10:28
@ThomasHaas
Copy link
Collaborator

[...] there will be a weird loop with two backward jumps, and I wonder how it is unrolled.

We normalize such loops first:

  • Only the last backward jump remains (if it is unconditional, otherwise we introduce a new one).
  • Every other backward jump is transformed to a forward jump that jumps to the last backward jump.

@hernanponcedeleon
Copy link
Owner

@tonghaining please rebase and fix the conflicts

@tonghaining tonghaining force-pushed the opencl branch 2 times, most recently from 3163cb4 to 2c93434 Compare February 19, 2025 09:41
@tonghaining tonghaining force-pushed the opencl branch 2 times, most recently from 6e0eb25 to 232d7d5 Compare February 20, 2025 10:49
Copy link
Owner

@hernanponcedeleon hernanponcedeleon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First pass up to VisitorExtensionOpenClStd

Comment on lines 38 to 40
if (!(element instanceof ConstructExpr)) {
throw new ParsingException(String.format("Element is not a ConstructExpr at index: %d for: %s", index, id));
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't this the same check as line 29?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, line 29 removed.

@Override
public List<Event> visitControlBarrier(ControlBarrier e) {
Event entryFence = EventFactory.newControlBarrier(e.getName() + "_entry", e.getId());
entryFence.addTags(Tag.OpenCL.ENTRY_FENCE, Tag.C11.MO_RELEASE);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I am not wrong, the Tag.OpenCL.ENTRY_FENCE tag is not used anymore, so we can simply remove it from the Tag class and remove the axiom from the cat file (all this is dead code ATM)

Comment on lines 32 to 33
final Map<String, Function> forwardFunctions = new HashMap<>();
final Map<String, Set<FunctionCall>> forwardCalls = new HashMap<>();

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are those 2 not private?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated to private.

Comment on lines 105 to 108
if (type instanceof ArrayType aType) {
Type elementType = HelperTypes.getMemberType(id, aType, List.of(0));
return 1 + getFirstElementDepth(id, elementType);
} else if (type instanceof AggregateType agType) {
Type elementType = HelperTypes.getMemberType(id, agType, List.of(0));
return 1 + getFirstElementDepth(id, elementType);
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we merge the two bodies or would the type checker complain about getMemberType?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Merged

return 0;
}

private void createExternalVariable(String id, Type type) {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I cannot really follow what is going on with createExternalVariable, addExternalValue, createPointedExternalValue, addExternalParameterLocalEvent and addParameterRegisters. Why do we need all this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They are used to create external variables which entry function pointers may point to (which are not declared by OpVariable as our earlier tests).
I update their name as:

createExternalVariable -> createParameterVariable
addExternalValue -> addParameterVariable
createPointedExternalValue -> createExternalVariable
addExternalParameterLocalEvent -> addEntryParameterLocalEvent

Would it be better as this?

@@ -191,8 +210,9 @@ public String getPointerStorageClass(String id) {

public Register addRegister(String id, String typeId) {
Type type = getType(typeId);
if (type instanceof ScopedPointerType) {
throw new ParsingException("Register cannot be a pointer");
// TODO: Remove this check when tuple registers are supported

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tuple registers are supported, it is just that we do not create single memory events for them.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment updated

Comment on lines 152 to 175
public void updateEntry(Event event){
if (entry == null) {
append(event);
} else {
Event originalEntry = entry;
entry = event;
entry.setFunction(this);
event.setSuccessor(originalEntry);
originalEntry.setPredecessor(event);
}
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should probably be symmetric to updateExit() as it is the case in #803

Comment on lines 201 to 205
if (loopBegin.getPredecessor() == null) {
Event newEntry = copies.get(0);
loopBegin.getFunction().updateEntry(newEntry);
newEntry.insertAfter(copies.subList(1, copies.size()));
} else {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need this special handling?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Due to the lack of insertBefore I think. I think its best to merge #803 and use the new functions rather than working around the limitations.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah it causes issue when loop from the first event.

final Thread thread = createSPVThreadFromFunction(entryFunction, tid, grid, transformers);
program.addThread(thread);
}
// Remove unused memory objects of the entry function

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By "unused" you mean that a given object X was repladed by e.g., WX and from now one the program will use WX?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given object X in the entry function, it will be replaceds by X0 in thread0, X1 in thread1.., and the entry function objects will not be used anymore.

import com.dat3m.dartagnan.program.Function;
import com.dat3m.dartagnan.program.Program;
import com.dat3m.dartagnan.program.Register;
import com.dat3m.dartagnan.program.processing.transofrmers.MemoryTransformer;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo in transformer

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! updated

public List<Event> visitSpirvLoad(SpirvLoad e) {
String mo = moToOpenCLTag(Tag.Spirv.getMoTag(e.getTags()));
Load load = newLoadWithMo(e.getResultRegister(), e.getAddress(), mo);
load.setFunction(e.getFunction());

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

VisitorSpirvVulkan is translated to Vulkan event sequences and those also need to create dummy registers for RMW. Why is this not a problem in that case?

Signed-off-by: Haining Tong <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants