streamk atomics fix #632

Merged
xiaohuguo2023 merged 2 commits into main_perf from atomics_fix
Aug 19, 2024

xiaohuguo2023 (Member) commented Aug 19, 2024

The streamk GEMM kernel uses a spin lock to implement a multiple-buffer method that replaces atomic_add.

PR 4431 causes a data race when atomic_xchg and atomic_cas are used together to implement the [spin lock](https://github.com/ROCm/triton/blob/624335ff569562d5db26bea337e3c6de2bd6b0dc/python/perf-kernels/streamk/streamk_kernel.py#L173C12-L205C1): atomic_cas uses shared memory, but atomic_xchg doesn't.

In Triton, atomic operations are performed at the block level, and each block can consist of multiple waves. The added synchronization ensures that the other waves in the block wait until the wave performing the atomic operation has finished.
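For readers unfamiliar with the pattern, here is a rough CPU-side analogy of the lock handoff in plain Python threads. It is not the kernel code: `AtomicInt`, `run_handoff`, `producer`, and `consumer` are hypothetical names, and a mutex stands in for the hardware atomics. It only sketches the ordering the fix is meant to preserve: each producer writes its partial result into its own buffer, publishes it with an exchange, and the consumer spins on a compare-and-swap until the flag is set before reading.

```python
import threading
import time


class AtomicInt:
    """Hypothetical stand-in for a device-side lock word (mutex-backed)."""

    def __init__(self, value=0):
        self._value = value
        self._mutex = threading.Lock()

    def cas(self, expected, new):
        # Compare-and-swap: store `new` only if the current value equals
        # `expected`; always return the value seen before the operation.
        with self._mutex:
            old = self._value
            if old == expected:
                self._value = new
            return old

    def xchg(self, new):
        # Unconditional exchange: store `new` and return the previous value.
        with self._mutex:
            old = self._value
            self._value = new
            return old


def run_handoff(num_producers=4):
    locks = [AtomicInt(0) for _ in range(num_producers)]
    partials = [None] * num_producers  # one buffer per producer
    result = []

    def producer(pid):
        partials[pid] = float(pid + 1)  # 1) write the partial result
        locks[pid].xchg(1)              # 2) then publish it via the flag

    def consumer():
        acc = 0.0
        for pid in range(num_producers):
            # Spin until the producer has set its flag to 1.
            while locks[pid].cas(1, 1) != 1:
                time.sleep(0)  # yield to other threads while waiting
            acc += partials[pid]  # safe to read: the flag was set after the write
        result.append(acc)

    threads = [threading.Thread(target=producer, args=(p,))
               for p in range(num_producers)]
    threads.append(threading.Thread(target=consumer))
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return result[0]
```

The analogy only holds if the write to the buffer is visible before the flag flips and the read happens after the spin loop exits. On the GPU that ordering has to hold across all waves of a block, which is what the added synchronization is for.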

xiaohuguo2023 merged commit 177d0bd into main_perf Aug 19, 2024
4 checks passed
xiaohuguo2023 deleted the atomics_fix branch August 19, 2024 22:20
2 participants