This repository has been archived by the owner on Aug 16, 2024. It is now read-only.

L2-friendly chunking and twiddle persistence for batched NTTs and batched NTT+bitreverse sequences #31

Merged: 11 commits, Feb 13, 2024

@mcarilli (Collaborator) commented Feb 7, 2024

What ❔

Each batched NTT (+bitrev) operation launches a sequence of several kernels. This PR splits a batch into chunks small enough to persist in the L2 cache across kernel launches, i.e., we run the whole NTT (+bitrev) sequence for the first chunk, then for the second chunk, and so on.
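
The chunk-major ordering described above can be sketched on the host side roughly as follows. This is a minimal illustration, not the code from this PR: `cols_per_chunk`, `run_chunked`, the L2 budget parameter, and the column-contiguous layout are all hypothetical, and each `stage` closure merely stands in for one kernel launch of the NTT (+bitrev) sequence.

```rust
/// Hypothetical helper (not from the PR): how many NTT columns fit in one
/// chunk, given an assumed L2 budget in bytes. Always returns at least 1.
fn cols_per_chunk(
    l2_budget_bytes: usize,
    ntt_len: usize,
    elem_bytes: usize,
    batch_cols: usize,
) -> usize {
    let col_bytes = ntt_len * elem_bytes;
    assert!(col_bytes > 0, "ntt_len and elem_bytes must be nonzero");
    // Cap at the batch size, but never go below one column per chunk.
    (l2_budget_bytes / col_bytes).min(batch_cols).max(1)
}

/// Chunk-major execution: run every stage on one chunk before touching the
/// next chunk. Each `stage` stands in for one kernel launch of the sequence.
/// Assumes (for illustration) that columns are stored contiguously in `data`.
fn run_chunked<T>(
    data: &mut [T],
    ntt_len: usize,
    cols: usize,
    stages: &[&dyn Fn(&mut [T])],
) {
    assert!(ntt_len > 0 && cols > 0);
    for chunk in data.chunks_mut(ntt_len * cols) {
        for stage in stages {
            stage(chunk);
        }
    }
}
```

The unchunked alternative would swap the two loops, so that every stage sweeps the full batch before the next stage starts. How large a budget to assume relative to the device's actual L2 size is a tuning question this sketch does not answer.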

Why ❔

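Leveraging L2 persistence this way reduces global memory (gmem) traffic and improves performance: when a chunk stays resident in L2, later kernels in the sequence can read the previous kernel's output from L2 instead of from gmem.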

Checklist

  • PR title corresponds to the body of PR (we generate changelog entries from PRs).
  • Tests for the changes have been added / updated.
  • Documentation comments have been added / updated.
  • Code has been formatted via zk fmt and zk lint.

@mcarilli mcarilli requested a review from robik75 February 9, 2024 16:36
robik75 pushed a commit to matter-labs/era-boojum-cuda that referenced this pull request Feb 13, 2024
Required by matter-labs/era-shivini#31

## Checklist

- [x] PR title corresponds to the body of PR (we generate changelog
entries from PRs).
- [x] Tests for the changes have been added / updated.
- [x] Documentation comments have been added / updated.
- [x] Code has been formatted via `cargo fmt` and `cargo lint`.
@mcarilli mcarilli changed the title from "[WIP] L2-friendly chunking and twiddle persistence for batched NTTs and batched NTT+bitreverse sequences" to "L2-friendly chunking and twiddle persistence for batched NTTs and batched NTT+bitreverse sequences" Feb 13, 2024
@mcarilli mcarilli merged commit f25a855 into main Feb 13, 2024
4 checks passed
@robik75 robik75 deleted the mc-ntt-persistence branch August 6, 2024 12:27