Improve MulEven performance for RVV #2334

Open
wants to merge 1 commit into
base: master

lsrcz
Contributor

@lsrcz lsrcz commented Sep 27, 2024

This pull request improves the performance of MulEven by using masked multiplication instead of merging the results after computing the higher and lower parts.
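To illustrate the difference, here is a minimal scalar sketch of the two strategies for 8-bit lanes. It is only a model, not the actual RVV code in this PR: `Vec`, `kLanes`, `MulLo`, `MulHi`, `MulEvenMerge`, and `MulEvenMasked` are hypothetical names, and the layout assumes that each double-width result occupies an even/odd lane pair with the low half in the even lane.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical fixed-size stand-in for a vector of 8-bit lanes.
constexpr std::size_t kLanes = 8;
using Vec = std::array<std::uint8_t, kLanes>;

// Low 8 bits of the 16-bit product.
inline std::uint8_t MulLo(std::uint8_t a, std::uint8_t b) {
  return static_cast<std::uint8_t>((static_cast<unsigned>(a) * b) & 0xFFu);
}

// High 8 bits of the 16-bit product.
inline std::uint8_t MulHi(std::uint8_t a, std::uint8_t b) {
  return static_cast<std::uint8_t>((static_cast<unsigned>(a) * b) >> 8);
}

// Baseline model: compute full low and high vectors, then merge them.
inline Vec MulEvenMerge(const Vec& a, const Vec& b) {
  Vec lo{}, hi{}, out{};
  for (std::size_t i = 0; i < kLanes; ++i) {
    lo[i] = MulLo(a[i], b[i]);
    hi[i] = MulHi(a[i], b[i]);
  }
  // Separate merge step: even lanes take lo, odd lanes take the high half
  // of the preceding even lane.
  for (std::size_t i = 0; i < kLanes; ++i) {
    out[i] = (i % 2 == 0) ? lo[i] : hi[i - 1];
  }
  return out;
}

// Masked model: the low multiply writes only the even lanes, so the merge
// is folded into the multiplication itself.
inline Vec MulEvenMasked(const Vec& a, const Vec& b) {
  Vec out{};
  // Every lane starts with the high half of its even neighbor's product.
  for (std::size_t i = 0; i < kLanes; ++i) {
    const std::size_t even = i & ~std::size_t{1};
    out[i] = MulHi(a[even], b[even]);
  }
  // Masked multiply: only even lanes are overwritten with the low half.
  for (std::size_t i = 0; i < kLanes; i += 2) {
    out[i] = MulLo(a[i], b[i]);
  }
  return out;
}
```

Both functions produce the same lanes; the masked variant simply has no standalone merge step, which is the saving described above.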

A similar optimization can be applied to the MulOdd operator, though this would require a MaskedMulHighOr operator, which is currently unavailable.

For 8-bit elements, MulEven currently uses the immediate value 0x5555, which takes two instructions to construct on RISC-V. A possible optimization is to construct the mask manually instead of relying on Dup128MaskFromMaskBits, which would allow the smaller immediate 0x55. However, the saving is only a single scalar instruction, so I am unsure whether it is worth doing.
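The arithmetic behind the two immediates can be sketched as follows. This is an illustrative model only: `FitsInImm12`, `EvenLaneMaskBits`, and `ReplicateByte` are hypothetical helpers, and the sketch assumes one mask bit per lane with bit 0 corresponding to lane 0. It relies on the fact that a single RISC-V `addi` can materialize only a 12-bit signed immediate.

```cpp
#include <cstdint>

// True if v fits the 12-bit signed immediate of a single RISC-V addi.
constexpr bool FitsInImm12(std::int64_t v) {
  return v >= -2048 && v <= 2047;
}

// Mask bits with one bit per lane, set for even lanes only.
constexpr std::uint64_t EvenLaneMaskBits(unsigned lanes) {
  std::uint64_t bits = 0;
  for (unsigned i = 0; i < lanes; i += 2) {
    bits |= std::uint64_t{1} << i;
  }
  return bits;
}

// Repeats one byte num_bytes times (num_bytes <= 8), modeling a mask that
// is built by splatting a byte-sized pattern.
constexpr std::uint64_t ReplicateByte(std::uint8_t byte, unsigned num_bytes) {
  std::uint64_t bits = 0;
  for (unsigned i = 0; i < num_bytes; ++i) {
    bits |= std::uint64_t{byte} << (8 * i);
  }
  return bits;
}
```

For the 16 lanes of a 128-bit block of 8-bit elements, `EvenLaneMaskBits(16)` is 0x5555, which does not fit in 12 bits, whereas the byte pattern 0x55 does, and replicating it across two bytes yields the same 0x5555.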

Member

@jan-wassenberg jan-wassenberg left a comment

Nice, looks like we're fusing OddEven into the computation. Thanks for sending the PR!
Was this one also found automatically?

I would expect the scalar instructions to be fine; the scalar pipes might be running ahead and/or idle.

@lsrcz
Contributor Author

lsrcz commented Sep 30, 2024

Yes, a synthesizer finds this automatically. I have no idea why the CI fails, though; the failures do not seem to come from this pull request.

For MulOdd, do you think it is worthwhile to add a masked MulHigh and apply the same optimization?

@jan-wassenberg
Member

Nice. I agree CI failures are unrelated and will fix some of that shortly.

I think MulOdd would be rarely used, so let's leave it as-is for now :)

2 participants