[ matrix_transpose ] Divide f16 transpose and f32 transpose with NEON #2898

Open: wants to merge 2 commits into main

skykongkong8
Member

  1. No NEON -> matrix_transpose_fallback
  2. NEON without f16 support -> matrix_transpose_neon (f32 SIMD)
  3. NEON with f16 support -> same as 2 for f32, plus matrix_transpose_neon_f16 (f16 SIMD)
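
The three cases above can be sketched as a compile-time dispatch. This is only an illustration, not the code in this PR: it assumes the ACLE feature macros `__ARM_NEON` and `__ARM_FEATURE_FP16_VECTOR_ARITHMETIC` as the switches (the project may use its own build flags), the signatures and the `transpose_f32` wrapper are hypothetical, and the NEON kernels are stand-ins that simply reuse the scalar loop.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical scalar kernel: writes the N x M transpose of a row-major
// M x N input. The signature is an assumption made for this sketch.
template <typename T>
void matrix_transpose_fallback(const T *in, T *out, std::size_t M,
                               std::size_t N) {
  for (std::size_t i = 0; i < M; ++i)
    for (std::size_t j = 0; j < N; ++j)
      out[j * M + i] = in[i * N + j];
}

#if defined(__ARM_NEON)
// Cases 2 and 3: an f32 SIMD kernel exists whenever NEON is available.
// Stand-in body: a real kernel would use NEON intrinsics here.
inline void matrix_transpose_neon(const float *in, float *out, std::size_t M,
                                  std::size_t N) {
  matrix_transpose_fallback(in, out, M, N);
}

#if defined(__ARM_FEATURE_FP16_VECTOR_ARITHMETIC)
// Case 3 only: an additional f16 SIMD kernel. __fp16 is the GCC/Clang
// half-precision storage type on ARM targets.
// Stand-in body: a real kernel would use f16 NEON intrinsics here.
inline void matrix_transpose_neon_f16(const __fp16 *in, __fp16 *out,
                                      std::size_t M, std::size_t N) {
  matrix_transpose_fallback(in, out, M, N);
}
#endif
#endif

// Hypothetical f32 entry point: picks the NEON kernel when it was compiled
// in, otherwise the scalar fallback (case 1).
inline void transpose_f32(const float *in, float *out, std::size_t M,
                          std::size_t N) {
  assert(in != out); // out-of-place transpose only
#if defined(__ARM_NEON)
  matrix_transpose_neon(in, out, M, N);
#else
  matrix_transpose_fallback(in, out, M, N);
#endif
}
```

With this layout, an f32 caller never touches f16 types, so a NEON build without f16 support still gets the f32 SIMD path, which appears to be the point of splitting the two kernels.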

Self evaluation:

  1. Build test: [X]Passed [ ]Failed [ ]Skipped
  2. Run test: [X]Passed [ ]Failed [ ]Skipped

We can run ARM natively with GitHub Actions.
Test fp16 natively on ARM machines!

Signed-off-by: MyungJoo Ham <[email protected]>
@skykongkong8 force-pushed the pr/transpose/dividef16f32 branch from de6101d to 16655ca on January 23, 2025 08:12
1. no NEON -> matrix_transpose_fallback
2. NEON, but without f16 -> matrix_transpose_neon
3. NEON, with f16 -> 2 + matrix_transpose_neon_f16

**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: skykongkong8 <[email protected]>
@skykongkong8 force-pushed the pr/transpose/dividef16f32 branch from 16655ca to d47d6b4 on January 24, 2025 01:54
2 participants