-
Notifications
You must be signed in to change notification settings - Fork 415
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix NATIVE_FP16 macro in our fork of half.h #7449
Conversation
🔗 Helpful Links🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/7449
Note: Links to docs will display an error until the docs builds have been completed. ⏳ No Failures, 8 PendingAs of commit fc44779 with merge base 39e8538 (): This comment was automatically generated by Dr. CI and updates every 15 minutes. |
ping |
31cc29b
to
e4fbfbf
Compare
runtime/core/portable_type/half.h
Outdated
@@ -15,7 +15,7 @@ | |||
#include <ostream> | |||
|
|||
#if defined(__GNUC__) || defined(__clang__) | |||
#if defined(__aarch64__) | |||
#if defined(__aarch64__) && defined(__ARM_FEATURE_FP16) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is that the correct definition actually?
I can see __ARM_FEATURE_FP16_SCALAR_ARITHMETIC, __ARM_FEATURE_FP16_FML and __ARM_FEATURE_FP16_VECTOR_ARITHMETIC in https://developer.arm.com/documentation/101028/0012/5--Feature-test-macros
But I don't see a standalone __ARM_FEATURE_FP16 macros in the document.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If it's actually a typo, it would be good to have a regression test somewhere to catch these types of errors.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
regression test
I guess we could disassemble half_test.o and make sure it contains the relevant aarch64 instructions. Should probably just do that in upstream PyTorch though; after #7040 is finally ready to go, this file will be powered by c10/util/Half.h anyway.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I probably wanted __ARM_FEATURE_FP16_SCALAR_ARITHMETIC
, for the record.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I manually disassembled half_test.cpp.o and confirmed that 1) it wasn't the same as baseline before I fixed the defined check to __ARM_FEATURE_FP16_SCALAR_ARITHMETIC
2) it is now the same as baseline
Simple build unbreak for Raspberry Pi 5 -- fp16 is an optional aarch64 feature.
e4fbfbf
to
fc44779
Compare
Simple build unbreak for Raspberry Pi 5 -- fp16 is an optional aarch64 feature.