Adds Neonv8 protokernel to 'volk_32f_64f_add_64f' #282
Adds proto-kernel with NEONv8 support. Code in toolchains credited to Albin Stigo (@ast). Also special thanks in this commit to everyone that participated in
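For readers who have not seen the kernel, here is a rough sketch of what a NEONv8 (AArch64) version of a float-plus-double add can look like. This is not the code from this PR: the function name `add_32f_64f_sketch` is made up for illustration, and the scalar fallback is only there so the sketch also compiles on non-ARM machines. It assumes the kernel computes `c[i] = (double)a[i] + b[i]`, as the name `volk_32f_64f_add_64f` suggests.

```c
#include <stddef.h>
#if defined(__aarch64__)
#include <arm_neon.h>
#endif

/* Hypothetical name, for illustration only; not the kernel added by this PR. */
static void add_32f_64f_sketch(double* c, const float* a, const double* b, size_t n)
{
    size_t i = 0;
#if defined(__aarch64__)
    /* Process four points per iteration. */
    for (; i + 4 <= n; i += 4) {
        /* Load four floats and widen them to two vectors of two doubles. */
        float32x4_t a32 = vld1q_f32(a + i);
        float64x2_t a_lo = vcvt_f64_f32(vget_low_f32(a32));
        float64x2_t a_hi = vcvt_high_f64_f32(a32);
        /* Load the matching four doubles. */
        float64x2_t b_lo = vld1q_f64(b + i);
        float64x2_t b_hi = vld1q_f64(b + i + 2);
        /* Add in double precision and store. */
        vst1q_f64(c + i, vaddq_f64(a_lo, b_lo));
        vst1q_f64(c + i + 2, vaddq_f64(a_hi, b_hi));
    }
#endif
    /* Scalar tail; also the whole loop on non-AArch64 builds. */
    for (; i < n; ++i) {
        c[i] = (double)a[i] + b[i];
    }
}
```

The widening conversion is the reason this needs ARMv8: double-precision vector types such as `float64x2_t` are only available in the AArch64 Advanced SIMD instruction set.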
Hello friends,
Thanks in advance,
@dmiralles2009 could you test this with
Since this is a converter from float to int and the error says
Hi @jdemel, that is curious because my PR does not modify that portion of the code. I also did a dry run at home and it builds without failures on my machine. I might look over the failing QA in the next few days. Thanks for the response :)
@@ -4,5 +4,6 @@
########################################################################
set(CMAKE_CXX_COMPILER g++)
set(CMAKE_C_COMPILER gcc)
set(CMAKE_CXX_FLAGS "-march=armv8-a -mtune=cortex-a72 -mfpu=neon-fp-armv8 -mfloat-abi=hard" CACHE STRING "" FORCE)
set(CMAKE_CXX_FLAGS "-ffast-math -march=armv8-a -mtune=cortex-a72 -mfpu=neon-fp-armv8 -mfloat-abi=hard" CACHE STRING "" FORCE)
Recently we had a discussion on -ffast-math, and my understanding is that we should not use it because it breaks things.
Yes, if I remember correctly, we agreed that it might break things because it allows optimizations that are not IEEE-754 compliant, and is therefore not a good default.
On the other hand, NEONv7 is also not strictly IEEE-754 compliant, and all tests passed for me with fast-math enabled.
But for now I think it's better to leave it out!
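To make the concern concrete, here is a small sketch (not from this PR; the function names are made up) of one thing IEEE-754 guarantees that fast-math gives up. GCC's `-ffast-math` turns on `-fassociative-math`, which lets the compiler regroup floating-point additions. Regrouping is not value-preserving, as the two functions below show. The `volatile` temporaries pin the evaluation order so the difference is visible regardless of compiler flags.

```c
/* Hypothetical helpers, for illustration only. */

/* Computes (a + b) + c, rounding the intermediate sum to float. */
static float sum_left(float a, float b, float c)
{
    volatile float t = a + b;
    return t + c;
}

/* Computes a + (b + c), rounding the intermediate sum to float. */
static float sum_right(float a, float b, float c)
{
    volatile float t = b + c;
    return a + t;
}
```

With `a = 1e20f`, `b = -1e20f`, `c = 1.0f`, `sum_left` returns `1.0f` while `sum_right` returns `0.0f`, because `1.0f` is absorbed when added to `-1e20f`. Under fast-math the compiler is free to pick either grouping for a plain `a + b + c`, which is why QA results can change even when the kernel source does not.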
Any further updates to this PR, or do y'all think it is ready for review?
Please leave out the fast math flag for this specific PR. Otherwise, looks OK to me.
Hi all, I have been busy with work but will look at this closely during the weekend. Thanks for the replies :)
Updating requests from PR
LGTM
LGTM now
CI checks pass. Multiple positive reviews. Merging.
Adds Neonv8 protokernel to volk_32f_64f_add_64f. Code in toolchains credited to Albin
Stigo (@ast). Execution improvement shown below