From a quick look, Auryn doesn't seem to be doing anything particularly different to Brian except that they are explicitly managing SIMD instructions, e.g. in this file. I thought that Brian should also be using them the way we had set things up, but perhaps not as efficiently as I think?
How big are the differences we are currently talking about? Has there been a recent benchmark comparison? Since I have little time to improve Auryn these days, most of its performance improvements are due to updates to the GNU C++ compiler.
I think the differences are not huge (not orders of magnitude) but if it would be easy for us to just throw in a few SIMD instructions here and there or make sure a few arrays are allocated in a properly aligned way, it might be that we could get a speed boost at relatively low effort.
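To make the suggestion concrete, here is a minimal C++ sketch of the two ingredients mentioned above: an aligned allocation and a simple loop that a compiler can auto-vectorize. It is not taken from Brian or Auryn; the names `alloc_aligned_floats`, `decay`, and `is_aligned` are hypothetical, the 32-byte alignment assumes AVX-width registers, and `std::aligned_alloc` assumes C++17. Whether SIMD instructions are actually emitted depends on the compiler and flags (for GCC, something like `-O3 -march=native`).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Hypothetical helper: allocate n floats on a 32-byte boundary
// (assumption: 32 bytes matches the width of an AVX register).
// The caller releases the memory with std::free.
float* alloc_aligned_floats(std::size_t n) {
    constexpr std::size_t alignment = 32;
    // std::aligned_alloc requires the size to be a multiple of the alignment,
    // so round the byte count up.
    std::size_t bytes = ((n * sizeof(float) + alignment - 1) / alignment) * alignment;
    return static_cast<float*>(std::aligned_alloc(alignment, bytes));
}

// Hypothetical state update: scale every element by a constant factor.
// A plain loop like this is a typical candidate for auto-vectorization.
void decay(float* v, std::size_t n, float factor) {
    assert(v != nullptr);
    for (std::size_t i = 0; i < n; ++i) {
        v[i] *= factor;
    }
}

// Check whether a pointer sits on the requested byte boundary.
bool is_aligned(const void* p, std::size_t alignment) {
    return reinterpret_cast<std::uintptr_t>(p) % alignment == 0;
}
```

Checking the generated assembly, or GCC's `-fopt-info-vec` report, would show whether a given loop was vectorized.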
That was my impression, but someone sent me some (unpublished) numbers where they carefully controlled compiler version etc., and Auryn was still quite a bit faster than Brian for at least some models. Having had a look at the code, there doesn't seem to be anything super clever you're doing differently from Brian, other than the SIMD stuff, so I'm a bit baffled as to where the speed difference comes from!
@fzenke any comment on this?