Implement a floating point lookup table #128

Open
n-west opened this issue Sep 28, 2017 · 3 comments
Labels
Enhancement (new kernel entirely or for some specific ARCH), Low (Low Priority)

Comments

@n-west
Member

n-west commented Sep 28, 2017

A couple of lookup tables would be useful. To keep things simple, start with a floating point -> floating point lookup table. This would be a new kernel type that might have the signature volk_32f_x2_s32f_lut_32f(float *output_vector, float *lut, float *input_vector, float max_input, unsigned int num_points).
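To make the proposal concrete, here is a minimal sketch of what a generic (scalar) version could look like. It is not a VOLK kernel and the semantics are assumptions: the table is taken to sample a function uniformly on [0, max_input], lookups use nearest neighbor, and out-of-range inputs are clamped. The function name and the lut_size parameter are hypothetical; the signature proposed above carries no table length, so one had to be added for the sketch to be well defined.

```c
/* Hypothetical sketch, not part of VOLK.
 * Assumes lut[0] corresponds to input 0 and lut[lut_size - 1] to max_input,
 * with uniform spacing in between. lut_size is an added, assumed parameter. */
static void lut_32f_generic_sketch(float *output_vector,
                                   const float *lut,
                                   unsigned int lut_size,
                                   const float *input_vector,
                                   float max_input,
                                   unsigned int num_points)
{
    /* Maps an input value to a fractional table position. */
    const float scale = (float)(lut_size - 1) / max_input;

    for (unsigned int i = 0; i < num_points; ++i) {
        /* Adding 0.5 before truncation rounds to the nearest entry. */
        const float pos = input_vector[i] * scale + 0.5f;
        unsigned int idx;

        if (!(pos > 0.0f)) {
            idx = 0; /* negative inputs and NaN clamp to the first entry */
        } else if (pos >= (float)lut_size) {
            idx = lut_size - 1; /* inputs above max_input clamp to the last */
        } else {
            idx = (unsigned int)pos;
        }
        output_vector[i] = lut[idx];
    }
}
```

Clamping is only one possible policy for out-of-range inputs; a real kernel might instead document them as undefined behavior to save the branches.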

This would also need a puppet that tests it on something like a cosine function.

AVX2 introduces a gather instruction that might make a LUT vectorizable (__m256 _mm256_i32gather_ps (float const* base_addr, __m256i vindex, const int scale)). It is not yet known whether a gather-based LUT would have any advantage over a generic one.
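As a rough illustration of how the gather intrinsic could be used, the sketch below computes eight table indices at a time and fetches the entries with _mm256_i32gather_ps. It makes the same assumptions as a simple scalar version would (uniform table on [0, max_input], nearest neighbor, clamping), and the function names and the lut_size parameter are hypothetical. The target attribute is GCC/Clang specific and stands in for building with -mavx2; whether this beats a scalar loop is exactly the open question and is not claimed here.

```c
#if defined(__x86_64__) || defined(__i386__)
#include <immintrin.h>

/* Scalar index computation used for the loop tail: round to the nearest
 * entry and clamp to [0, lut_size - 1]. NaN maps to index 0. */
static unsigned int lut_index_scalar(float x, float scale, unsigned int lut_size)
{
    const float pos = x * scale + 0.5f;
    if (!(pos > 0.0f)) return 0;
    if (pos >= (float)lut_size) return lut_size - 1;
    return (unsigned int)pos;
}

/* Hypothetical sketch, not part of VOLK. lut_size is an assumed parameter. */
__attribute__((target("avx2")))
static void lut_32f_gather_sketch(float *output_vector,
                                  const float *lut,
                                  unsigned int lut_size,
                                  const float *input_vector,
                                  float max_input,
                                  unsigned int num_points)
{
    const float scale_f = (float)(lut_size - 1) / max_input;
    const __m256 scale = _mm256_set1_ps(scale_f);
    const __m256 half = _mm256_set1_ps(0.5f);
    const __m256 lo = _mm256_setzero_ps();
    const __m256 hi = _mm256_set1_ps((float)(lut_size - 1));
    unsigned int i = 0;

    for (; i + 8 <= num_points; i += 8) {
        const __m256 x = _mm256_loadu_ps(input_vector + i);
        __m256 pos = _mm256_add_ps(_mm256_mul_ps(x, scale), half);
        /* Clamp in float before converting, so huge values and NaN cannot
         * produce an invalid integer index. max_ps returns its second
         * operand when the first is NaN, so NaN becomes 0. */
        pos = _mm256_min_ps(_mm256_max_ps(pos, lo), hi);
        const __m256i idx = _mm256_cvttps_epi32(pos); /* truncate to int32 */
        /* scale = 4: indices are in units of sizeof(float). */
        _mm256_storeu_ps(output_vector + i, _mm256_i32gather_ps(lut, idx, 4));
    }
    /* Remaining points (fewer than 8) fall back to the scalar path. */
    for (; i < num_points; ++i) {
        output_vector[i] = lut[lut_index_scalar(input_vector[i], scale_f, lut_size)];
    }
}
#endif
```

The caller is responsible for checking that the CPU supports AVX2 before calling the function, for example through whatever dispatch mechanism the surrounding library already has.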

Disclaimer: this is a fairly hefty project

@marcusmueller
Member

What should a float->float LUT do? Return a value only if the key is bitwise identical? If the key is identical in terms of == (e.g. -0.0f == 0.0f)? Nearest neighbor? Linear interpolation?
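The first two options differ in observable ways, which a small sketch can show. The helper names below are hypothetical. Under IEEE 754, -0.0f and 0.0f compare equal with == but have different bit patterns, while a NaN has a bit pattern identical to itself but never compares equal with ==.

```c
#include <stdint.h>
#include <string.h>

/* Bitwise key match: compares the raw 32-bit representations.
 * memcpy is the portable way to reinterpret a float's bits in C. */
static int keys_match_bitwise(float a, float b)
{
    uint32_t ua, ub;
    memcpy(&ua, &a, sizeof ua);
    memcpy(&ub, &b, sizeof ub);
    return ua == ub;
}

/* Value key match: the floating point == operator. */
static int keys_match_value(float a, float b)
{
    return a == b;
}
```

With these, keys_match_value(-0.0f, 0.0f) is true while keys_match_bitwise(-0.0f, 0.0f) is false, and for a NaN key the two results are reversed.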

I played around with the gather intrinsics. They seem very useful and, if I understood the throughput numbers correctly, a real advantage, so this does sound interesting. But:

What's the use case that I'd optimize for?

@xloem
Contributor

xloem commented Oct 8, 2018

Personally, I think a float->float LUT might be most effective with nearest neighbor. In many systems the granularity of the data is known, so linear interpolation would be overkill.
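The trade-off can be sketched with two single-point lookups on the same table. Everything here is an assumption for illustration: the function names, the lut_size parameter, and a table that samples uniformly on [0, max_input]. When inputs fall exactly on table points, both functions return the same entry, so the interpolation arithmetic buys nothing.

```c
/* Hypothetical sketch: nearest-neighbor lookup with clamping. */
static float lut_nearest(const float *lut, unsigned int lut_size,
                         float max_input, float x)
{
    const float pos = x * ((float)(lut_size - 1) / max_input) + 0.5f;
    if (!(pos > 0.0f)) return lut[0];
    if (pos >= (float)lut_size) return lut[lut_size - 1];
    return lut[(unsigned int)pos];
}

/* Hypothetical sketch: linear interpolation between the two neighboring
 * entries, also with clamping. Costs an extra load and a multiply-add. */
static float lut_linear(const float *lut, unsigned int lut_size,
                        float max_input, float x)
{
    const float pos = x * ((float)(lut_size - 1) / max_input);
    if (!(pos > 0.0f)) return lut[0];
    if (pos >= (float)(lut_size - 1)) return lut[lut_size - 1];

    const unsigned int i0 = (unsigned int)pos; /* lower neighbor */
    const float frac = pos - (float)i0;        /* position between neighbors */
    return lut[i0] + frac * (lut[i0 + 1] - lut[i0]);
}
```

For a table {0, 10, 20, 30} with max_input = 3, an input of 1.0 yields 10 from both functions, while an input of 1.5 yields 20 from lut_nearest (ties round up here) and 15 from lut_linear.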

@michaelld michaelld added Enhancement new kernel entirely or for some specific ARCH Low Low Priority and removed hacktoberfest labels Nov 14, 2019
@michaelld
Contributor

LUTs can be very useful, and the various gather intrinsics can help with many kinds of lookups. It would be interesting to see the speed difference between a generic kernel and one using gather. Let's keep this issue open, even if nobody gets to it any time soon, as a reminder that it would be interesting to investigate some day.

4 participants