Implement a floating point lookup table #128

n-west · 2017-09-28T19:28:21Z

A couple of lookup tables would be useful. To keep things simple, start with a floating point -> floating point lookup table. This would be a new kernel type that might have the signature volk_32f_x2_s32f_lut_32f(float *output_vector, float *lut, float *input_vector, float max_input, unsigned int num_points).
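To make the proposal a bit more concrete, here is a rough sketch of what a generic (scalar) version could look like. It is not VOLK code and several things are assumptions: the proposed signature carries no table length, so the sketch adds a hypothetical lut_size parameter; inputs in [0, max_input] are assumed to map linearly onto the table; the lookup picks the nearest entry; and out-of-range inputs are clamped. The function name is made up for illustration.

```c
/* Hypothetical generic kernel sketch, not part of VOLK.
 * Assumptions: inputs in [0, max_input] map linearly onto
 * lut[0 .. lut_size-1], nearest entry wins, out-of-range inputs clamp.
 * lut_size is an extra parameter the proposed signature does not have. */
static void lut_32f_generic_sketch(float *output_vector,
                                   const float *lut,
                                   const float *input_vector,
                                   float max_input,
                                   unsigned int lut_size,
                                   unsigned int num_points)
{
    const float top = (float)(lut_size - 1);
    const float scale = top / max_input;

    for (unsigned int i = 0; i < num_points; i++) {
        float pos = input_vector[i] * scale;
        if (!(pos > 0.0f)) {
            pos = 0.0f; /* clamps negatives; NaN also lands here */
        }
        if (pos > top) {
            pos = top;
        }
        /* pos is non-negative, so adding 0.5 and truncating rounds to nearest */
        output_vector[i] = lut[(unsigned int)(pos + 0.5f)];
    }
}
```

With a 5-entry table and max_input = 4.0, an input of 1.2 would read entry 1 and an input of 2.6 would read entry 3.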

This would also need a puppet that tests it on something like a cosine function.

AVX2 introduces a gather instruction that might make a LUT vectorizable (__m256 _mm256_i32gather_ps (float const* base_addr, __m256i vindex, const int scale)). It is currently unknown if vectorizing a LUT would have any advantage over a generic LUT.
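For anyone who wants to experiment, here is a minimal sketch of how the gather could be used for a nearest-entry lookup, with a scalar version for the tail and for CPUs without AVX2. The function names, the lut_size parameter, and the clamping behavior are assumptions, not an existing VOLK kernel. The target attribute and __builtin_cpu_supports are GCC/Clang specifics on x86. One known difference: _mm256_cvtps_epi32 rounds ties to even under the default rounding mode, while the scalar path rounds ties up, so the two can disagree for inputs that land exactly halfway between entries. The sketch says nothing about whether the gather version is actually faster.

```c
#include <immintrin.h>

/* Hypothetical scalar reference: nearest entry, clamped to the table. */
static void lut_nearest_scalar(float *out, const float *lut, const float *in,
                               float max_input, unsigned int lut_size,
                               unsigned int num_points)
{
    const float top = (float)(lut_size - 1);
    const float scale = top / max_input;
    for (unsigned int i = 0; i < num_points; i++) {
        float pos = in[i] * scale;
        if (!(pos > 0.0f)) pos = 0.0f; /* clamps negatives and NaN */
        if (pos > top) pos = top;
        out[i] = lut[(unsigned int)(pos + 0.5f)];
    }
}

/* Hypothetical AVX2 version: 8 lookups per iteration via gather. */
__attribute__((target("avx2")))
static void lut_nearest_avx2(float *out, const float *lut, const float *in,
                             float max_input, unsigned int lut_size,
                             unsigned int num_points)
{
    const float top = (float)(lut_size - 1);
    const __m256 vscale = _mm256_set1_ps(top / max_input);
    const __m256 vzero = _mm256_setzero_ps();
    const __m256 vtop = _mm256_set1_ps(top);
    unsigned int i = 0;

    for (; i + 8 <= num_points; i += 8) {
        __m256 pos = _mm256_mul_ps(_mm256_loadu_ps(in + i), vscale);
        /* max returns the second operand on NaN, so NaN clamps to 0 */
        pos = _mm256_min_ps(_mm256_max_ps(pos, vzero), vtop);
        __m256i idx = _mm256_cvtps_epi32(pos); /* round to nearest index */
        /* scale = 4: indices count floats, the gather wants byte offsets */
        _mm256_storeu_ps(out + i, _mm256_i32gather_ps(lut, idx, 4));
    }
    if (i < num_points) {
        lut_nearest_scalar(out + i, lut, in + i, max_input, lut_size,
                           num_points - i);
    }
}

/* Pick the AVX2 path only when the CPU reports support for it. */
static void lut_nearest_dispatch(float *out, const float *lut, const float *in,
                                 float max_input, unsigned int lut_size,
                                 unsigned int num_points)
{
    if (__builtin_cpu_supports("avx2")) {
        lut_nearest_avx2(out, lut, in, max_input, lut_size, num_points);
    } else {
        lut_nearest_scalar(out, lut, in, max_input, lut_size, num_points);
    }
}
```

Timing lut_nearest_dispatch against lut_nearest_scalar over different table sizes would be one way to start answering the question of whether vectorizing pays off.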

Disclaimer: this is a fairly hefty project

marcusmueller · 2017-10-26T16:55:50Z

What should a float->float LUT do? Return a value only if the key is bitwise identical? If the key is identical in terms of == (e.g. -0.0f == 0.0f)? Nearest neighbor? Linear interpolation?
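The gap between the first two options is small but real. The sketch below (helper names are made up for illustration) compares two floats by bit pattern and by ==. Negative zero and positive zero compare equal with == but have different bits, and a NaN has the same bits as itself but never compares equal with ==.

```c
#include <math.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: true if a and b have the same 32-bit pattern. */
static int float_bits_equal(float a, float b)
{
    uint32_t ua, ub;
    memcpy(&ua, &a, sizeof ua); /* memcpy avoids aliasing problems */
    memcpy(&ub, &b, sizeof ub);
    return ua == ub;
}

/* Hypothetical helper: true if a and b are equal in the sense of ==. */
static int float_value_equal(float a, float b)
{
    return a == b;
}
```

So a key match on bits would treat -0.0f and 0.0f as different keys, while a match on == would never find a NaN key.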

I played around with the gather intrinsics, and they seem very useful; if I understood the throughput numbers correctly, they are a real advantage. So that does sound interesting, but:

What's the use case that I'd optimize for?

xloem · 2018-10-08T15:50:58Z

Personally, I think a float->float LUT might be most effective with nearest neighbor. In many systems, the granularity of the data is known, so linear interpolation would be overkill.
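To show what the two options trade off, here is a small sketch of a single lookup done both ways. The function names, the lut_size parameter, and the linear mapping of [0, max_input] onto the table are assumptions for illustration. When inputs always land on table entries the two agree; between entries, nearest neighbor snaps to one entry and linear interpolation blends the two surrounding ones at the cost of an extra load and a multiply-add.

```c
/* Hypothetical helper: map x in [0, max_input] to a clamped table position. */
static float lut_position(float x, float max_input, unsigned int lut_size)
{
    const float top = (float)(lut_size - 1);
    float pos = x * (top / max_input);
    if (!(pos > 0.0f)) pos = 0.0f; /* clamps negatives and NaN */
    if (pos > top) pos = top;
    return pos;
}

/* Nearest neighbor: return the closest table entry. */
static float lookup_nearest(const float *lut, unsigned int lut_size,
                            float max_input, float x)
{
    float pos = lut_position(x, max_input, lut_size);
    return lut[(unsigned int)(pos + 0.5f)];
}

/* Linear interpolation: blend the two entries around the position. */
static float lookup_linear(const float *lut, unsigned int lut_size,
                           float max_input, float x)
{
    float pos = lut_position(x, max_input, lut_size);
    unsigned int i0 = (unsigned int)pos;
    if (i0 >= lut_size - 1) {
        return lut[lut_size - 1]; /* at the last entry, nothing to blend */
    }
    float frac = pos - (float)i0;
    return lut[i0] + frac * (lut[i0 + 1] - lut[i0]);
}
```

For a table of squares {0, 1, 4, 9, 16} with max_input = 4.0, a query at 1.5 gives 4.0 with nearest neighbor and 2.5 with linear interpolation, against a true value of 2.25; a query at 2.0 gives 4.0 either way.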

michaelld · 2019-11-14T01:22:15Z

LUTs can be very useful, and the various gather intrinsics can be really useful too for various lookups. It would be interesting to see the speed difference in using a generic kernel versus one using the gather. Let's keep this issue around even if nobody will be getting to it any time soon, as a reminder to us that it would be interesting to investigate some day.

n-west added the hacktoberfest label Sep 28, 2017

ast mentioned this issue Dec 26, 2018

volk_16i_max_star_horizontal_16i_a_neonasm.s #222

Closed

michaelld added Enhancement new kernel entirely or for some specific ARCH Low Low Priority and removed hacktoberfest labels Nov 14, 2019

mgarrett1955 mentioned this issue Jan 15, 2021

volk cross build on Xilinx e3xx fails due to incorrect processor flags #436

Open
