Add optional AVX512-FP16 arithmetic for the scalar quantizer. #4225
PR #4025 introduced a new architecture mode, `avx512_spr`, which enables the use of features available since Intel® Sapphire Rapids. The Hamming Distance Optimization (PR #4020), based on this mode, is now used by OpenSearch to speed up the indexing and searching of binary vectors.

This PR adds support for `AVX512-FP16` arithmetic for the Scalar Quantizer. It introduces a new Boolean flag, `ENABLE_AVX512_FP16`, which, when used together with the `avx512_spr` mode, explicitly enables `avx512fp16` arithmetic.

Tests on an AWS r7i instance demonstrate up to a 1.6x speedup in execution time when using `AVX512-FP16` compared to `AVX512`. The improvement comes from a reduction in path length.

Build options for the two configurations compared:

`AVX512`: `-DFAISS_OPT_LEVEL=avx512`

`AVX512-FP16`: `-DENABLE_AVX512_FP16 -DFAISS_OPT_LEVEL=avx512_spr`
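To illustrate where a path-length reduction can come from, here is a minimal sketch of a squared-L2 kernel over half-precision data. It is not the code in this PR: the function name `l2sqr_fp16` and the `half_t` alias are hypothetical, and the scalar branch (which uses `float` as a stand-in storage type) exists only so the sketch compiles on machines without `AVX512-FP16`. With plain `AVX512`, fp16 inputs would typically be widened to fp32 (16 lanes per register) before the arithmetic; with `AVX512-FP16` the subtract and fused multiply-add operate directly on 32 fp16 lanes, so fewer instructions are executed per element.

```cpp
#include <cassert>
#include <cstddef>

#if defined(__AVX512FP16__)
#include <immintrin.h>

// Hypothetical alias: native half-precision storage when FP16 is available.
using half_t = _Float16;

// Squared L2 distance computed entirely in fp16 arithmetic.
// Each iteration handles 32 elements with no fp16 -> fp32 conversion step.
inline float l2sqr_fp16(const half_t* a, const half_t* b, std::size_t n) {
    __m512h acc = _mm512_setzero_ph();
    std::size_t i = 0;
    for (; i + 32 <= n; i += 32) {
        __m512h va = _mm512_loadu_ph(a + i);
        __m512h vb = _mm512_loadu_ph(b + i);
        __m512h d = _mm512_sub_ph(va, vb);
        acc = _mm512_fmadd_ph(d, d, acc); // acc += d * d
    }
    float sum = static_cast<float>(_mm512_reduce_add_ph(acc));
    // Scalar tail for the remaining (n % 32) elements.
    for (; i < n; ++i) {
        float d = static_cast<float>(a[i]) - static_cast<float>(b[i]);
        sum += d * d;
    }
    return sum;
}

#else

// Fallback for illustration only: float stands in for the half-precision type.
using half_t = float;

inline float l2sqr_fp16(const half_t* a, const half_t* b, std::size_t n) {
    float sum = 0.0f;
    for (std::size_t i = 0; i < n; ++i) {
        float d = a[i] - b[i];
        sum += d * d;
    }
    return sum;
}

#endif
```

Accumulating in fp16 trades precision for throughput, which is one reason to keep such a path behind an explicit opt-in flag rather than enabling it implicitly.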