Integrating GPU based Vector Search using cuVS #14131

chatman · 2025-01-10T15:27:51Z

Description

NVIDIA's cuVS library (https://github.com/rapidsai/cuvs) is a state-of-the-art vector search library that supports fast indexing and search on GPUs.
This pull request integrates the library into the Lucene sandbox through a custom KnnVectorsFormat and Codec.
cuVS is a C library. A Java API is in progress (to be released soon): "Initial cut for a cuVS Java API" (rapidsai/cuvs#450). It uses Project Panama for integration.

This is an in-progress PR at the moment. Here is a way to test it out:

Clone the cuvs repository from the PR branch.
./build.sh libcuvs && ./build.sh java
(The above installs the cuvs-java artifacts in the local Maven repository.)
Compile and use this branch in an IDE.

TODO:

TestCuVS works from an IDE, but not via Gradle (some native access security issues).
Make this branch work with the released version of cuvs-java, once it is released.
Add more tests.
Publish benchmarks.

This work is mainly done by @narangvivek10, @punAhuja and me, along with help from @cjnolet.

chatman · 2025-01-10T15:31:30Z

FYI @uschindler, @ChrisHegarty, @dsmiley, @msokolov

dweiss · 2025-01-10T21:12:51Z

build-tools/build-infra/build.gradle

@@ -22,6 +22,7 @@ plugins {
 }

 repositories {
+  mavenLocal()


Remove mavenLocal before merging, if it happens. There will be issues with it - some are very cryptic and hard to diagnose (like different artifact hashes). It's going to be a major headache if it's left in.

navneet1v · 2025-01-10T23:27:12Z

@chatman thanks for creating the PR. This looks very interesting. Is the idea here that the Lucene library will run on a GPU machine and use cuVS?

navneet1v · 2025-01-10T23:35:21Z

lucene/sandbox/src/java/org/apache/lucene/sandbox/vectorsearch/CuVSVectorsWriter.java

+            .withNumWriterThreads(cuvsWriterThreads)
+            .withIntermediateGraphDegree(intGraphDegree)
+            .withGraphDegree(graphDegree)
+            .withCagraGraphBuildAlgo(CagraGraphBuildAlgo.NN_DESCENT)


I have some experience building CAGRA indexes, and I think NN_DESCENT is faster for CAGRA index creation but has a high GPU memory footprint. Should we use IVF_PQ here? Or could we take a hybrid approach: use NN_DESCENT when the doc count is below a specific threshold, and IVF_PQ otherwise?
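
A rough sketch of that hybrid selection, for illustration only. The `BuildAlgo` enum is a local stand-in for the `CagraGraphBuildAlgo` values referenced in the diff (it is not the cuvs-java type), and the class name, method name, and threshold value are all hypothetical. A real cutoff would have to come from benchmarks on the target GPU.

```java
/**
 * Sketch: pick a CAGRA graph build algorithm from the document count.
 * Everything here is an assumption made for illustration, not cuvs-java API.
 */
final class BuildAlgoChooser {

  /** Local stand-in for the CagraGraphBuildAlgo values mentioned in the diff. */
  enum BuildAlgo { NN_DESCENT, IVF_PQ }

  /** Hypothetical cutoff; not a measured or recommended value. */
  static final int DEFAULT_NN_DESCENT_MAX_DOCS = 1_000_000;

  /**
   * Returns NN_DESCENT when docCount is strictly below the threshold
   * (faster build, higher GPU memory use per the comment above),
   * and IVF_PQ otherwise.
   */
  static BuildAlgo choose(int docCount, int nnDescentMaxDocs) {
    if (docCount < 0 || nnDescentMaxDocs < 0) {
      throw new IllegalArgumentException("counts must be non-negative");
    }
    return docCount < nnDescentMaxDocs ? BuildAlgo.NN_DESCENT : BuildAlgo.IVF_PQ;
  }

  public static void main(String[] args) {
    // Small segment: below the cutoff, so NN_DESCENT is selected.
    System.out.println(choose(50_000, DEFAULT_NN_DESCENT_MAX_DOCS));
    // Large segment: at or above the cutoff, so IVF_PQ is selected.
    System.out.println(choose(5_000_000, DEFAULT_NN_DESCENT_MAX_DOCS));
  }
}
```

The result of `choose(...)` would then be mapped to the corresponding `CagraGraphBuildAlgo` value and passed to `withCagraGraphBuildAlgo(...)` in the writer, with the threshold exposed as a format parameter alongside `graphDegree` and `intGraphDegree`.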

Ishan Chattopadhyaya and others added 4 commits January 7, 2025 21:17

Initial cut of CuVS into Lucene as a Codec in sandbox

b8a1162

Test fixes

0e9f6d4

fix for getFloatVectorValues

a95f084

Fixing precommit, ECJ, Rat, spotless, forbiddenApis etc.

9f0d3dd

dweiss reviewed Jan 10, 2025

navneet1v reviewed Jan 10, 2025
