Integrating GPU based Vector Search using cuVS #14131

Open
wants to merge 4 commits into
base: main

Conversation

chatman
Contributor

@chatman chatman commented Jan 10, 2025

Description

  • NVIDIA's cuVS library (https://github.com/rapidsai/cuvs) is a state-of-the-art vector search library. It supports fast indexing and search using GPUs.
  • This pull request integrates the library into the Lucene sandbox through a custom KnnVectorFormat and Codec.
  • The cuVS library is a C library. There is an in-progress Java API (to be released soon): Initial cut for a cuVS Java API rapidsai/cuvs#450. It uses Project Panama for integration.

This is an in-progress PR at the moment. Here is a way to test it out:

  • Clone the cuvs repository from the PR branch.
  • ./build.sh libcuvs && ./build.sh java
  • (The above will install the cuvs-java artifacts in the local Maven repository.)
  • Compile and use this branch in an IDE.

TODO:

  • TestCuVS works via an IDE, but not via Gradle (some native access security issues).
  • Make this branch work with the released version of cuvs-java, once it is released.
  • Add more tests.
  • Publish benchmarks.

This work is mainly done by @narangvivek10, @punAhuja and me, along with help from @cjnolet.

@chatman
Contributor Author

chatman commented Jan 10, 2025

@@ -22,6 +22,7 @@ plugins {
}

repositories {
mavenLocal()
Contributor


Remove mavenLocal before merging, if it happens. There will be issues with it - some are very cryptic and hard to diagnose (like different artifact hashes). It's going to be a major headache if it's left in.

@navneet1v
Contributor

@chatman thanks for creating the PR. This looks very interesting. Is the idea here that the Lucene library will run on a GPU machine and use cuVS?

.withNumWriterThreads(cuvsWriterThreads)
.withIntermediateGraphDegree(intGraphDegree)
.withGraphDegree(graphDegree)
.withCagraGraphBuildAlgo(CagraGraphBuildAlgo.NN_DESCENT)
Contributor


I have some experience with building CAGRA indexes, and I think NN_DESCENT is faster at index creation but has a high GPU memory footprint. Should we use IVF_PQ here? Or could we take a hybrid approach: use NN_DESCENT when the doc count is below a specific threshold, and IVF_PQ otherwise?
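
To make the hybrid suggestion concrete, here is a minimal sketch of the selection logic. It is only an illustration, not code from this PR: the nested `BuildAlgo` enum is a hypothetical stand-in for the `CagraGraphBuildAlgo` enum used in the diff, `BuildAlgoChooser` is an invented helper name, and the threshold value is an arbitrary placeholder that would need benchmarking against available GPU memory.

```java
// Sketch of a doc-count-based choice between the two graph build algorithms
// discussed in this comment. All names here are hypothetical.
final class BuildAlgoChooser {

    // Hypothetical stand-in for the CagraGraphBuildAlgo enum from the diff;
    // only the two constants mentioned in the comment are modeled.
    enum BuildAlgo { NN_DESCENT, IVF_PQ }

    // Assumed cutoff, purely illustrative. A real value would have to come
    // from benchmarks on the target GPU.
    static final int DEFAULT_DOC_COUNT_THRESHOLD = 1_000_000;

    private BuildAlgoChooser() {}

    // Returns NN_DESCENT for segments with fewer docs than the threshold
    // (faster build, higher GPU memory use) and IVF_PQ otherwise.
    static BuildAlgo choose(int docCount, int threshold) {
        if (docCount < 0 || threshold < 0) {
            throw new IllegalArgumentException("docCount and threshold must be non-negative");
        }
        return docCount < threshold ? BuildAlgo.NN_DESCENT : BuildAlgo.IVF_PQ;
    }

    public static void main(String[] args) {
        System.out.println(choose(50_000, DEFAULT_DOC_COUNT_THRESHOLD));
        System.out.println(choose(5_000_000, DEFAULT_DOC_COUNT_THRESHOLD));
    }
}
```

The result of `choose(...)` would then be passed to the builder call shown in the diff in place of the hard-coded constant, assuming the real enum exposes both values.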

3 participants