Add the interface for streaming the vectors from java to jni layer with initial capacity #1586
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@             Coverage Diff              @@
##   feature/stream-vectors   #1586   +/- ##
============================================================
- Coverage     84.92%   84.84%   -0.08%
+ Complexity     1375     1374       -1
============================================================
  Files           172      173       +1
  Lines          5605     5609       +4
  Branches        553      553
============================================================
- Hits           4760     4759       -1
- Misses          612      616       +4
- Partials        233      234       +1
Looks good to me. Minor comment, but not a blocker.
One other thing @navneet1v: I think we might need to update https://github.com/opensearch-project/k-NN/blob/main/src/main/plugin-metadata/plugin-security.policy#L2-L4 with the new JNI lib.
If this change is needed, how does it work today? It looks like Navneet has run many benchmarks with the current code.
The micro-benchmark code doesn't require that permission. I have not tested the code with the k-NN plugin, which I would have done in the upcoming PRs.
Updated the PR.
@jmazanec15 and @martin-gaievski, updated the PR.
@martin-gaievski @navneet1v I suspect that Faiss is being called first, which then loads the common library in the C++ code, but I'm not sure how it was working either.
I was not loading Faiss for the benchmarks. Since the benchmark is just a simple Java process, it doesn't need those permissions (as per my understanding). But given that JNICommons will be used before we even call Faiss, it makes sense to add the permission. We would have caught this in the next iteration of the PR, when these interfaces are actually added to the k-NN plugin's create index call. But as of now, since that integration has not been added, adding the permission could have been missed.
Looks good to me. The few things that are left will be addressed in coming PRs.
Merged commit fccc5a9 into opensearch-project:feature/stream-vectors
…layer to enable creation of larger segments for vector indices
Changes include:
1. Add the interface for streaming the vectors from java to jni layer with initial capacity (opensearch-project#1586)
2. Integrating storeVectors interfaces with createIndex and createIndexTemplate functions. (opensearch-project#1588)
3. Update KNN80BinaryDocValues reader count live docs and use live docs as initial capacity to initialize vector address (opensearch-project#1595)
Signed-off-by: Navneet Verma <[email protected]>
…layer to enable creation of larger segments for vector indices
Changes include:
1. Add the interface for streaming the vectors from java to jni layer with initial capacity (opensearch-project#1586)
2. Integrating storeVectors interfaces with createIndex and createIndexTemplate functions. (opensearch-project#1588)
3. Update KNN80BinaryDocValues reader count live docs and use live docs as initial capacity to initialize vector address (opensearch-project#1595)
4. Move free vectorAddress from Java to JNI layer to reduce the memory footprint for Nmslib (opensearch-project#1602)
Signed-off-by: Navneet Verma <[email protected]>
…layer to enable creation of larger segments for vector indices (#1604)
Changes include:
1. Add the interface for streaming the vectors from java to jni layer with initial capacity (#1586)
2. Integrating storeVectors interfaces with createIndex and createIndexTemplate functions. (#1588)
3. Update KNN80BinaryDocValues reader count live docs and use live docs as initial capacity to initialize vector address (#1595)
4. Move free vectorAddress from Java to JNI layer to reduce the memory footprint for Nmslib (#1602)
Signed-off-by: Navneet Verma <[email protected]>
…layer to enable creation of larger segments for vector indices (#1604) (#1608)
Changes include:
1. Add the interface for streaming the vectors from java to jni layer with initial capacity (#1586)
2. Integrating storeVectors interfaces with createIndex and createIndexTemplate functions. (#1588)
3. Update KNN80BinaryDocValues reader count live docs and use live docs as initial capacity to initialize vector address (#1595)
4. Move free vectorAddress from Java to JNI layer to reduce the memory footprint for Nmslib (#1602)
Signed-off-by: Navneet Verma <[email protected]>
Description
Add the interface for streaming the vectors from the Java layer to the JNI layer with an initial capacity. The PR will be merged into a feature branch.
Along with this change, I deprecated the methods in FaissService and will remove them going forward. I also removed the transferVectorsV2 function, as it is no longer required.
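To illustrate what streaming with an initial capacity means, here is a minimal pure-Java sketch. It is not the code from this PR: the class `VectorStoreSketch`, the method `storeVectorData`, and the handle maps are hypothetical stand-ins that only mimic what a native layer might do. The assumption is that the first call allocates a buffer sized by `initialCapacity` and returns a handle, and later calls append further batches to the same buffer.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Hypothetical pure-Java stand-in for a native vector store; not the PR's actual JNI code.
final class VectorStoreSketch {
    // Simulated "native" buffers keyed by an opaque handle (a real JNI layer would return a pointer).
    private static final Map<Long, float[]> BUFFERS = new HashMap<>();
    private static final Map<Long, Integer> SIZES = new HashMap<>();
    private static long nextHandle = 1;

    // Appends a batch of vectors. A handle of 0 means "allocate a new buffer
    // with room for initialCapacity floats"; any other handle must already exist.
    static long storeVectorData(long handle, float[][] batch, long initialCapacity) {
        if (handle == 0) {
            handle = nextHandle++;
            BUFFERS.put(handle, new float[(int) initialCapacity]);
            SIZES.put(handle, 0);
        } else if (!BUFFERS.containsKey(handle)) {
            throw new IllegalArgumentException("unknown handle: " + handle);
        }
        float[] buf = BUFFERS.get(handle);
        int size = SIZES.get(handle);
        for (float[] vector : batch) {
            if (size + vector.length > buf.length) {
                // Grow only when the initial capacity turned out to be too small.
                buf = Arrays.copyOf(buf, Math.max(buf.length * 2, size + vector.length));
            }
            System.arraycopy(vector, 0, buf, size, vector.length);
            size += vector.length;
        }
        BUFFERS.put(handle, buf);
        SIZES.put(handle, size);
        return handle;
    }

    // Number of floats written so far for this handle.
    static int storedFloats(long handle) {
        return SIZES.get(handle);
    }

    // Current allocated size of the buffer, in floats.
    static int capacity(long handle) {
        return BUFFERS.get(handle).length;
    }

    // Releases the simulated buffer.
    static void free(long handle) {
        BUFFERS.remove(handle);
        SIZES.remove(handle);
    }
}
```

In this sketch a caller passes 0 as the handle on the first batch and reuses the returned handle for every later batch. If the initial capacity covers the total number of floats, the buffer is never reallocated while the batches stream in.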
JNI test
Old benchmarks
New benchmarks
Issues Resolved
This is the first of many PRs that will be raised for this issue: #1506
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.