Implement Indexing for RabitQ #84

Merged on Dec 1, 2024 (22 commits)

Conversation

tvhong
Contributor

@tvhong tvhong commented Nov 9, 2024

Notes

Implement the Indexing Phase for RabitQ.

Paper: https://arxiv.org/abs/2405.12497
My summary of the indexing phase: #32 (comment)

Please note that the current implementation is very inefficient, since I don't have much experience writing high-performance code, but it should at least be correct.

@tvhong tvhong changed the title [DRAFT] Issue using ndarray-rand [DRAFT] Issue using ndarray-linalg Nov 9, 2024
@tvhong tvhong force-pushed the vhong/index_rabitq branch 2 times, most recently from 991241c to aaee3b7 Compare November 10, 2024 20:19
@tvhong tvhong changed the title [DRAFT] Issue using ndarray-linalg Implement Indexing for RabitQ Nov 10, 2024
@tvhong tvhong force-pushed the vhong/index_rabitq branch 3 times, most recently from 212554d to 8920546 Compare November 10, 2024 21:57
@tvhong tvhong marked this pull request as ready for review November 10, 2024 22:00
Collaborator

@tyb0807 tyb0807 left a comment


Thanks! Just one high-level question: why Array1 and Array2 instead of Vec and Vec<Vec>? I'm asking only for consistency with other parts of the codebase.

@tvhong
Contributor Author

tvhong commented Nov 10, 2024

@tyb0807 The querying phase needs to do some matrix multiplication and vector dot products (see #32 (comment)). So, I store these as ndarray objects.

That being said, I'm not set on the data type at the moment. Maybe ndarray doesn't support serialization well, or maybe we should use nalgebra since we're only dealing with low-dimensional data. Either way, it's quite easy to change the data type if we decide that ndarray isn't the right choice.
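For comparison, here is a minimal sketch of what the same two operations look like when written by hand over `Vec<Vec<f32>>`, which is the trade-off being discussed. The function names `dot` and `mat_vec` are hypothetical and are not taken from this PR or from ndarray; the sketch uses only the standard library and assumes a row-major matrix.

```rust
/// Dot product of two equal-length slices.
/// Hypothetical helper for illustration; not part of the PR.
fn dot(a: &[f32], b: &[f32]) -> f32 {
    assert_eq!(a.len(), b.len(), "dimension mismatch");
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

/// Matrix-vector product for a row-major matrix stored as a slice of rows.
/// Each output element is the dot product of one row with `v`.
fn mat_vec(m: &[Vec<f32>], v: &[f32]) -> Vec<f32> {
    m.iter().map(|row| dot(row, v)).collect()
}
```

This is short, but nothing in `Vec<Vec<f32>>` guarantees that every row has the same length, and each row is a separate allocation. A dedicated array type avoids both issues, which is presumably part of why one would be attractive for the querying phase.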

Owner

@hicder hicder left a comment


You'll need to use the normalized vector.

rs/quantization/src/rabitq.rs (outdated)
@tvhong
Contributor Author

tvhong commented Dec 1, 2024

Updated the code to use normalized vectors instead.
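As a rough illustration of what "normalized" can mean here, the sketch below subtracts a centroid from a vector and scales the residual to unit L2 norm. This follows my reading of the RabitQ paper rather than the code merged in this PR; the function name `normalize_about` and the choice to return `None` for a zero-length residual are assumptions made for this example.

```rust
/// Returns (v - centroid) / ||v - centroid||.
/// Returns None when v coincides with the centroid, since the direction
/// is undefined in that case. Hypothetical helper; not the PR's code.
fn normalize_about(v: &[f32], centroid: &[f32]) -> Option<Vec<f32>> {
    assert_eq!(v.len(), centroid.len(), "dimension mismatch");
    // Residual of the vector relative to the centroid.
    let residual: Vec<f32> = v.iter().zip(centroid).map(|(x, c)| x - c).collect();
    // Euclidean (L2) norm of the residual.
    let norm = residual.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm == 0.0 {
        return None;
    }
    Some(residual.into_iter().map(|x| x / norm).collect())
}
```

For instance, `normalize_about(&[4.0, 6.0], &[1.0, 2.0])` has residual `[3.0, 4.0]` with norm 5, so the result is approximately `[0.6, 0.8]`.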

@tvhong tvhong force-pushed the vhong/index_rabitq branch from cb570b7 to 4b692dc Compare December 1, 2024 06:45
Owner

@hicder hicder left a comment


LGTM!

@hicder hicder merged commit 290fafa into hicder:master Dec 1, 2024
2 checks passed