support vectordb benchmark using SIFT1M dataset #2

Open
UnpureRationalist

Hi, I added a vectordb benchmark script that uses the SIFT1M dataset. The dataset is available at http://corpus-texmex.irisa.fr/. Download it and unzip it to build/sift1M/*..fvecs, compile the script with make vectordb-bench, and run it with ./bin/bustub-vectordb-bench. The output should look similar to:

$ ./bin/bustub-vectordb-bench
[0.002 s] Creating vector index...
[0.002 s] Loading database
[0.286 s] Loading database, size 1000000*128
[0.286 s] Loading database, #0  #1000000
[0.500 s] Loading database, #1000  #1000000
[0.946 s] Loading database, #2000  #1000000
[1.504 s] Loading database, #3000  #1000000
[2.387 s] Loading database, #4000  #1000000
[3.333 s] Loading database, #5000  #1000000
[4.131 s] Loading database, #6000  #1000000
...
[16.443 s] Doing query, #6000  #10000
[17.930 s] Doing query, #7000  #10000
[19.434 s] Doing query, #8000  #10000
[21.172 s] Doing query, #9000  #10000
[22.887 s] Compute recalls
R@1 = 0.0143
R@10 = 0.0143
R@100 = 0.0143
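
For readers unfamiliar with the R@k lines above, here is a minimal sketch of one common way to compute them. It assumes the "1-recall@k" convention often used with SIFT1M: a query counts as a hit if its true nearest neighbor appears anywhere in the first k returned ids. Whether the benchmark script in this PR uses exactly this definition is an assumption, and the function name `RecallAtK` and its parameters are hypothetical, not taken from the PR.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper, not part of the PR.
// results[q]      : ids returned by the index for query q, best match first.
// true_nearest[q] : id of the exact nearest neighbor of query q (from the ground truth file).
// Returns the fraction of queries whose true nearest neighbor is among the first k results.
double RecallAtK(const std::vector<std::vector<int32_t>> &results,
                 const std::vector<int32_t> &true_nearest, std::size_t k) {
  // Guard against empty input or mismatched sizes.
  if (results.empty() || results.size() != true_nearest.size()) {
    return 0.0;
  }
  std::size_t hits = 0;
  for (std::size_t q = 0; q < results.size(); ++q) {
    const std::vector<int32_t> &row = results[q];
    // Only look at the first k results (or fewer if the index returned fewer).
    const auto end = row.begin() + static_cast<std::ptrdiff_t>(std::min(k, row.size()));
    if (std::find(row.begin(), end, true_nearest[q]) != end) {
      ++hits;
    }
  }
  return static_cast<double>(hits) / static_cast<double>(results.size());
}
```

Under this definition, R@k can only stay the same or grow as k increases, since a hit within the first k results is also a hit within any larger prefix.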

Owner

skyzh commented Oct 8, 2024

Thanks a lot and this is a huge improvement to the course! I’ll review once I’m back from my vacation 😍

@UnpureRationalist
Author

Thank you! BTW, running this script may cause memory errors, as mentioned in cmu-db#716; the source code needs to be modified to fix that bug. Enjoy your vacation!🌴🏖️
