Problem: pyramid targets vector retrieval in which every vector carries a path and a query is restricted to a path prefix.
Example: Each vector is associated with a path such as "a/b/c", "a/b/c/d", "a/b", or "a/b/c/e". Given a query path "a/b/c", the search only considers vectors stored under "a/b/c/*" and returns those nearest to the query vector.
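To make the expected semantics concrete, here is a minimal brute-force reference, not the pyramid implementation. `IsUnderPath` and `NearestUnderPath` are hypothetical helpers, not vsag APIs. The sketch assumes squared L2 distance, that paths are split on '/', and that a vector whose path equals the query path also counts as a match, which the description above does not state explicitly.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <string>
#include <vector>

// Hypothetical helper: true if `path` equals `prefix` or lies below it.
// The match is per path component, so "a/b/cd" is NOT under "a/b/c".
bool IsUnderPath(const std::string& path, const std::string& prefix) {
    if (prefix.empty()) {
        return true;  // assumption: an empty query path matches everything
    }
    if (path.compare(0, prefix.size(), prefix) != 0) {
        return false;
    }
    return path.size() == prefix.size() || path[prefix.size()] == '/';
}

// Hypothetical helper: index of the vector nearest to `query` (squared L2)
// among vectors whose path is under `query_path`; -1 if none qualifies.
int64_t NearestUnderPath(const std::vector<std::vector<float>>& vectors,
                         const std::vector<std::string>& paths,
                         const std::vector<float>& query,
                         const std::string& query_path) {
    assert(vectors.size() == paths.size());
    int64_t best = -1;
    float best_dist = std::numeric_limits<float>::max();
    for (std::size_t i = 0; i < vectors.size(); ++i) {
        if (!IsUnderPath(paths[i], query_path)) {
            continue;  // path filter is applied before any distance work
        }
        assert(vectors[i].size() == query.size());
        float dist = 0.0f;
        for (std::size_t d = 0; d < query.size(); ++d) {
            const float diff = vectors[i][d] - query[d];
            dist += diff * diff;
        }
        if (dist < best_dist) {
            best_dist = dist;
            best = static_cast<int64_t>(i);
        }
    }
    return best;
}
```

A scan like this is linear in the number of vectors; it is only useful as a correctness baseline to compare the index's results against.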
Interface Definition
Users pass each vector's path to the index through the following Dataset interface. In all other respects, pyramid behaves like the other indices.
// Pass the corresponding path to the index
virtual DatasetPtr Paths(const std::string* paths) = 0;
Parameter Design
Build Parameters
{
"pyramid": {
"sub_index_type": "index_name", // Specify a sub-index
"index_param": {
………… // Parameters corresponding to the sub-index
}
}
}
Search Parameters
{
"pyramid": {
………… // Search parameters corresponding to the sub-index
}
}
Implementation Principle
From the paths of the vectors we build an index tree, where each node covers the data whose paths fall under that node.
For each node in the index tree, we construct a corresponding sub-index.
The sub-indexes can share the underlying data storage, so only the index structures are duplicated. Because a vector is indexed once at every level of its path, the redundancy of the index equals the average number of path levels.
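The tree construction and the redundancy figure can be sketched as follows. This is an illustration under stated assumptions, not the actual pyramid code: `PathNode`, `SplitPath`, `Insert`, `Find`, `CountEntries`, and `Redundancy` are hypothetical names, each node stores only vector ids (standing in for a sub-index over shared vector storage), and no sub-index is kept at the root.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <memory>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical tree node: `ids` lists the vectors a sub-index at this node
// would cover. Only ids are stored, so vector data is not duplicated.
struct PathNode {
    std::vector<int64_t> ids;
    std::map<std::string, std::unique_ptr<PathNode>> children;
};

// Split "a/b/c" into {"a", "b", "c"}; empty components are skipped.
std::vector<std::string> SplitPath(const std::string& path) {
    std::vector<std::string> parts;
    std::stringstream ss(path);
    std::string part;
    while (std::getline(ss, part, '/')) {
        if (!part.empty()) {
            parts.push_back(part);
        }
    }
    return parts;
}

// Register `id` at every node along its path (assumption: not at the root).
void Insert(PathNode& root, const std::string& path, int64_t id) {
    PathNode* node = &root;
    for (const auto& part : SplitPath(path)) {
        auto& child = node->children[part];
        if (!child) {
            child = std::make_unique<PathNode>();
        }
        node = child.get();
        node->ids.push_back(id);
    }
}

// Locate the node for a query path; nullptr if no vector lives under it.
const PathNode* Find(const PathNode& root, const std::string& path) {
    const PathNode* node = &root;
    for (const auto& part : SplitPath(path)) {
        auto it = node->children.find(part);
        if (it == node->children.end()) {
            return nullptr;
        }
        node = it->second.get();
    }
    return node;
}

// Total number of (node, id) entries across the whole tree.
std::size_t CountEntries(const PathNode& node) {
    std::size_t total = node.ids.size();
    for (const auto& kv : node.children) {
        total += CountEntries(*kv.second);
    }
    return total;
}

// Index entries per vector, i.e. the average number of path levels.
double Redundancy(const PathNode& root, std::size_t num_vectors) {
    assert(num_vectors > 0);
    return static_cast<double>(CountEntries(root)) /
           static_cast<double>(num_vectors);
}
```

For the four example paths "a/b/c", "a/b/c/d", "a/b", and "a/b/c/e", the path depths are 3, 4, 2, and 4, so the tree holds 13 entries for 4 vectors and `Redundancy` returns 3.25. A query on "a/b/c" resolves to a single node, whose sub-index already covers exactly the vectors under that path.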
Usage Demo
Create Index
auto pyramid_build_parameters = R"(
{
"dtype": "float32",
"metric_type": "l2",
"dim": 128,
"index_param": {
"sub_index_type": "hnsw",
"index_param": {
"max_degree": 16,
"ef_construction": 100
}
}
}
)";
auto index = vsag::Factory::CreateIndex("pyramid", pyramid_build_parameters).value();
// Owner(false): the caller keeps ownership of ids, vectors, and paths.
auto base = vsag::Dataset::Make();
base->NumElements(num_vectors)->Dim(dim)->Ids(ids)->Float32Vectors(vectors)->Paths(paths)->Owner(false);
index->Build(base);
Prepare a Query Vector
// The query dataset is created with Owner(true), so it takes ownership of these buffers and releases them.
auto query_vector = new float[dim];
auto query_path = new std::string[1];
for (int64_t i = 0; i < dim; ++i) {
query_vector[i] = distrib_real(rng);
}
query_path[0] = ......; // generate a random path for query vector.
Search on the Index
auto pyramid_search_parameters = R"(
{
"pyramid": {
"ef_search": 100
}
}
)";
int64_t topk = 10;
auto query = vsag::Dataset::Make();
query->NumElements(1)->Dim(dim)->Float32Vectors(query_vector)->Paths(query_path)->Owner(true);
auto result = index->KnnSearch(query, topk, pyramid_search_parameters).value();
wxyucs changed the title from 'define the "pyramid" index to support vector search with paths' to 'introduce a new index pyramid (for path-like prefilter search)' on Jan 13, 2025
Implementation Detail
The work will be completed in the following PRs: