Revise perfect hash to align with libgrape-lite's pthash #1851

Closed
wants to merge 19 commits into from

Conversation

vegetableysm
Collaborator

@vegetableysm vegetableysm commented Mar 29, 2024

What do these changes do?

OLD:
Modifications to libgrape-lite's perfect hash:

  • Remove the inline int8_t log2(size_t value) function from hashmap_indexer_impl.h (not used).
  • Remove namespace sync_comm and struct CommImpl from hashmap_indexer_impl.h (not used).
  • Remove the Allocator<T> argument of std::vector<T, Allocator<T>> inner_ in hashmap_indexer_impl.h (not needed).
  • Remove the encode_vec, encode_val, and decode_val APIs from ref_vector.h (not used).
  • Add struct murmurhasher to single_phf_view.h (reduces dependencies).
  • Add struct external_mem_dumper to single_phf_view.h to dump data directly to a blob (reduces memcpy).
  • Remove std::set<size_t> idx from static void build(Iterator keys, uint64_t n, Dumper& dumper, int thread_num) of struct SinglePHFView in single_phf_view.h (not needed).
  • Add a static void build(Iterator keys, uint64_t n, pthash::single_phf<murmurhasher, pthash::dictionary_dictionary, true>& phf, int thread_num) API to struct SinglePHFView in single_phf_view.h (reduces memcpy).
  • Remove void serialize(std::unique_ptr<IOADAPTOR_T>& writer), void deserialize(std::unique_ptr<IOADAPTOR_T>& reader), and void serialize_to_mem(std::vector<char>& buf) from class StringViewVector in string_view_vector.h (not used).
  • Remove void Serialize(const std::string& path) and void Deserialize(const std::string& path) from class ImmPHIdxer in perfect_hash_indexer.h (not used).
  • Rewrite the finish() API of class PHIdxerViewBuilder in perfect_hash_indexer.h (for vineyard).
  • Replace std::vector<char> buffer_ with std::shared_ptr<Blob> buffer_ in perfect_hash_indexer.h (for vineyard).
  • Replace nonstd::string_view with arrow_string_view in all files (for vineyard).
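
The "reduce memcpy" items above can be illustrated with a rough sketch. This is not the code from the PR: FixedRegion and ExternalMemDumperSketch are hypothetical names, and the real external_mem_dumper and vineyard Blob interfaces may look quite different. The sketch only shows the general idea of serializing straight into caller-owned, pre-allocated memory instead of filling an intermediate std::vector<char> and copying it afterwards.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <stdexcept>
#include <type_traits>
#include <vector>

// Hypothetical stand-in for a pre-allocated blob: a fixed-size,
// caller-owned region of memory. Not vineyard's Blob type.
struct FixedRegion {
  char* data;
  std::size_t size;
};

// Hypothetical dumper, loosely inspired by the external_mem_dumper named in
// the list above. It writes trivially copyable values and vectors directly
// into the caller-owned region, so no intermediate buffer is needed.
class ExternalMemDumperSketch {
 public:
  explicit ExternalMemDumperSketch(FixedRegion region) : region_(region) {}

  // Append one trivially copyable value.
  template <typename T>
  void dump(const T& value) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "only trivially copyable types can be dumped");
    write(&value, sizeof(T));
  }

  // Append a length prefix (uint64_t) followed by the raw elements.
  template <typename T>
  void dump_vec(const std::vector<T>& vec) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "only trivially copyable types can be dumped");
    const std::uint64_t n = vec.size();
    dump(n);
    if (n > 0) {
      write(vec.data(), static_cast<std::size_t>(n) * sizeof(T));
    }
  }

  std::size_t bytes_written() const { return pos_; }

 private:
  // Bounds-checked copy into the region; pos_ never exceeds region_.size.
  void write(const void* src, std::size_t len) {
    if (len > region_.size - pos_) {
      throw std::out_of_range("region too small");
    }
    std::memcpy(region_.data + pos_, src, len);
    pos_ += len;
  }

  FixedRegion region_;
  std::size_t pos_ = 0;
};
```

In this sketch the caller has to know the serialized size up front in order to allocate the region, which is the usual trade-off of writing in place rather than into a growable buffer.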

NEW:

  • Use libgrape-lite's pthash
  • Remove pthash / BBHash from vineyard

Related issue number

Fixes #1852

Signed-off-by: vegetableysm <[email protected]>
@vegetableysm vegetableysm changed the title Grape perfect hash [WIP]Grape perfect hash Mar 29, 2024
@sighingnow sighingnow changed the title [WIP]Grape perfect hash [WIP]Revise perfect hash to align with libgrape-lite's pthash Apr 1, 2024
Member

@sighingnow sighingnow left a comment

Can you please list the changes you have made to libgrape-lite's implementation in the PR description?

e.g., replace std::vector<xxx> with xxx*.

@vegetableysm vegetableysm changed the title [WIP]Revise perfect hash to align with libgrape-lite's pthash Revise perfect hash to align with libgrape-lite's pthash Apr 1, 2024
Contributor

/cc @sighingnow, this issue/PR has had no activity for a long time. Please help review its status and assign people to work on it.

@vegetableysm
Collaborator Author

Refer to #1992


Successfully merging this pull request may close these issues.

Use libgrape-lite's pthash to replace BBHash
2 participants