libzdb: rewrite index memory handling #127

Draft · wants to merge 8 commits into development-v2
Conversation

maxux (Collaborator) commented Dec 20, 2021

The previous index memory management was basically one large 128 MB array, where each 8-byte block was a pointer to a linked list of entries.

Each entry was the object entry plus an extra link to its namespace, because all namespaces shared the same master array.

To find the correct list index, the crc32 of the key was used: its first 24 bits were the array index. This method was really fast, but required maintaining a single large shared buffer. When removing a namespace, the deletion process could take a very long time, since it had to iterate over all keys (of all namespaces), even if the namespace contained only a few keys.
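As a rough sketch of that previous scheme (illustrative names only, not libzdb's actual structures; which 24 of the 32 crc bits were kept is an implementation detail here):

```c
#include <stdint.h>
#include <stddef.h>

#define BUCKETS (1 << 24)        /* 2^24 slots * 8 bytes per pointer = 128 MB */

typedef struct entry {
    struct entry *next;          /* collision chain within one bucket */
    void *ns;                    /* extra per-entry link: all namespaces shared the array */
    /* key/value metadata would follow */
} entry_t;

static entry_t *master[BUCKETS]; /* Linux only backs the pages actually touched */

/* map a key's crc32 to its bucket: keep 24 of the 32 bits */
static inline uint32_t bucket_of(uint32_t crc) {
    return crc >> 8;
}

static entry_t *lookup_chain(uint32_t crc) {
    return master[bucket_of(crc)];
}
```

This is why deleting one namespace was slow: its entries were scattered across all 16M buckets, chained together with entries of every other namespace.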

Real memory usage was actually quite efficient: thanks to Linux, only the pages effectively used were allocated, so the large 128 MB array was populated dynamically by the kernel. But on really large databases the limit can be reached quickly, and sharing the array was not a good idea anyway, IMO.

Here is a new implementation, currently in test. It uses a dedicated index map per namespace, so the per-entry namespace pointer is no longer needed. It's more logical and less error prone, with no sharing across namespaces anymore.

When deleting a namespace, its complete, isolated map can be freed without interfering with other keys or namespaces.

There is still a crc32 computed on the key, and that value still points to an array entry, but the array is now multi-level, with levels allocated dynamically. Each level is an array of 16 pointers. This version uses 5 levels (20 bits out of 32). This is configurable.
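The multi-level map could look roughly like this (a minimal sketch with hypothetical names, assuming 4 crc bits consumed per level; error handling on allocation is omitted):

```c
#include <stdint.h>
#include <stdlib.h>

#define BRANCH 16   /* 16 pointers per level, selected by 4 crc bits */
#define LEVELS 5    /* 5 levels -> 20 of the 32 crc bits used */

typedef struct node {
    /* intermediate levels: child nodes; last level: entry lists */
    void *slot[BRANCH];
} node_t;

/* walk (and, if create is set, lazily allocate) the path for a crc,
 * returning the address of the leaf slot, or NULL if the path is absent */
static void **map_slot(node_t *root, uint32_t crc, int create) {
    node_t *level = root;

    for (int i = 0; i < LEVELS - 1; i++) {
        int idx = (crc >> (4 * i)) & 0xf;   /* next 4 bits of the crc */

        if (!level->slot[idx]) {
            if (!create)
                return NULL;
            level->slot[idx] = calloc(1, sizeof(node_t));
        }

        level = level->slot[idx];
    }

    return &level->slot[(crc >> (4 * (LEVELS - 1))) & 0xf];
}
```

Because levels are allocated only when a key actually lands in them, an empty or small namespace costs almost nothing, and freeing a namespace is just a recursive walk of its own root node.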

Tests were made on a namespace with 2 million keys: memory usage can be reduced by up to 6 times compared to before, and is more predictable (e.g. 2 copies of the same namespace will use 2 times the memory). More tests to come. Performance doesn't seem to be affected much: lookups and allocations are slower, but the difference is not really noticeable.

When removing a namespace, clients attached to it on the server side should be notified that the namespace is no longer available. This was supported before, but was not backported when the server and library code were separated.

This fixes #130