-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Adds Serializer interface for use in ehcache disk tier and elsewhere;… #9
base: sagars-ehcache-final
Are you sure you want to change the base?
Conversation
… modifies existing disk tier impl to use it in a generic way Signed-off-by: Peter Alfonsi <[email protected]> Fixed byte[] key implementation to use ByteBuffer wrapper passed directly to ehcache Signed-off-by: Peter Alfonsi <[email protected]> Added tests for BytesReference serializer, and ehcache disk tier using BytesReference as a value
/** | ||
* This ehcache disk caching tier uses its value serializer outside ehcache. | ||
* Values are transformed to byte[] outside ehcache and then ehcache uses its bundled byte[] serializer. | ||
* The key serializer you pass to this class produces a byte[]. This serializer is passed to a wrapper which | ||
* implements Ehcache's serializer implementation and produces a BytesBuffer. The wrapper instance is then passed to ehcache. | ||
* This is done because to get keys on a disk tier, ehcache internally checks the equals() method of the serializer, | ||
* but ALSO requires newKey.equals(storedKey) (this isn't documented), which is the case for ByteBuffer but not byte[]. | ||
* This limitation means that the key serializer must preserve the class of the key before/after serialization, | ||
* but the value serializer does not have to do this. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This detailed explanation is not needed that too around serialization. Just keep it generic. Any explanation around serialization if needed, keep it short.
new RemovalNotification<>( | ||
event.getKey(), | ||
valueSerializer.deserialize(event.getOldValue()), | ||
RemovalReason.EVICTED |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I missed it earlier, can you also add TierType.DISK as an argument to this constructor.
event.getKey(), | ||
valueSerializer.deserialize(event.getOldValue()), | ||
RemovalReason.INVALIDATED | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I missed it earlier, can you also add TierType.DISK as an argument to this constructor.
public class EhCacheDiskCachingTier<K, V> implements DiskCachingTier<K, V> { | ||
|
||
// A Cache manager can create many caches. | ||
private final PersistentCacheManager cacheManager; | ||
|
||
// Disk cache | ||
private Cache<K, V> cache; | ||
private Cache<K, byte[]> cache; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks confusing. Where we take key as generic K but value as byte[]. Ideally we need to serialize both eventually. Plus it can happen that both key and value maybe of same type like ByteReference, String etc, so having this hybrid doesn't make sense.
Better to keep both byte[] or use generic for value as well and define a valueWrapper like keyWrapper.
Let me know it it make sense.
Signed-off-by: Peter Alfonsi <[email protected]>
Signed-off-by: Peter Alfonsi <[email protected]>
Description
Added Serializer interface to Sagar's ehcache disk tier implementation. We pass the disk tier a key and a value serializer which serialize objects to byte[].
The value serializer is used outside of ehcache, in the disk tier's put() and get(). This saves us from having to serialize BytesReference objects such that their class is the same before and after serialization, which would have been very hacky and potentially damaged performance in other places in OpenSearch. (BytesReference is an interface and its implementing classes are not always simple to instantiate). Ehcache requires this class constraint, but the rest of OS does not care which type of BytesReference something is, so we can move our serializer outside ehcache for a cleaner implementation.
The key serializer is put into a wrapper that outputs ByteBuffer objects and implements ehcache's serializer interface. The wrapper instance is then passed directly to ehcache. We cannot use byte[] directly as a key, despite ehcache having a bundled serializer for it, because when searching for a key it checks both the equals() method of the bundled serializer and also the equals() method of the keys themselves (this is not documented). This means the key serializer must preserve class before and after serialization, but we don't expect this to be a problem.
The implementation is generic and can be used for any EhcacheDiskCachingTier<K, V> so long as serializer implementations for K and V are provided.
Related Issues
Part of larger tiered caching feature.
Check List
- [N/A] New functionality has javadoc added- [N/A] Commit changes are listed out in CHANGELOG.md file (See: Changelog)- [N/A] Public documentation issue/PR createdBy submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.