CP-52226: Implement XS_DIRECTORY_PART call #2
Conversation
Client support for this call in mirage/ocaml-xenstore will come separately.
NOTE: the cache size is said to be configurable but isn't yet - I wasn't sure whether it's worth doing and, if so, through what means. Whatever the size, it doesn't stop any calls from making progress; it just makes them potentially slower (if you are running X+1 chains of calls on different nodes at the same time).
let child_length = ref 0 in
let i = ref offset in
let cache_length = String.length children in
while !buffer_length + !child_length < maxlen && !i < cache_length do
This could be written as a fold to avoid the mutations. I think the goal is to add items, but only entire items as long as they fit, right?
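For illustration, a rough sketch of the fold shape being suggested (hypothetical names, not the PR's actual buffer handling): keep taking whole child names while the running length stays under the limit, and ignore everything after the first one that doesn't fit.

(* Hypothetical sketch, not the PR's code: accumulate whole items only
   while the running length stays under [maxlen]; once an item doesn't
   fit, stop taking further items. *)
let take_whole_items ~maxlen items =
  let _, _, taken =
    List.fold_left
      (fun (stopped, len, acc) item ->
        let len' = len + String.length item in
        if stopped || len' > maxlen then (true, len, acc)
        else (false, len', item :: acc))
      (false, 0, []) items
  in
  List.rev taken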
I've thought about expressing this in a nicer way, but all of Array.fold, iter and such are implemented with mutations anyway, and I need to mutate several things, so the niceness wouldn't really be preserved anyhow.
oxenstored/transaction.ml (outdated)
| Some (_, generation_count, _, _) ->
    Int64.add generation_count 1L
| None ->
    1L
I think we need to remember what the largest previous value was and start there, or find some other way to ensure it is unique. Otherwise, if we have to evict something from the cache (because the size got exceeded) and we re-add it, it'll go back to 1 and the client may not notice that it has changed.
We could store a highest-evicted counter, a counter of how many operations we've done on the connection cache, etc.
I can do that, but then if the cache entry was evicted, the client will be forced to restart the sequence of calls, as the generation count will have some new value. This is better than being oblivious to changes in between calls, of course.
@edwintorok could you please review the just-pushed fixup commit? I've created a per-connection generation count, so being pushed out of the cache will increment the generation count and force the client to restart the sequence of calls.
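Roughly, the shape of that idea (a hedged sketch with hypothetical names, not the exact code in the fixup commit): the connection owns a monotonically increasing counter, and every (re)insertion into the cache takes the next value, so an evicted-and-re-added entry can never reappear with a generation count the client has already seen.

(* Hypothetical sketch of a per-connection generation counter:
   [next_generation] only ever grows, so an evicted entry that is later
   re-added gets a generation count the client has never observed. *)
type directory_cache = {
    mutable next_generation: int64
  ; entries: (string, int64 * string) Hashtbl.t (* path -> (generation, children) *)
}

let fresh_generation cache =
  let g = cache.next_generation in
  cache.next_generation <- Int64.add g 1L ;
  g

let cache_children cache path children =
  Hashtbl.replace cache.entries path (fresh_generation cache, children)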
in
String.blit generation_s 0 buffer 0 genlen ;
String.blit children offset buffer genlen !buffer_length ;
if last_chunk then
Should we remove the cached entry if this was the last chunk?
If the connection is kept and another sequence of calls is issued, it's still useful.
oxenstored/connection.ml (outdated)
; transactions= Hashtbl.create 5
; directory_cache
; next_tid= initial_next_tid
; watches= Hashtbl.create 8
Is the size here indicating an intent? It's an implementation concern, but it's basically redundant to give a small initial size of less than 16: these will all start with the same initial size.
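For what it's worth, a small demonstration of that point (behaviour of the current stdlib, worth double-checking against the OCaml version in use): Hashtbl.create rounds small requested sizes up to the same minimum bucket count, as Hashtbl.stats shows.

(* Small demonstration: with the current stdlib, small requested sizes
   are rounded up to the same minimum number of buckets, so 5 and 8
   request effectively the same initial size. *)
let () =
  let a : (int, int) Hashtbl.t = Hashtbl.create 5 in
  let b : (int, int) Hashtbl.t = Hashtbl.create 8 in
  let buckets t = (Hashtbl.stats t).Hashtbl.num_buckets in
  Printf.printf "buckets: %d vs %d\n" (buckets a) (buckets b)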
These changes passed Ring3 BST+BVT tests (Run ID #207701).
Force-pushed from 339f065 to 680ccc8
Force-pushed from 680ccc8 to 427fbe7
Force-pushed from 427fbe7 to 7b90ff6
Still passing BST+BVT with the fixup commit ✅
Are there outstanding changes or reasons not to approve this? @edwintorok @psafont
The per-connection directory generation count looks OK now.
Force-pushed from 7b90ff6 to 623e263
Squashed the fixup commit, merging
CP-52226 - oxenstored: Implement partial directory call for dom0 with per-connection cache
Implements XS_DIRECTORY_PART (originally specified in https://lore.kernel.org/xen-devel/[email protected]/).
Like in CXenstored, it's limited to dom0. It is intended to list
directories larger than the size of a single payload (4096 bytes) over
several calls (not necessarily inside a single transaction).
To alleviate the impact of the differences in how a node's children are
represented between the C and OCaml Xenstoreds, this implements a
per-connection cache allowing the string representation of a node's
children to be generated once and reused later, as long as the underlying
list of children doesn't change (and the connection is reused).
Whereas CXenstore uses a global generation count, modified at
transaction creation and node changes, the OCaml version relies on
comparisons of immutable trees. A per-node generation count is instead
faked in the connection cache - it allows clients to know that the
underlying information has changed and they should restart the sequence
of partial directory calls to get a consistent picture.
(i.e., while the generation count returned by CXenstore is globally unique,
this is not useful for this particular function, since the generation count
is only compared against the results of previous invocations of partial
directory calls on the same node)
The cache is refreshed by OXenstored itself, and is limited to 8 entries
(by default, configurable).
The string handling and allocation are a direct translation of the C code.
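To illustrate how a client is expected to consume this (a hedged sketch; directory_part is a hypothetical helper performing one XS_DIRECTORY_PART call, and the exact termination convention should be checked against the protocol documentation), the client issues repeated calls with an increasing offset and restarts from the beginning whenever the returned generation count changes between chunks:

(* Hedged client-side sketch: [directory_part conn path ~offset] is a
   hypothetical helper that performs one XS_DIRECTORY_PART call and
   returns the node's generation count, the child names in this chunk,
   and the offset of the next chunk (None when this was the last one).
   A change of generation count between chunks means the children list
   changed underneath us, so the whole sequence is restarted. *)
let rec list_directory directory_part conn path =
  let rec loop expected_gen offset acc =
    let gen, names, next = directory_part conn path ~offset in
    match expected_gen with
    | Some g when g <> gen ->
        (* inconsistent snapshot: start over for a consistent view *)
        list_directory directory_part conn path
    | _ -> (
        let acc = List.rev_append names acc in
        match next with
        | None -> List.rev acc
        | Some offset -> loop (Some gen) offset acc
      )
  in
  loop None 0 []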
CP-52226 - oxenstored: Return XS_ERROR (E2BIG) on socket connections too
The original XSA fix only returns an XS_ERROR payload with E2BIG inside for
domU connections. libxenstore, however, expects an E2BIG response for
proper handling of partial directory calls for dom0 as well.
In practice this removes the possibility of carrying the "large packets" patch:
https://github.com/xenserver/xen.pg/blob/9cde7213128d777c4856d719c10945c052df0ce1/patches/oxenstore-large-packets.patch
It should now be superseded by proper partial directory support.
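As a rough sketch of the client-side fallback this enables (Xs_error, directory and list_directory are hypothetical stand-ins, not libxenstore's actual API): a plain directory call is tried first, and an E2BIG error switches the client to the partial-call sequence.

(* Hedged sketch: try the ordinary XS_DIRECTORY call first, and fall
   back to the partial-call sequence when the daemon answers with E2BIG
   because the full reply would exceed the 4096-byte payload limit. *)
exception Xs_error of string

let directory_with_fallback ~directory ~list_directory conn path =
  try directory conn path with Xs_error "E2BIG" -> list_directory conn path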
====
This has been tested on large directories, where OXenstored used by C clients (which did not know about partial packets and did not handle larger packets) no longer errors out and complies with the existing protocol:
I've also confirmed that the generation count is incremented on changes to a node's children list between calls, and that the cache does indeed serve its purpose.