-
Notifications
You must be signed in to change notification settings - Fork 216
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
A callback for re-fetching the root ca in the aggregator #1315
A callback for re-fetching the root ca in the aggregator #1315
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, but please add more details in the PR description:
- Motivation
- Limitations (the collaborator's cert is only re-fetched on TLS handshake)
- Testing done
@@ -81,6 +84,7 @@ def __init__( | |||
self.server_credentials = None | |||
|
|||
self.logger = logging.getLogger(__name__) | |||
self.clients_certs_refresher_cb = clients_certs_refresher_cb |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is this callback supposed to be? And where is it defined?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've added a description to the docstring. Generally, it is supposed to read the root ca every time a client starts TLS handshake with the aggregator. This allows the aggregator to get fresh clients certs in case they were rotated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I got a sense of what the PR is about, but not quite at the implementation level. Could you point to the code to this callback? (Since you mention this is a function passed as an arg, the function will likely be defined somewhere?)
I see a possibility of design refinement, like re-use of openfl.callbacks
API, or the fact of always passing a callback and not supplying root CA...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so this callback is useful only where the rootca change (like in a migrated enclvized node). There is no immediate example in openfl (as the certificates are fixed in the current examples), however, a usage example can be found here (closed source https://github.com/intel-innersource/frameworks.ai.openfl.openfl-security/pull/849/files).
Regarding the design refinement:
- IMO the communication layer needs to be as separated as we can from the openfl layer. It's true that currently, we use a grpc that is tightly coupled with the openfl implementation but it won't necessarily be the case in the future (for example, we can have an abstract definition of the communication layer and multiple implementations (UDP, TLS, RA-TLS, GRPC, REST, etc.), hence I don't think we should mix the openfl interfaces (such as the callbacks) with the communication layer - it may create cyclic dependency and make it harder to separate the two in the future.
- Regarding always using the callback, generally, a rootca does not frequently change (here it may change only due to the self-signing and the enclavized components - the rootca is simply a chain of the different clients' certs), hence I think it is more intuitive from the library perspective to provide a fixed rootca and let advanced users define a specific callback if it is required.
Having said all that, the approach you suggested is valid, and we can follow it. However, at the current level, we should keep it as simple as possible and refine the implementation, if needed, while moving forward.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
AFAIU, the openfl.callback
framework mainly targets user-defined actions that can be "hooked" at specific stages of the FL plan execution. The callback mentioned here is at the lower, transport protocol level. We can consider re-aligning this later, but I suggest to proceed with @yontyon's suggested approach for now.
P.S.: Please shrink the PR title before merging. Echoing @teoparvanov's comment, please use the description box for details on what/why/how. |
1f0432c
to
98af260
Compare
…y TLS handshake Signed-off-by: Buchnik, Yehonatan <[email protected]>
Signed-off-by: Buchnik, Yehonatan <[email protected]>
Signed-off-by: Buchnik, Yehonatan <[email protected]>
98af260
to
610ef26
Compare
A collaborator may exchange its root ca during a restart, especially if the collaborator is enclavized and uses a self-signed certificate. In this case, we need to provide a mechanism for the aggregator to fetch the client's root ca (the self-signer certificate) (in practice, it can be fetched from some 3rd trusted certificates store or governor).
In this implementation, the aggregator fetches the new root ca only when a TLS handshake starts, but it's ok as the collaborators start new connection for every new request (see https://github.com/securefederatedai/openfl/blob/develop/openfl/transport/grpc/aggregator_client.py#L136)
The change was tested by restarting the custom collaborator and creating a new self-signed cert on every restart.