A callback for re-fetching the root ca in the aggregator #1315


yontyon
Contributor

@yontyon yontyon commented Jan 27, 2025

A collaborator may change its root CA during a restart, especially if it runs inside an enclave and uses a self-signed certificate. In that case, the aggregator needs a mechanism to fetch the client's new root CA (the self-signed certificate). In practice, it can be fetched from a trusted third-party certificate store or governor.

In this implementation, the aggregator fetches the new root CA only when a TLS handshake starts. This is acceptable because the collaborators open a new connection for every request (see https://github.com/securefederatedai/openfl/blob/develop/openfl/transport/grpc/aggregator_client.py#L136).

The change was tested by restarting the custom collaborator and creating a new self-signed cert on every restart.
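
The handshake-time refresh described above can be sketched in plain Python. This is a minimal illustration, not the code from this PR: `HandshakeRootCa`, `on_handshake`, and the in-memory `store` are hypothetical names, and certificates are stand-in byte strings. The convention of returning `None` for "no change" mirrors what gRPC Python's `grpc.dynamic_ssl_server_credentials` expects from its fetcher, but whether the PR relies on that API is an assumption.

```python
from typing import Callable, Optional

# Hypothetical type: a callable that returns the current root CA bytes.
RootCaRefresher = Callable[[], bytes]


class HandshakeRootCa:
    """Holds the trusted root CA and optionally refreshes it per handshake."""

    def __init__(self, root_ca: bytes, refresher_cb: Optional[RootCaRefresher] = None):
        self.root_ca = root_ca
        self.refresher_cb = refresher_cb

    def on_handshake(self) -> Optional[bytes]:
        """Return the new root CA if it changed, or None to keep the current one."""
        if self.refresher_cb is None:
            return None  # fixed root CA, nothing to refresh
        latest = self.refresher_cb()
        if latest == self.root_ca:
            return None  # unchanged since the previous handshake
        self.root_ca = latest
        return latest


# Simulated trusted certificate store (an assumption for this sketch).
store = {"root_ca": b"cert-v1"}
trust = HandshakeRootCa(b"cert-v1", refresher_cb=lambda: store["root_ca"])

first = trust.on_handshake()    # no rotation has happened yet
store["root_ca"] = b"cert-v2"   # collaborator restarted with a new self-signed cert
second = trust.on_handshake()   # the next handshake picks up the rotation
```

Because the refresh runs only inside `on_handshake`, a rotated certificate is noticed no earlier than the next connection attempt, which matches the limitation stated above.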

Collaborator

@teoparvanov teoparvanov left a comment

LGTM, but please add more details in the PR description:

  • Motivation
  • Limitations (the collaborator's cert is only re-fetched on TLS handshake)
  • Testing done

openfl/transport/grpc/aggregator_server.py
@@ -81,6 +84,7 @@ def __init__(
self.server_credentials = None

self.logger = logging.getLogger(__name__)
self.clients_certs_refresher_cb = clients_certs_refresher_cb
Collaborator

What is this callback supposed to be? And where is it defined?

Contributor Author

I've added a description to the docstring. In general, the callback is expected to read the root CA every time a client starts a TLS handshake with the aggregator. This lets the aggregator pick up fresh client certificates if they were rotated.
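
As an illustration of what such a callback could look like, here is a minimal sketch that re-reads a PEM file on every call. The helper `make_file_refresher` and the file name `root_ca.pem` are hypothetical, and reading from a local file is only one possible source (the PR description mentions a trusted certificate store or governor instead).

```python
import tempfile
from pathlib import Path
from typing import Callable


def make_file_refresher(path: Path) -> Callable[[], bytes]:
    """Build a zero-argument callback that returns the file's current contents."""

    def refresh() -> bytes:
        # Read on every call so a rotated certificate is seen immediately.
        return path.read_bytes()

    return refresh


with tempfile.TemporaryDirectory() as tmp:
    ca_path = Path(tmp) / "root_ca.pem"
    ca_path.write_bytes(b"PEM-v1")          # placeholder bytes, not a real certificate
    refresher = make_file_refresher(ca_path)

    before = refresher()
    ca_path.write_bytes(b"PEM-v2")          # simulate a certificate rotation
    after = refresher()
```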

Collaborator

I have a sense of what the PR is about, but not quite at the implementation level. Could you point me to the code for this callback? (Since you mention it is a function passed as an argument, it is presumably defined somewhere?)

I see room for design refinement, such as reusing the openfl.callbacks API, or always passing a callback instead of supplying a root CA...

Contributor Author

This callback is useful only where the root CA changes (as in a migrated enclavized node). There is no immediate example in openfl (the certificates are fixed in the current examples); however, a usage example can be found here (closed source: https://github.com/intel-innersource/frameworks.ai.openfl.openfl-security/pull/849/files).
Regarding the design refinement:

  1. IMO the communication layer should be kept as separate as possible from the openfl layer. It's true that we currently use gRPC, which is tightly coupled with the openfl implementation, but that won't necessarily be the case in the future. For example, we could have an abstract definition of the communication layer with multiple implementations (UDP, TLS, RA-TLS, gRPC, REST, etc.). Hence I don't think we should mix openfl interfaces (such as the callbacks) with the communication layer: it may create a cyclic dependency and make the two harder to separate in the future.
  2. Regarding always using the callback: a root CA generally does not change frequently (here it may change only because of self-signing and the enclavized components; the root CA is simply a chain of the different clients' certs). Hence I think it is more intuitive, from the library's perspective, to provide a fixed root CA and let advanced users define a specific callback if required.

Having said all that, the approach you suggested is valid, and we can follow it. However, at this stage we should keep things as simple as possible and refine the implementation, if needed, as we move forward.
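
The two design points above can be sketched together. This is a hypothetical layout, not existing openfl code: `Transport`, `GrpcTransport`, `RestTransport`, and `root_ca_refresher` are illustrative names. The transport base class takes a fixed root CA by default and an optional plain callable, so it needs no import from the openfl callbacks layer.

```python
from abc import ABC, abstractmethod
from typing import Callable, Optional


class Transport(ABC):
    """Hypothetical transport interface with no dependency on FL-level callbacks."""

    def __init__(
        self,
        root_ca: bytes,
        root_ca_refresher: Optional[Callable[[], bytes]] = None,
    ):
        self._root_ca = root_ca
        self._root_ca_refresher = root_ca_refresher

    def trusted_root_ca(self) -> bytes:
        """Return the root CA to trust, refreshing it when a callback was given."""
        if self._root_ca_refresher is not None:
            self._root_ca = self._root_ca_refresher()
        return self._root_ca

    @abstractmethod
    def name(self) -> str:
        """Identify the concrete transport implementation."""


class GrpcTransport(Transport):
    def name(self) -> str:
        return "grpc"


class RestTransport(Transport):
    def name(self) -> str:
        return "rest"


# Default case: a fixed root CA and no callback.
fixed = GrpcTransport(b"fixed-ca")

# Advanced case: the caller supplies a callback that yields the rotated root CA.
rotating = RestTransport(b"old-ca", root_ca_refresher=lambda: b"new-ca")
```

Keeping the refresher a bare callable means any transport implementation can accept it, at the cost of not sharing lifecycle hooks with the higher-level callback framework.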

Collaborator

@teoparvanov teoparvanov Jan 28, 2025

AFAIU, the openfl.callbacks framework mainly targets user-defined actions that can be "hooked" at specific stages of the FL plan execution. The callback mentioned here sits at the lower, transport-protocol level. We can consider re-aligning this later, but I suggest proceeding with @yontyon's suggested approach for now.

@MasterSkepticista
Collaborator

MasterSkepticista commented Jan 27, 2025

P.S.: Please shrink the PR title before merging. Echoing @teoparvanov's comment, please use the description box for details on what/why/how.

@yontyon yontyon force-pushed the ybuchnik/sync-agg-with-latest-clients-certs branch from 1f0432c to 98af260 Compare January 27, 2025 19:14
@yontyon yontyon changed the title Sync the aggregator with the collaborators certs on every new TLS handshake Introducing a method for re-fetching the root ca in the aggregator Jan 27, 2025
Signed-off-by: Buchnik, Yehonatan <[email protected]>
Signed-off-by: Buchnik, Yehonatan <[email protected]>
@yontyon yontyon force-pushed the ybuchnik/sync-agg-with-latest-clients-certs branch from 98af260 to 610ef26 Compare January 28, 2025 10:25
@teoparvanov teoparvanov changed the title Introducing a method for re-fetching the root ca in the aggregator Method for re-fetching the root ca in the aggregator Jan 28, 2025
@teoparvanov teoparvanov changed the title Method for re-fetching the root ca in the aggregator A callback for re-fetching the root ca in the aggregator Jan 28, 2025
@teoparvanov teoparvanov merged commit 7f90f42 into securefederatedai:develop Jan 28, 2025
22 checks passed
3 participants