-
Notifications
You must be signed in to change notification settings - Fork 38
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add KEP for AddOnPlacementScoreGenerator #93
base: main
Are you sure you want to change the base?
Add KEP for AddOnPlacementScoreGenerator #93
Conversation
f9cf88f
to
8031826
Compare
addOnPlacementSelector: | ||
name: latency | ||
namespace: cluster1 | ||
latencies: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My question is, should this field latencies
be a more general field? and how could a user define a URL here, what's the requirement for the URL and the output?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey @haoqing0110 yes, other than more generic to me this would be "any string" and even we could have more than one, something like this for example:
apiVersion: cluster.open-cluster-management.io/v1alpha1
kind: AddOnPlacementScoreGenerator
metadata:
name: cluster1-generator
namespace: cluster1
spec:
addOnPlacementSelector:
name: latency
namespace: cluster1
latencies:
- name: redhat-com-avgLatency
url: https://redhat.com
runs: 2
waitBetweenRuns: 10s
- name: google-com-avgLatency
url: https://google.com
- name: linux-com-avgLatency
url: https://linux.com
energy:
- name: some-name
someValue: some-value
Then, the controller interested in latencies will consume latencies
, the controller interested in energy
will consume energy
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the explanation. This AddOnPlacementScoreGenerator
sounds like a configuration for the Addon controller, and to generate different scores, developers still need to develop their own addon controller, correct? Sounds similar to the idea of https://github.com/open-cluster-management-io/enhancements/tree/main/enhancements/sig-architecture/58-addon-configuration, not sure if this can satisfy the motivation of this proposal, and this proposal is like to propose a new addon configuration called AddOnPlacementScoreGenerator
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And I'm wondering could the controller to consume latencies
and energy
could be a unified controller, it can consume different kind of data source, eg, an URL https://redhat.com
or some host path /some/path
, or some other things. And this API just defines the supported data source types.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cc @qiujian16 , want to know your suggestions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@haoqing0110 the use cases defined by 58-addon-configuration would work for us, I have checked the examples and the example with ManagedProxyConfiguration
is very similar to what we want.
And answering your second comment, yes, the same controller could use latencies
and energy
data.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is more like a cluster scoped resource that applies to all clusters, it is less likely that we want different configs for different clusters. Also how could we parse the result of the http responses? It might be straightforward for latency. But if we try to fetch data from a metrics server or 3rd party services to generate the score, the returned output varies and we might need a way to define how to parse the result.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey @qiujian16,
I think that for latencies global object or per-cluster object would both work. In terms of data, I was thinking of an unstructured structure where people developing custom controllers with the addon framework will consume the data the way they need it. I don't think this should be a controller taking care of multiple data sources, instead a single controller taking care of a data source, so if anything were wrong with the data itself it wouldn't affect the other add-ons.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 on Mario's approach on a single controller taking care of a single data source type.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To my understanding, in this API we should define
- how to get the raw data. It can not be only latency. It probably could be a field
type
and latency could be a certain type to get raw data. - how to convent raw data to the score
Signed-off-by: Mario Vazquez <[email protected]>
8031826
to
2bf05f9
Compare
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: mvazquezc The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
put some comment, but I haven't thought through the whole idea yet. Will take some time this week and add more comments.
name: cluster1-generator | ||
namespace: cluster1 | ||
spec: | ||
addOnPlacementSelector: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this actually the name of the score?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not really, it's the name of the AddOnPlacementScore
. This selector would be used by the controller generating the AddOnPlacementScore
to get the AddOnPlacementScoreGenerator
that applies to this specific AddOnPlacementScore
.
|
||
~~~yaml | ||
apiVersion: cluster.open-cluster-management.io/v1alpha1 | ||
kind: AddOnPlacementScoreGenerator |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe AddOnPlacementScoreDataSource?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
wfm
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To the comment below:
- I don't think we want to create types... I was thinking of an unstructured data type where the controller consuming the API should take care of reading the fields it expects to be present in order to generate the required scores.
- Controller should take care of that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would not suggest unstructured type, it will lose the schema validation and make the API unbounded.
addOnPlacementSelector: | ||
name: latency | ||
namespace: cluster1 | ||
latencies: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To my understanding, in this API we should define
- how to get the raw data. It can not be only latency. It probably could be a field
type
and latency could be a certain type to get raw data. - how to convent raw data to the score
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am still not convinced that this should be put in the API repo. The major uncertainty is how this API can be evolved and consumed by multiple score providers. It seems to me a new score provider has to update this API or build their own CRDs to config how score is generated. It does not seems change the fact that AddOnPlacementScore is the only integration point. But I am 100% ok if we create a separated repo and maintain this API and controller there.
|
||
~~~yaml | ||
apiVersion: cluster.open-cluster-management.io/v1alpha1 | ||
kind: AddOnPlacementScoreGenerator |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would not suggest unstructured type, it will lose the schema validation and make the API unbounded.
I'm not sure I follow you. The idea behind this CRD is to allow the customization of score generations by introducing a new API that can be consumed by the addon controllers, I don't think we want to have a dedicated controller for this API. Multiple score providers will be required to read this API in case they want to get score-generation customizations. In case these score providers don't need customizations they're fine not reading this CRD. To me, this API would be like having your config in a ConfigMap and reading it upon controller start. |
I think I understand the intention, but I do not see how it can be achieved when a new provider comes without adding additional field in this API. For example, if I have a new provider comes and allow user to configure it to score the cluster based on resource usage, I have to update the API spec or introduce a new API to config. I do not see how you can define a common api schema that can adapt to different providers. |
That's why I think it should be unstructured, so anyone can put their customization data without having to change the API. The controllers are responsible for reading their data and making sure that in case of misconfiguration, they can gracefully handle it. |
if it is unstructured, why we need to define an API schema instead of directly using a configmap :)? I think we need to at least define some general concept in this API rather than arbitrary type...This is why I am struggling with how this API can provide commonality in this scenario. |
It could be done via ConfigMaps, we won't be able to leverage things defined in the KEP like I cannot think of a common structured API that can be leveraged without having to update API types for every new provider. Requirements for different data providers may be very different and as such, hard to put in common. |
As discussed at the community meeting on April 6th, this is the KEP for adding the
AddOnPlacementScoreGenerator
API.