docs(MADR): resource identifier format #12756
base: master
Signed-off-by: Ilya Lobkov <[email protected]>
| VirtualHost | legacy listeners - `<kuma.io/service>`<br>new outbounds - `<mesh>_<name>_<namespace>_<zone>_<short-name>_<port>` | Mesh*Service (with sectionName to select port) |
| Inbound Cluster | `localhost:<port>` | Dataplane (with sectionName to select port) |
| Outbound Cluster | legacy clusters - `<kuma.io/service>-hash(dst.tags)`<br>legacy clusters cross-mesh - `<kuma.io/service>-hash(dst.tags)_<mesh>`<br>new clusters - `<mesh>_<name>_<namespace>_<zone>_<short-name>_<port>` | Mesh*Service (with sectionName to select port) |
| Route | Routes are set on Listener on VirtualHost.<br>On inbound - `inbound:<kuma.io/service>`<br>On outbound - `<hash_sha256([]Match{...})>` | Correlates with a set of MeshHTTPRoutes |

What about the `matches` section in the new inbound policies API? We might not have a route as a resource, but create the route from a policy instead.

Even for outbound routes it says "Correlates with a set of MeshHTTPRoutes", so it's not really clear how to name the route without hashing.
There is an identifier format from Amazon called [ARN](https://docs.aws.amazon.com/IAM/latest/UserGuide/reference-arns.html). We can adopt a similar approach, but using `_`:

```
kri_<mesh>_<zone>_<namespace>_<resource-type>_<resource-name>_<section-name>
```

Did we consider having some placeholder for missing values, to avoid multiple consecutive `_`, which can be hard to read?

You need to be explicit about resource-type. Is it the plural/camelCase...?

Why not have the resource type before the mesh or at least before the zone?

> You need to be explicit about resource-type. Is it the plural/camelCase...?

It should be a lowercased singular name, as we use in `kumactl`, i.e. `meshservice` or `meshtimeout`.

> Why not have the resource type before the mesh or at least before the zone?

I kind of like how `<resource-type>` stands next to `<resource-name>`, i.e. `kri_default___meshservice_backend`. Type and name are always present, and I think that makes it easier to see what the identifier is referring to. Compare with `kri_meshservice_default__backend`: you might think for a second that the meshservice is called `default`.
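
To make the field order and the handling of empty fields concrete, here is a minimal Python sketch of formatting and parsing the proposed layout. `KRI`, `format_kri` and `parse_kri` are hypothetical names, not Kuma APIs. The sketch assumes that no field contains `_`, and that the trailing delimiter is dropped when the section name is empty (as in the `kri_default___meshservice_backend` example); neither point is settled in this thread.

```python
from typing import NamedTuple


class KRI(NamedTuple):
    """Hypothetical container for the six fields of the proposed identifier."""

    mesh: str
    zone: str
    namespace: str
    resource_type: str  # lowercased singular, e.g. "meshservice"
    resource_name: str
    section_name: str = ""


def format_kri(kri: KRI) -> str:
    # Assumption: fields never contain the delimiter itself.
    for value in kri:
        if "_" in value:
            raise ValueError(f"field {value!r} contains the delimiter '_'")
    if not kri.resource_type or not kri.resource_name:
        raise ValueError("resource type and resource name are always required")
    joined = "_".join(("kri", *kri))
    # Assumption: drop the trailing delimiter when there is no section name.
    return joined if kri.section_name else joined[:-1]


def parse_kri(identifier: str) -> KRI:
    prefix, *fields = identifier.split("_")
    if prefix != "kri":
        raise ValueError("identifier must start with 'kri'")
    if len(fields) == 5:  # section name omitted
        fields.append("")
    if len(fields) != 6:
        raise ValueError(f"expected 5 or 6 fields, got {len(fields)}")
    return KRI(*fields)
```

With these assumptions, `format_kri(KRI("default", "", "", "meshservice", "backend"))` yields the identifier from the comment above, and `parse_kri` reverses it. Splitting on `_` only works if mesh, zone, namespace and resource names are validated to exclude `_`.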

> Did we consider having some placeholder for missing values, to avoid multiple consecutive `_`, which can be hard to read?

What would you use as a placeholder?

> Why not have the resource type before the mesh or at least before the zone?

I kinda agree with this. If the resource-type comes before anything else, then you can immediately tell whether to expect, say, an empty mesh in the case of the kri pointing to a resource type that doesn't have a mesh.
#### [Issue #12093](https://github.com/kumahq/kuma/issues/12093): xds configs, outbound listeners should use the clustername instead of an IP/port combo

We name outbounds like `outbound:10.43.205.116:6379` where IP address doesn't give any useful information.

Even more so when there are multiple IPs, right?
Co-authored-by: Charly Molter <[email protected]> Signed-off-by: Ilya Lobkov <[email protected]>

Great stuff
Also, there was [work](https://docs.google.com/document/d/1OIZK82Tr-4El2FfdlBn7WNRZ7FatkTuEcZKH0FlSTMA/edit?tab=t.0#heading=h.n6cmlf1eel2z) related to Envoy cluster name unification, but it's not finished.
Discoveries in this work helped me to fill the tables.

There are no restrictions on the name format from Envoy's side.

Even in length?

Envoy doesn't specify a length limit. I tried to create a cluster with the maximum expected length of a resource identifier:

`253 (name) + 253 (zone) + 63 (mesh) + 63 (namespace) + 15 (sectionName) + 30 (resourcetype) + 3 (kri) + 6 (_) = 686`

and it worked as expected.
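
The arithmetic can be double-checked with a few lines of Python. This is only a sketch of the sum quoted in the comment; the dictionary keys mirror the labels used there, and `max_identifier_length` is a hypothetical helper name.

```python
# Per-field maximum lengths as quoted in the comment.
MAX_FIELD_LENGTH = {
    "name": 253,
    "zone": 253,
    "mesh": 63,
    "namespace": 63,
    "sectionName": 15,
    "resourcetype": 30,
}
PREFIX = "kri"
DELIMITER_COUNT = 6  # one "_" before each of the six fields


def max_identifier_length() -> int:
    """Worst-case length of kri_<mesh>_<zone>_<namespace>_<type>_<name>_<section>."""
    return sum(MAX_FIELD_LENGTH.values()) + len(PREFIX) + DELIMITER_COUNT


print(max_identifier_length())  # 686
```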
| Inbound Listener | `inbound:10.43.205.116:8080`<br>`inbound:[2001:0db8:85a3:0000:0000:8a2e:0370:7334]:8080` | Dataplane (with sectionName to select port) |
| Outbound Listener | `outbound:10.43.205.116:8080`<br>`outbound:[2001:0db8:85a3:0000:0000:8a2e:0370:7334]:8080` | Mesh*Service (with sectionName to select port) |
| VirtualHost | legacy listeners - `<kuma.io/service>`<br>new outbounds - `<mesh>_<name>_<namespace>_<zone>_<short-name>_<port>` | Mesh*Service (with sectionName to select port) |
| Inbound Cluster | `localhost:<port>` | Dataplane (with sectionName to select port) |

> local envoy clusters should have a better name than localhost_

Do we still use `localhost`?

Yes, the function `GetLocalClusterName` is used in multiple places:
`func GetLocalClusterName(port uint32) string {`
| | Name | Correlated Resources |
|---|---|---|
| Cluster | `meshpassthrough_<protocol>_<match-value>_<port>`<br>when `<port> == 0` Kuma sets port equal to `*`<br>`match-value = CIDR \| IP \| Domain`<br>`CIDR = i.e. "192.0.2.0/24" or "2001:db8::/32"`<br>`IP = i.e. "192.0.2.1", or "2001:db8::68", or "::ffff:192.0.2.1"`<br>`Domain = <dns-name> \| *.<dns-name>`<br>`dns-name = ^([a-zA-Z0-9_]{1}[a-zA-Z0-9_-]{0,62}){1}(\.[a-zA-Z0-9_]{1}[a-zA-Z0-9_-]{0,62})*[\._]?$`<br> | – |

> passthroughMode: (Optional) Defines behaviour for handling traffic. Allowed values: All, None and Matched. Default: None

What should the identifier be if the mode is `All` or `None`?

Our previous default-allow in/outbound cluster names are:

- `inbound:passthrough:ipv4`
- `inbound:passthrough:ipv6`
- `outbound:passthrough:ipv4`
- `outbound:passthrough:ipv6`

These are not about the MeshPassthrough resource, so we just keep using them, right?

Yeah, we are actually removing them when MeshPassthrough is used:
`func removeDefaultPassthroughCluster(rs *core_xds.ResourceSet) {`
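
As an illustration of the `meshpassthrough_<protocol>_<match-value>_<port>` rule from the table (including the `<port> == 0` case), here is a minimal Python sketch. It is not Kuma's implementation: `match_kind` and `passthrough_cluster_name` are hypothetical helpers, the protocol strings are example values, and the classification order (IP, then CIDR, then Domain) is an assumption. Only the `dns-name` regex is taken verbatim from the table.

```python
import ipaddress
import re

# dns-name pattern copied from the table row.
DNS_NAME = re.compile(
    r"^([a-zA-Z0-9_]{1}[a-zA-Z0-9_-]{0,62}){1}"
    r"(\.[a-zA-Z0-9_]{1}[a-zA-Z0-9_-]{0,62})*[\._]?$"
)


def match_kind(value: str) -> str:
    """Classify a match value as "IP", "CIDR" or "Domain"; raise if it is none of them."""
    try:
        ipaddress.ip_address(value)
        return "IP"
    except ValueError:
        pass
    if "/" in value:
        try:
            ipaddress.ip_network(value, strict=False)
            return "CIDR"
        except ValueError:
            pass
    # Domain = <dns-name> | *.<dns-name>
    dns_name = value[2:] if value.startswith("*.") else value
    if DNS_NAME.match(dns_name):
        return "Domain"
    raise ValueError(f"{value!r} is not a CIDR, IP or domain")


def passthrough_cluster_name(protocol: str, match_value: str, port: int) -> str:
    match_kind(match_value)  # validate only; the kind is not part of the name
    port_part = "*" if port == 0 else str(port)
    return f"meshpassthrough_{protocol}_{match_value}_{port_part}"


print(passthrough_cluster_name("http", "192.0.2.0/24", 0))
print(passthrough_cluster_name("tcp", "*.example.com", 443))
```

Note that the resulting names can contain `/`, `:` and `*`, so they fall outside the URL-safe charset discussed for the resource identifier.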
### Places to use resource identifier

#### URL path

Could it be that the resource identifier is also used in a URL search query, i.e. for filtering?

As it's not a hard requirement, I added a note on what the resource identifier and delimiter charsets would look like if we wanted to support URL queries:

`resource-identifier = *(ALPHA / DIGIT / "-" / "." / "_" / "~" )`
`delimiter = "_" / "~"`

They're significantly smaller than those without query support. But the good news is that if we go with `_`, the resource identifier can be used in a query, so I added this to the Pros list.
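
A quick way to check a candidate identifier against that charset is a regular expression. The following is a minimal Python sketch, with `is_query_safe` as a hypothetical helper name; it encodes only the `resource-identifier` rule quoted above, nothing about field structure.

```python
import re

# resource-identifier = *(ALPHA / DIGIT / "-" / "." / "_" / "~")
RESOURCE_IDENTIFIER = re.compile(r"[A-Za-z0-9\-._~]*")

# delimiter = "_" / "~"
DELIMITERS = {"_", "~"}


def is_query_safe(identifier: str) -> bool:
    """True if every character belongs to the query-safe charset."""
    return RESOURCE_IDENTIFIER.fullmatch(identifier) is not None


print(is_query_safe("kri_default___meshservice_backend"))  # True
print(is_query_safe("outbound:10.43.205.116:6379"))        # False, ":" is not allowed
```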

Sounds good, thank you 🙂

Maybe I misunderstood the conversation here, but wouldn't it be the case that if we were using these in a URL anywhere, we would URL-encode them first anyway? I.e. any non-URL-safe chars would be `%`-ified?

(P.S. One small benefit of not having non-URL-safe characters is that it keeps the identifier "pretty", i.e. more recognisable. So it's a benefit, but not a super important one, I would say.)

Actually, I guess if one of the primary use cases is for people to type these to get things (rather than usage within the GUI), we don't want them to have to URL-encode things manually.

I kinda swung back and forth on my opinion here, sorry for the noise! 😅 Please ignore me!

Still a good point @johncowen, but as you said, I also think it would generally be better not to use any non-URL-safe chars.

Also, depending on what people/tools are using, there might be a difference in which chars are encoded (i.e. `encodeURI` vs `encodeURIComponent`):

`encodeURI("http://localhost:1234?filter[foo&bar]=baz") // -> 'http://localhost:1234?filter%5Bfoo&bar%5D=baz'`
`encodeURIComponent("filter[foo&bar]=baz") // -> 'filter%5Bfoo%26bar%5D%3Dbaz'`

> encodeURI() escapes all characters except:
>
> A–Z a–z 0–9 - _ . ! ~ * ' ( )
> ; / ? : @ & = + $ , #
>
> The characters on the second line are characters that may be part of the URI syntax, and are only escaped by encodeURIComponent(). Both encodeURI() and encodeURIComponent() do not encode the characters -.!~*'(), known as "unreserved marks", which do not have a reserved purpose but are allowed in a URI "as is". (See RFC2396)

> encodeURIComponent() uses the same encoding algorithm as described in encodeURI(). It escapes all characters except:
>
> A–Z a–z 0–9 - _ . ! ~ * ' ( )
>
> Compared to encodeURI(), encodeURIComponent() escapes a larger set of characters. Use encodeURIComponent() on user-entered fields from forms POST'd to the server. This will encode & symbols that may inadvertently be generated during data entry for character references or other characters that require encoding/decoding. For example, if a user writes Jack & Jill, without encodeURIComponent(), the ampersand could be interpreted on the server as the start of a new field and jeopardize the integrity of the data.
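
For non-JS tooling, the same two behaviours can be approximated in Python with `urllib.parse.quote`, which never escapes letters, digits and `_.-~`. This is a rough sketch under that assumption, not an exact port: `encode_uri` and `encode_uri_component` are hypothetical helpers, and edge cases such as lone surrogates behave differently from JavaScript.

```python
from urllib.parse import quote

# quote() always leaves letters, digits and "_.-~" untouched, so only the
# remaining characters from the two lists above need to be marked as safe.
URI_COMPONENT_SAFE = "!*'()"
URI_SAFE = URI_COMPONENT_SAFE + ";/?:@&=+$,#"


def encode_uri_component(value: str) -> str:
    """Approximates JavaScript's encodeURIComponent()."""
    return quote(value, safe=URI_COMPONENT_SAFE)


def encode_uri(value: str) -> str:
    """Approximates JavaScript's encodeURI()."""
    return quote(value, safe=URI_SAFE)


print(encode_uri("http://localhost:1234?filter[foo&bar]=baz"))
# http://localhost:1234?filter%5Bfoo&bar%5D=baz
print(encode_uri_component("filter[foo&bar]=baz"))
# filter%5Bfoo%26bar%5D%3Dbaz
```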

Yeah, gotcha. A couple of follow-ups:

I'd say people should always use `encodeURIComponent`, otherwise they are "holding it wrong" and should change their implementation. I suppose this is my point about expecting people to always do this correctly: there will always be cases where some folks are "holding it wrong". To be fair, all "values" used in a URL should be passed through `encodeURIComponent` (or a non-JS equivalent) anyway, whether they are expected to be safe or not. So there's also something to be said for not worrying about URL safety: if someone isn't using a correct encoder, they should be. But I'm thinking more about informal "I just want to curl the thing to get a response in my terminal" type of usage. There's a benefit in not forcing people to encode the thing, which they would have to do if, for example, we chose to use `/`.

So if we want URL-safe, if only for reasons of "don't make it hard for people to just curl the thing", then according to the MADR that leaves us with:

`delimiter = "_" / "~"`

And it sounds like we've landed on `_`, which is URL-safe, which is super duper. It's probably a good idea to note that I have seen instances of people using these in hostnames, even though a `_` shouldn't be used in hostnames.

@lobkovilya I'm not sure if we validate things like mesh names and zone names to not contain `_`. I might be misremembering, but I think that at least at one point this was possible. Is it definitely not possible to have a mesh/zone name with a `_` in it now?

Just a little side note that has just occurred to me: I'm kinda glad we still have this last "safe character" available, `~`, which in a past life has been super useful to have as a usable/meaningful character (i.e. similar to `~/johncowen`), which kinda means "expand `~` to a common string we know about". You never know, we might hit a thing at some point where we need the same "trick".
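
To illustrate the "just curl the thing" point, here is a small Python sketch that builds the Inspect API URL for an identifier. The `/_rules/<identifier>` path and port 5681 come from the PR description; the `http://localhost` base, the `rules_url` helper and the `/`-delimited identifier are hypothetical and only there for comparison.

```python
from urllib.parse import quote


def rules_url(identifier: str, base: str = "http://localhost:5681") -> str:
    """Build the Inspect API URL, percent-encoding the identifier as one path segment."""
    return f"{base}/_rules/{quote(identifier, safe='')}"


# With "_" as the delimiter, the identifier can be typed as-is.
print(rules_url("kri_default___meshservice_backend"))
# http://localhost:5681/_rules/kri_default___meshservice_backend

# A hypothetical "/"-delimited identifier would need manual encoding.
print(rules_url("kri/default///meshservice/backend"))
# http://localhost:5681/_rules/kri%2Fdefault%2F%2F%2Fmeshservice%2Fbackend
```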
…o docs/madr-70
Maybe not part of the MADR and might just be an informal example, but!
When we eventually come to define the endpoint should we include the fact that we are specifically requesting via a Super edge case, but who knows if we are ever gonna change the way we define identifiers. but maybe I'm over thinking it 🤷
Motivation
The goal is to improve the Inspect API and introduce an identifier as part of the URL path, i.e. `:5681/_rules/<identifier>`. See the discussion.

Better to review the rendered version as it contains tables.