Fix description vs document terminology #4100

Open
wants to merge 6 commits into
base: v3.0.4-dev

mikekistler
Contributor

This PR attempts to fully distinguish "description" and "document" by using "description" consistently for the concept of an API description in the OpenAPI format and "document" to only refer to structural features.

I also revised the "OpenAPI Description Structure" section a bit in part to address the above and also in an attempt to simplify and clarify that section.

Some specifics:

  • Every instance of "OpenAPI document" and "OpenAPI description document" has been eliminated. Most are changed to "OpenAPI Description".
  • All occurrences of "OpenAPI Description" capitalize the "D" in "Description".
  • I changed some instances of "OpenAPI Description" to "OAD", but there are still 33 occurrences of "OpenAPI Description" and at least some of these might be good candidates to change to "OAD".
  • the previous text defined "self-contained OpenAPI document" -- I changed that to "single-document OpenAPI Description" which I think is clearer and simpler.
  • the previous text defined "syntactically complete OpenAPI document" -- I removed that as I believe the other wording changes I made render it unnecessary.

I think there are some good changes here but I'm definitely open to feedback / suggestions on how to make it even better.

Contributor

@ralfhandl ralfhandl left a comment

In general ok, minor nits.

versions/3.0.4.md Outdated Show resolved Hide resolved
@ralfhandl ralfhandl requested a review from a team September 17, 2024 12:30
@ralfhandl ralfhandl requested a review from a team September 17, 2024 16:15
@handrews handrews added the "editorial" (Wording and stylistic issues) and "clarification" (requests to clarify, but not change, part of the spec) labels Sep 18, 2024
@handrews handrews added this to the v3.0.4 milestone Sep 18, 2024
Member

@handrews handrews left a comment

@mikekistler I really appreciate your work here- I am surprised at how many things it has flushed out, and however we resolve those differing views the spec will be much stronger!

I feel a bit bad that I did not advise you to start with 3.1.1, which is what I did for every document parsing and referencing change (all of the others started in 3.0.4). 3.1 is much more complex, so I worked out what made sense there, and then backported the results to 3.0.

This is because 3.0 is, as I think @darrelmiller described, the "uncanny valley" between the 2.0 paradigm (the OAD should function as a single JSON/YAML document even if it is not) and the 3.1 paradigm (it's not, in general, possible to correctly parse only part of a document, and shared documents are expected to be "syntactically complete" but just have components and not paths or webhooks).

Most of the "this needs to be 'document'" stuff is related to supporting components-only "syntactically complete" documents in 3.1. We then want the wording to be as consistent as possible between 3.0 and 3.1, without accidentally imposing 3.1 requirements on 3.0. This is tricky, and is why some of the PRs in this area got re-written multiple times.

I think I'd recommend attempting a PR on 3.1.1 before revisiting this. I think it will clarify a lot of things for you. And me — clearly there is more work to do here than I realized, and I'm really glad you are surfacing it!

versions/3.0.4.md Outdated Show resolved Hide resolved
* As a syntactically complete OpenAPI Description document
* As the Object type implied by its parent Object within the document
* The root object of the entry document is interpreted as an OpenAPI Object
* As the Object type implied by its parent Object within the OpenAPI Description
Member

This is another one where it's about the document. "parent Object within the OAD" is ambiguous- do you mean the literal parent in the JSON/YAML structure or the logical parent which might be a reference depending on how you reached it? This section was intended to clarify that distinction and the problems it can cause.

Here we are talking about the JSON/YAML structure parent.

While the presence of the next bullet point implies that this one has to be about something other than referencing, I would prefer to emphasize the document-oriented nature of the parent/child context.

Contributor

I'm in favour of "OpenAPI Description" here

Member

This is still "OpenAPI Document" for the same reasons I stated previously- it is literally about the parent-child JSON/YAML relationship within the Document, and never about any other possible relationship within the Description. That is the entire purpose of this statement, and without that, the distinction between this bullet point and the next doesn't make any sense.

Contributor Author

The example we discussed in the TDC meeting yesterday that I think swayed the group to conclude that "Description" is correct here is this:

entryDoc.json

{
  "openapi": "3.0.1",
  "info": {
    "title": "Multi-document OAD",
    "version": "1.0.0"
  },
  "paths": {
    "/path1": {
      "get": {
        "parameters": [
          {
            "$ref": "otherDoc.json#/param1"
          }
        ],
...

otherDoc.json

{
  "param1": {
    "name": "param1",
    "in": "query",
    "schema": {
      "type": "string"
    }
  }
}

In this example, the parent object of "param1" in otherDoc.json is just a JSON object -- it implies nothing about how to interpret param1. Rather, the interpretation of "param1" comes from it being referenced within the parameters array of entryDoc.json -- its parent object in the OpenAPI Description.
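
For readers following along, here is a minimal Python sketch of the point being made: the target of the `$ref` is a plain JSON object, and the Object type it is interpreted as is supplied by the reference source. The in-memory `DOCUMENTS` table and the helper names `resolve_pointer` and `resolve_ref` are hypothetical, and the elided remainder of entryDoc.json is left out rather than guessed.

```python
from typing import Any

# Hypothetical in-memory stand-ins for the two files in the example.
DOCUMENTS: dict[str, Any] = {
    "entryDoc.json": {
        "openapi": "3.0.1",
        "info": {"title": "Multi-document OAD", "version": "1.0.0"},
        "paths": {
            "/path1": {
                "get": {
                    "parameters": [{"$ref": "otherDoc.json#/param1"}],
                    # the rest of the operation is elided in the original example
                }
            }
        },
    },
    "otherDoc.json": {
        "param1": {"name": "param1", "in": "query", "schema": {"type": "string"}}
    },
}


def resolve_pointer(document: Any, pointer: str) -> Any:
    """Resolve an RFC 6901 JSON Pointer (given without the leading '#')."""
    target = document
    tokens = pointer.split("/")[1:] if pointer else []
    for token in tokens:
        token = token.replace("~1", "/").replace("~0", "~")
        target = target[int(token)] if isinstance(target, list) else target[token]
    return target


def resolve_ref(ref: str, expected_type: str) -> tuple[str, Any]:
    """Return (object type, target); the type comes from the reference source."""
    location, _, fragment = ref.partition("#")
    return expected_type, resolve_pointer(DOCUMENTS[location], fragment)


ref = DOCUMENTS["entryDoc.json"]["paths"]["/path1"]["get"]["parameters"][0]["$ref"]
# The parameters array is what tells us to expect a Parameter Object.
object_type, param = resolve_ref(ref, expected_type="Parameter Object")
```

Nothing inside otherDoc.json contributes to `object_type`; it is passed in entirely by the caller that found the reference.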

Member

@handrews handrews Oct 4, 2024

@mikekistler There are three bullet points here, which I will list with numbers even though they are not numbered in the text:

  1. The root object of the entry document is interpreted as an OpenAPI Object
  2. As the Object type implied by its parent Object within the OpenAPI Document
  3. As a reference target, with the Object type matching the reference source's context

Your example is point 3. Point 2 is also important - it's how parsing works before you resolve references. All three points are critical to properly handling OADs.

  1. You are told the entry document, and you start processing using context 1
  2. You parse that document, using the within-document parent-child relationships (context 2)
  3. You resolve references, evaluating reference targets using context 3

If the reference crosses to a different document, then, depending on how you wrote your parser (e.g. how Swagger's 3.0 parsing works vs how @darrelmiller says Microsoft's 3.0 parsing works), you can parse the same JSON/YAML object by both context 2 (which Microsoft uses even with 3.0 according to Darrel) and context 3 (which all implementations must use for proper reference resolution).

That is why the following paragraph exists:

If the same JSON/YAML object is parsed multiple times and the respective contexts require it to be parsed as different Object types, the resulting behavior is implementation defined, and MAY be treated as an error if detected. An example would be referencing an empty Schema Object under #/components/schemas where a Path Item Object is expected, as an empty object is valid for both types. For maximum interoperability, it is RECOMMENDED that OpenAPI Description authors avoid such scenarios.

Please stop trying to change context 2 to be the same as context 3. The entire point of the "Structural Interoperability" section is to point out that different implementations use these contexts and they don't always say the same thing.

Your change completely guts this section. Your example is about context 3. But context 2 is also critically important and cannot be removed. You literally cannot parse a document without it. How do you know that the child of the info field is an Info Object? Context 2 is what tells you that. There is no context 3 for that field.

Whenever someone writes a $ref to something that is also part of a well-formed OpenAPI Document, context 2 and 3 both apply and they can be different.

Your example only uses context 3. But that does not invalidate context 2.
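
A rough Python sketch of how contexts 2 and 3 can disagree for the same JSON/YAML object, using the empty-schema example from the quoted paragraph. The `type_from_structure` table covers only a handful of locations and is an illustrative assumption, not a complete map of the OpenAPI Object; `check_reference` is likewise a hypothetical helper.

```python
from typing import Optional


def type_from_structure(pointer: str) -> Optional[str]:
    """Context 2: the Object type implied by the parent within one document.

    Only a few fixed locations are modelled here (assumption for illustration).
    """
    parts = pointer.split("/")[1:]
    if parts == ["info"]:
        return "Info Object"
    if len(parts) == 3 and parts[:2] == ["components", "schemas"]:
        return "Schema Object"
    if len(parts) == 3 and parts[:2] == ["components", "parameters"]:
        return "Parameter Object"
    if len(parts) == 2 and parts[0] == "paths":
        return "Path Item Object"
    return None  # location carries no structural type in this sketch


def check_reference(target_pointer: str, type_from_reference: str) -> str:
    """Compare context 2 with context 3 for one reference target."""
    structural = type_from_structure(target_pointer)
    if structural is None:
        return f"context 3 only: {type_from_reference}"
    if structural != type_from_reference:
        return f"conflict: {structural} vs {type_from_reference}"
    return f"agree: {structural}"


# No reference is involved for the info field, so only context 2 applies.
info_type = type_from_structure("/info")

# A Path Item Object's $ref that points at an empty object under components/schemas.
result = check_reference("/components/schemas/Empty", "Path Item Object")
```

Whether the `conflict` case is reported as an error or silently parsed under one of the two contexts is exactly the implementation-defined behavior the quoted paragraph describes.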

Contributor Author

Hmmm. Okay ... I see your point about the overlap between 2 and 3.

But I still think there is a problem with using "OpenAPI Document" in bullet 2, and I think it is illustrated by the same example as given above.

What is the context for the value of the "schema" field within otherDoc.json? If bullet 2 is about "OpenAPI Document", then it doesn't apply because otherDoc.json is not an "OpenAPI Document" (by the definition we have now added), and bullet 3 also does not apply since this is not a reference target.

If we make bullet 2 be only about OpenAPI documents, then we'll need to add another bullet for documents that are not OpenAPI documents.

Member

Due to the way the document flow was broken by the comment I had missed the third bullet point. This contributed to me misunderstanding what the second point was trying to say.

In the tooling my team builds, when parsing an OpenAPI document that has been referenced, we don't use the context of the reference (context 3) to determine the type of the thing being referenced. If someone attempts to reference a schema in a location that expects a parameter, we give a type mismatch error. We don't attempt to parse the schema as if it was a parameter. We would only do that when pointing to a document that is not an OpenAPI Document.

But as Henry states, this is implementation dependent. Tooling that is building an OpenAPI document editing experience would always use the structural context (context 2) to determine the type of something, whereas a tool that is just consuming an OpenAPI document could easily choose to use context 3 for determining the type of a reference target.

Going back to the debated terminology, the phrase "OpenAPI Description" is problematic because you could argue that the parent object in context 2 is not actually part of the OpenAPI Description. However, if you say "OpenAPI Document" you a) know explicitly which parent you are talking about and b) know that tooling can parse the OpenAPI document and know the type of the parent object.

So, apologies to everyone involved but I am going to reverse my opinion from what I said on Thursday and say this needs to say OpenAPI Document.

Contributor Author

Okay, I can see the argument for this. But then which bullet do you say explains the context used for parsing/interpreting the value of "schema" in otherDoc.json?

versions/3.0.4.md Outdated Show resolved Hide resolved
Comment on lines -386 to +383
| <a name="server-url"></a>url | `string` | **REQUIRED**. A URL to the target host. This URL supports Server Variables and MAY be relative, to indicate that the host location is relative to the location where the OpenAPI document is being served. Variable substitutions will be made when a variable is named in `{`braces`}`. |
| <a name="server-url"></a>url | `string` | **REQUIRED**. A URL to the target host. This URL supports Server Variables and MAY be relative, to indicate that the host location is relative to the location where the entry document of the OpenAPI Description is being served. Variable substitutions will be made when a variable is named in `{`braces`}`. |
Member

I think this might also be document. It doesn't matter for Server Objects under the OpenAPI Object's servers field, and if there are no such Server Objects, then we're basically "pretending" that there is one with a url based on the entry document's location.

But this gets weird when you have a Path Item Object in a referenced document that defines its own servers, or that has Operation Objects that define their own servers (or a Link Object that defines server). I think those ought to be relevant to the location of the document in which they appear.

In "Resolving Implicit Connections", I hand-waved the Servers problem on the grounds that only the entry document's Paths Object is relevant. But with a shared Path Item Object under the Components Object of a components-only "syntactically complete" document, do you look up the default by current document or by entry document? That might actually need clarification.

Contributor

In "Resolving Implicit Connections", I hand-waved the Servers problem on the grounds that only the entry document's Paths Object is relevant. But with a shared Path Item Object under the Components Object of a components-only "syntactically complete" document, do you look up the default by current document or by entry document? That might actually need clarification.

It would seem logical to me that the lookup happens on the entry document.

Member

This ought to follow RFC3986 which would go:

  • RFC3986 §5.1.2 enclosing context (OAS 3.0 has always violated RFC3986 by skipping this)
  • RFC3986 §5.1.3 retrieval URL of the resource, which in this case would be the current document
  • RFC3986 §5.1.4 application-specific default, which could be anything we want

Do we state anywhere how to locate Server Objects from referenced (non-entry) documents that are OpenAPI Documents and therefore might have an OpenAPI Object (and servers field) in the document root that is not the OAD root (which is always the entry document)?
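
To illustrate why the choice of base matters, here is a small Python sketch that resolves the same relative Server Object `url` against two candidate base URIs using the standard library's RFC 3986 style resolution. Both document locations and the relative value `v1` are made up for illustration; the thread does not settle which base is correct.

```python
from urllib.parse import urljoin

# Hypothetical locations, chosen only for illustration.
ENTRY_DOCUMENT_URL = "https://example.com/apis/openapi.json"
CURRENT_DOCUMENT_URL = "https://cdn.example.org/shared/pathitems.json"

# A relative Server Object url found in a Path Item Object of the referenced document.
relative_server_url = "v1"

# Base taken from the entry document of the OAD.
against_entry = urljoin(ENTRY_DOCUMENT_URL, relative_server_url)

# Base taken from the retrieval URL of the document containing the Server Object.
against_current = urljoin(CURRENT_DOCUMENT_URL, relative_server_url)
```

The two results point at different hosts, so a tool that picks one base while the author assumed the other would send requests to the wrong server.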

Contributor

@ralfhandl ralfhandl left a comment

We don't define "OpenAPI document" and have to replace it with either "OpenAPI Description" or "OpenAPI entry document", whichever is meant.

And I prefer not to define "OpenAPI document" because it is halfway between "OpenAPI Description" and "OpenAPI entry document" and a larger "Hamming distance" between two terms of the same specification reduces confusion.

@mikekistler
Contributor Author

Folks, I had a discussion with Darrel Miller on this topic yesterday and wanted to recap some of that here.

@darrelmiller please correct me if I captured any of this incorrectly.

At the beginning of the discussion, Darrel felt that there was significance to the term "OpenAPI Document" and was surprised that it was not defined in the spec (we were looking mostly at the 3.0 spec). Darrel proposed this definition for "OpenAPI Document":

An OpenAPI document is any document that follows the syntax and semantics of the OpenAPI Object of the OpenAPI specification.

So an OAD's entry document (as we define this term in the spec) clearly must be an "OpenAPI Document" by Darrel's definition, but other documents referenced from the entry document may also be "OpenAPI Documents" by Darrel's definition. And from a tooling perspective, Darrel explained the importance of this as follows:

If a reference goes to a fragment in a 10 MB file, that fragment might be just 10 lines, but the tooling needs to read / parse most of the file to find it. For efficiency, the tooling will probably parse the whole file and keep the results in memory so that any other references into that file are handled efficiently.
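
A minimal sketch of that caching behavior, assuming a hypothetical `fetch` callable that maps a URL to raw JSON text; the class name `DocumentCache` and the `shared.json` file are illustrative only and do not describe any particular tool.

```python
import json
from typing import Any, Callable


class DocumentCache:
    """Parse each referenced document once and reuse the parsed result."""

    def __init__(self, fetch: Callable[[str], str]) -> None:
        self._fetch = fetch  # hypothetical loader: URL -> raw text
        self._parsed: dict[str, Any] = {}
        self.parse_count = 0

    def get(self, url: str) -> Any:
        if url not in self._parsed:
            # The whole file is parsed, even if only one fragment is wanted.
            self._parsed[url] = json.loads(self._fetch(url))
            self.parse_count += 1
        return self._parsed[url]

    def resolve(self, ref: str) -> Any:
        url, _, pointer = ref.partition("#")
        target = self.get(url)
        for token in pointer.split("/")[1:]:
            token = token.replace("~1", "/").replace("~0", "~")
            target = target[int(token)] if isinstance(target, list) else target[token]
        return target


FILES = {
    "shared.json": '{"components": {"schemas": {"A": {"type": "string"}, "B": {"type": "integer"}}}}'
}
cache = DocumentCache(FILES.__getitem__)
a = cache.resolve("shared.json#/components/schemas/A")
b = cache.resolve("shared.json#/components/schemas/B")
```

Two references into the same file cost one parse, which is why the question of how the whole file is parsed (and as what) cannot be avoided.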

So that makes sense but then raises the question of how the referenced document is parsed. Is it parsed as a JSON document, or an OpenAPI Document, or a JSON Schema document, or ??. I think Darrel was implying that if the file had "openapi" at the root, it would be parsed as an OpenAPI document.

But now we come to this question: Suppose an entry document contains this $ref:

"$ref": "./CommonTypes.json#/components/schemas/ResponseBody"

If the "CommonTypes.json" file happens to have "openapi" at the root, does that mean that it is parsed as an OpenAPI Document? If so, what happens if the version of OpenAPI indicated in the "openapi" field doesn't match the version in the entry document? What happens if the version in the "openapi" field is bogus, e.g. "42"? What happens if parts of the document other than the referenced fragment fail to parse correctly as an OpenAPI document?
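
These questions can be made concrete with a small classifier. The Python sketch below only names the cases; it does not claim any of them has a specified outcome. The function `classify_document`, its return labels, the simplified version pattern, and the choice to compare only major.minor are all assumptions made for illustration.

```python
import re
from typing import Any


def classify_document(root: Any, entry_version: str) -> str:
    """Heuristically classify a referenced document by its root object."""
    if not isinstance(root, dict) or "openapi" not in root:
        return "fragment"  # no "openapi" field at the root
    version = root["openapi"]
    # Simplified pattern (assumption): major.minor.patch digits only.
    if not isinstance(version, str) or not re.fullmatch(r"\d+\.\d+\.\d+", version):
        return "invalid-version"  # e.g. "42"
    # Assumption: only major.minor must agree with the entry document.
    if version.split(".")[:2] != entry_version.split(".")[:2]:
        return "version-mismatch"
    return "openapi-document"


same_line = classify_document({"openapi": "3.0.3", "components": {}}, "3.0.1")
bogus = classify_document({"openapi": "42"}, "3.0.1")
newer = classify_document({"openapi": "3.1.0"}, "3.0.1")
plain = classify_document({"param1": {}}, "3.0.1")
```

What a tool should do for the `invalid-version` and `version-mismatch` results is the open question; the sketch only shows that a tool has to decide.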

But the real issue comes when we get to 3.1 and schemas can have $id fields. In that world, a $ref can contain a URI. To support this, there is an "out of band" process for tooling to "locate" schemas which are collected into a registry keyed by their "id" so that URI's can then be resolved when parsing the OAD.
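
A simplified Python sketch of such a registry: it walks a parsed document and records every object carrying a string `$id` under its absolute URI. Treating every JSON object as a potential schema is a deliberate simplification (a real implementation would only look in schema positions), and the file URL and schema URIs are invented for the example.

```python
from typing import Any
from urllib.parse import urljoin


def collect_ids(node: Any, base_uri: str, registry: dict[str, Any]) -> None:
    """Register every object with a string "$id" under its absolute URI."""
    if isinstance(node, dict):
        if isinstance(node.get("$id"), str):
            # A relative $id resolves against the enclosing base URI and
            # becomes the base URI for everything nested below it.
            base_uri = urljoin(base_uri, node["$id"])
            registry[base_uri] = node
        for value in node.values():
            collect_ids(value, base_uri, registry)
    elif isinstance(node, list):
        for item in node:
            collect_ids(item, base_uri, registry)


registry: dict[str, Any] = {}
document = {
    "$id": "https://schemas.example.org/common/root",
    "$defs": {"name": {"$id": "name", "type": "string"}},
}
# The retrieval location is hypothetical; it only matters when the root has no $id.
collect_ids(document, "file:///work/CommonTypes.json", registry)
```

After the walk, a `$ref` to either URI can be answered from `registry` without knowing where the document was loaded from.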

So suppose in the "out of band process" there is a set of schemas loaded from an OpenAPI 3.1 Document or JSON Schema document. Parsing of those documents should follow the URI resolution rules, which means they need to resolve relative URIs according to RFC 3986 Section 5. In particular, the Base URI for a schema may come from the "Encapsulating Entity", and relative URIs within this schema are then resolved according to this Base URI.

So far, so good. The schema is entered into the registry and any URI with its "id" gets that schema. BUT, suppose an OAD contains a $ref to this schema by id and in another place it also contains a $ref to this schema by location? Now there can be a problem, because references by location don't necessarily consider the "context" of the fragment -- meaning that the base URI in the encapsulating entity might not be considered, and relative URIs in the schema may be resolved differently than when the schema was referenced by id. If the schema contains a "$id" field, then this is broken, because the same id represents two different schemas.

Maybe everyone else understood this and it is just me that is now coming to this realization.

I'll stop writing here as this is already too long, but I hope this is helpful to moving this discussion along.

@ralfhandl
Contributor

We really should add Darrel's definition of "OpenAPI document" to the Definitions section, and then work through the PR from there.

And start in parallel on a similar PR for 3.1.1 so we can see the necessary differences simultaneously.

@mikekistler Could you please create the second PR? You seem to have dug rather deep into this topic.

@mikekistler
Contributor Author

One new thing to add here. I am far from being an expert on JSON Schema, particularly the latest versions, so I was studying the 2020-12 draft looking for some better understanding, and found Section 7.1 Lexical and Dynamic Scope. I think this is essential reading as it describes the problem we are wrestling with and may offer a way out. I'll snip out here some of the key pieces:

Note that some keywords, such as "$schema", apply to the lexical scope of the entire schema resource ..

Other keywords may take into account the dynamic scope that exists during the evaluation of a schema, typically together with an instance document.

Lexical and dynamic scopes align until a reference keyword is encountered ... A keyword on the far side of that reference that resolves information through the dynamic scope will consider the originating side of the reference to be their dynamic parent, rather than examining the local lexically enclosing parent.

it may be possible to revisit the same lexical scope repeatedly with different dynamic scopes.

Now, it was not clear (to me anyway) whether "$ref" is a keyword that is resolved in lexical scope vs dynamic scope. But if we consider "$ref" to be resolved in dynamic scope, then I think we can treat "$id" as identifying the "lexical scope" of the schema -- which might be resolved to different things in different dynamic scopes. So there is only one schema with this id, but it resolves to different things in different dynamic scopes.

What do folks think of this idea?

@handrews
Member

handrews commented Sep 21, 2024

@mikekistler First, I'm really glad you're getting into this, and that you and @darrelmiller had a great discussion about it. Darrel has heard all of the frustrations I encountered on this during OASComply, and we compared notes about the challenges of implementing 3.1 at some point.

Maybe everyone else understood this and it is just me that is now coming to this realization.

No, I've been driving myself absolutely crazy(-er 🤪 ) trying to get people to understand this for the past several years, but it's really hard to explain until you really try to follow it all through to implement it. Congratulations on being one of the few who have now dug deep enough to get it :-) (I mean that sincerely- it's a lot of obscure detail but it's really important, and I wish I'd been able to communicate it more clearly. I have more than enough data to know that it's not easy to figure out, or else more people would have by now).

An OpenAPI document is any document that follows the syntax and semantics of the OpenAPI Object of the OpenAPI specification.

So an OAD's entry document (as we define this term in the spec) clearly must be an "OpenAPI Document" by Darrel's definition, but other documents referenced from the entry document may also be "OpenAPI Documents" by Darrel's definition.

I would definitely support this, as it defines "OpenAPI Document" the same as what I was calling a "syntactically complete" document, grasping for a way to distinguish it from fragmentary documents (that are not JSON Schema documents).

If the "CommonTypes.json" file happens to have "openapi" at the root, does that mean that it is parsed as an OpenAPI Document? If so, what happens if the version of OpenAPI indicated in the "openapi" field doesn't match the version in the entry document? What happens if the version in the "openapi" field is bogus, e.g. "42"? What happens if parts of the document other than the referenced fragment fail to parse correctly as an OpenAPI document?

I hit every single one of these problems trying to write oascomply. The only reason I haven't pushed harder on getting all of them resolved is that it was hard enough to get attention on the more obvious problems, and until you (Mike) started digging into this it just seemed like people were getting burnt out on my efforts. And so was I, tbh. Some parts of this have been answered, more on that in a bit. Most of the rest could be answered by keeping things analogous to the parts that have been answered.

But the real issue comes when we get to 3.1 and schemas can have $id fields. In that world, a $ref can contain a URI. To support this, there is an "out of band" process for tooling to "locate" schemas which are collected into a registry keyed by their "id" so that URI's can then be resolved when parsing the OAD.

This is all 100% completely addressed by 3.1.1 §4.3.1 "Parsing Documents." This is actually a thoroughly understood and solved problem in JSON Schema, and you can find many implementations of it in the wild.

It does not necessarily have to be out-of-band: 3.1.1 states that if it looks like an OpenAPI Document or a JSON Schema document, you can treat it like one. It also allows for parsing a small-d document based on a $ref to its root, as that provides the type of the whole document (let's call it an OpenAPI Fragment, for example a document consisting of a Path Item Object). 3.1.1 explicitly notes that tools can be configured to know where a URI is located- this is, again, common in JSON Schema and has been for a long time (technically since well before draft-04, but I don't think we got the explanation clear enough until draft-07-ish). It's much more explicit in 2019-09 and 2020-12, where we added a whole section on loading schemas.

So far, so good. The schema is entered into the registry and any URI with its "id" gets that schema. BUT, suppose an OAD contains a $ref to this schema by id and in another place it also contains a $ref to this schema by location?

This is also all well-understood in JSON Schema-land, and is why both JSON Schema and OAS 3.1 require full-document parsing. I'll come back to it after addressing the next few statements:

Now there can be a problem, because references by location don't necessarily consider the "context" of the fragment

This is not true. There is no difference between resolving a fragment in a document based on identity vs location, it works exactly the same way. RFC3986, in fact, only cares about identity. Treating a URI as a URL is just a handy default way to locate an identified document. Whether a given URI can reasonably be treated as a URL is up to the application involved.

meaning that the base URI in the encapsulating entity might not be considered, and relative URIs in the schema may be resolved differently than when the schema was referenced by id. If the schema contains a "$id" field, then this is broken, because the same id represents two different schemas.

It's not broken, it works exactly as intended. This whole "ignore the format of the document" is a thing I've never seen anywhere else that has no relationship to how URIs, media types, and resources are supposed to work, which is for their behavior to be keyed by media type (in the absence of a media type, something like a file extension or content sniffing can be used as a heuristic).

AFAICT, this "ignore the context" thing is completely unique to OAS. Although I might well be wrong, of course. I'm pretty sure it is not something that comes from JSON Reference, which itself does not appear to understand how URIs, media types, and fragments are supposed to work (because it mandates behavior counter to RFC6901 regarding JSON pointers as URI fragments).


Regarding addressing a JSON Schema document (for simplicity- OAS doesn't change things much as so far it doesn't have an $id-equivalent, but now you are starting to see why it needs one in 3.1), there are a few cases:

  • Reference the document by location, and it doesn't contain any $id, so this just works as you'd think
  • Ref by location, $id in the root: In this case, both the location (retrieval URL) and $id are valid URIs for the schema. It's a little unclear if the retrieval URL should be retained for future reference or discarded, so I'd always reference by identity- the whole point is to make your $ref values stable even if the target moves around. Repeatedly ref-ing by location might lose the benefits of the cache.
  • Ref by location, $id in a subschema:
    • As long as your JSON pointer doesn't cross into the $id subschema, you are fine.
    • If your JSON Pointer crosses into a subschema of the subschema containing $id, the behavior is implementation-defined. Don't do this. We were forced to make this implementation-defined because some people didn't want to have to check for an error. I wanted it to be an error.
    • If your JSON Pointer points to the subschema containing the $id, this is essentially the same as the retrieval URL vs identity case- the spec doesn't nail it down, but I'd expect it to work (the spec only says that crossing the $id is implementation-defined). But still, I'd always reference by identity.
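
The three subschema cases above can be told apart mechanically. The Python sketch below walks a by-location JSON Pointer and reports whether it stays clear of an embedded `$id`, lands exactly on the subschema that declares one, or passes through it. The function name, the result labels, and the example schema are hypothetical; an `$id` on the document root is deliberately not treated as a boundary, and every object with an `$id` key is assumed to be a schema.

```python
from typing import Any


def classify_location_ref(document: Any, pointer: str) -> str:
    """Classify a by-location JSON Pointer against embedded "$id" boundaries."""
    node = document
    tokens = pointer.split("/")[1:] if pointer else []
    for index, token in enumerate(tokens):
        token = token.replace("~1", "/").replace("~0", "~")
        node = node[int(token)] if isinstance(node, list) else node[token]
        if isinstance(node, dict) and "$id" in node:
            if index == len(tokens) - 1:
                return "targets-id-schema"  # expected to work, not nailed down
            return "crosses-id-boundary"  # implementation-defined: avoid
    return "no-id-crossed"  # plain location-based reference


schema = {
    "$defs": {
        "plain": {"type": "string"},
        "embedded": {
            "$id": "https://schemas.example.org/embedded",
            "properties": {"x": {"type": "integer"}},
        },
    }
}

fine = classify_location_ref(schema, "/$defs/plain")
on_boundary = classify_location_ref(schema, "/$defs/embedded")
crossing = classify_location_ref(schema, "/$defs/embedded/properties/x")
```

A linter could flag the `crosses-id-boundary` result and suggest referencing the embedded schema by its `$id` instead.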

(that's enough for this comment, I'll probably make a few follow-ups)

@handrews
Member

@mikekistler

Now, it was not clear (to me anyway) whether "$ref" is a keyword that is resolved in lexical scope vs dynamic scope. But if we consider "$ref" to be resolved in dynamic scope, then I think we can treat "$id" as identifying the "lexical scope" of the schema -- which might be resolved to different things in different dynamic scopes. So there is only one schema with this id, but it resolves to different things in different dynamic scopes.

What do folks think of this idea?

The way $id, $anchor, and $ref (any of the $refs) work is 100% governed by RFC 3986 and there is no wiggle room at all in that. URIs are Universal: there aren't any scopes. There are, however, base URIs, which are well-specified (although the OAS kind of munges it by skipping some of the steps- I tried to fix that by not over-specifying it in 3.0.4 and 3.1.1). The only thing a media type can really impact is whether it sets the base URI within the document contents. The other three possible base URI sources are not under the media type's control.

The whole dynamic scope thing has to do with how $dynamicAnchor and $dynamicRef work. TL;DR: $dynamicRef searches the resources (not objects or documents) in the dynamic scope for the most distant (first-parsed that is still in scope) $dynamicAnchor, and sets the base URI of the $dynamicRef value to the absolute URI of that resource. If no distant $dynamicAnchor is found, then the base URI doesn't change and $dynamicRef behaves exactly like $ref.
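
A heavily simplified Python sketch of the lookup described in that TL;DR. The dynamic scope is modelled as a list of resources ordered from first-entered (most distant) to most recent, each with a URI and a set of `$dynamicAnchor` names; this data model and the function name are assumptions for illustration, and other conditions in the JSON Schema 2020-12 text are not modelled.

```python
from typing import Optional


def resolve_dynamic_anchor(dynamic_scope: list[dict], anchor: str) -> Optional[str]:
    """Return the URI of the most distant in-scope resource declaring the anchor."""
    for resource in dynamic_scope:  # ordered outermost (first entered) first
        if anchor in resource["dynamic_anchors"]:
            return f'{resource["uri"]}#{anchor}'
    return None  # nothing found: the caller treats $dynamicRef like a plain $ref


# Hypothetical scope: evaluation entered strict-tree first, then followed a
# reference into tree, where the $dynamicRef to "node" is encountered.
scope = [
    {"uri": "https://example.com/strict-tree", "dynamic_anchors": {"node"}},
    {"uri": "https://example.com/tree", "dynamic_anchors": {"node"}},
]

target = resolve_dynamic_anchor(scope, "node")
missing = resolve_dynamic_anchor(scope, "other")
```

Only the choice of base is dynamic here; once it is fixed, the rest is ordinary URI-reference resolution.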

I've never been entirely satisfied with this mechanism. It basically inserts some steps before handing off to normal RFC3986 behavior.

The "scope" language has more to do with how certain JSON Schema keywords communicate up and down the dynamic scope, which is how unevaluatedProperties works.

@handrews
Member

I guess the overarching theme here is: All of the URI-based parts are well-defined in 3.1, and are now (hopefully) actually well-explained in 3.1.1. 3.0 is both easier because it is location-only, and harder because location-only isn't very practical in a lot of environments, which is why some tools already allow you to load a document from an alternate location.

@mikekistler
Contributor Author

There are a number of things I'm struggling with here. You say

The way $id, $anchor, and $ref (any of the $refs) work is 100% governed by RFC 3986 and there is no wiggle room at all in that. URIs are Universal: there aren't any scopes.

and later

The whole dynamic scope thing has to do with how $dynamicAnchor and $dynamicRef work.

But the 2020-12 JSON Schema spec says:

The value of the "$dynamicRef" property MUST be a string which is a URI-Reference.

So if $dynamicRef is a URI-Reference, why doesn't RFC 3986 disallow scopes for this as well?

Also, when you say "there is no wiggle room" ... I'm guessing you are referring to Section 5 of RFC 3986 describing the rules for reference resolution, and maybe more specifically to 5.1.2 Base URI from the Encapsulating Entity:

If no base URI is embedded, the base URI is defined by the representation's retrieval context. For a document that is enclosed within another entity, such as a message or archive, the retrieval context is that entity. Thus, the default base URI of a representation is the base URI of the entity in which the representation is encapsulated.

What confuses me about this language is that it seems to imply that "the enclosing entity" is unique -- that "the entity" is well defined, but isn't it possible, at least for schemas, that a schema might be nested arbitrarily deep in other schemas and as a result there are arbitrarily many possible "enclosing entities". The RFC seems not to consider/allow this.

This is all new territory for me, so I appreciate any help in clearing up my confusion on these things.

@handrews
Member

@mikekistler

So if $dynamicRef is a URI-Reference, why doesn't RFC 3986 disallow scopes for this as well?

I'll direct you to my statement:

I've never been entirely satisfied with [the $dynamicRef] mechanism. It basically inserts some steps before handing off to normal RFC3986 behavior.

By "no wiggle room" for $ref I mean that it's just defined as a URI-reference with no special rules. With $dynamicRef, it's defined as a two-step process: Do this weird thing to find the base URI and then follow RFC3986. In retrospect, I don't think this was a good idea. I should have come up with a different syntax that more clearly separated the non-RFC3986 and RFC3986 behaviors.

So, $dynamicRef was defined as a two-step process that inserted non-RFC3986 behavior (wiggle room) in the first step. None of the $refs are defined in that way.

Also, it's not clear to me what you're trying to solve with "scopes", as the $refs, $id, and $anchor all already work correctly as far as I'm concerned. Is it the nested $id aspect? I'm guessing yes because you ask:

What confuses me about this language is that it seems to imply that "the enclosing entity" is unique -- that "the entity" is well defined, but isn't it possible, at least for schemas, that a schema might be nested arbitrarily deep in other schemas and as a result there are arbitrarily many possible "enclosing entities". The RFC seems not to consider/allow this.

I can see how this looks weird, but it's actually not. The enclosing entity is unique... for each individual URI-reference. Each schema object that contains a $id is the root of a distinct primary resource (in RFC3986 terms, a resource identified by an absolute-URI, without a fragment). Multiple resources can be nested in a document this way. In this sense, $id does behave somewhat like a scope, as it sets the base URI for all URI-references in that resource.

A schema object A with an $id that is a subschema of another object B that also has a $id has B as its enclosing entity. If B also has a parent schema object C with $id, then C is B's enclosing entity. If there are no more parents with $id but this schema is nested in an OpenAPI Document, then the OpenAPI Document is the enclosing entity. And there might be another beyond that, such as if the OpenAPI Document is one part in a multipart/related that sets the base URI using Content-Location.
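
To make the C, B, A chain concrete, here is a minimal Python sketch using `urllib.parse.urljoin`, which implements RFC 3986 reference resolution. The document URL and the `$id` values are invented for illustration, and `resolve_id_chain` is a hypothetical helper, not part of any OAS or JSON Schema tooling.

```python
from urllib.parse import urljoin


def resolve_id_chain(retrieval_uri, ids_outer_to_inner):
    """Resolve nested $id values, each against its enclosing entity's URI."""
    base = retrieval_uri
    resolved = []
    for id_value in ids_outer_to_inner:
        # Each $id is resolved against the base URI supplied by its
        # enclosing entity, and then becomes the base for what it encloses.
        base = urljoin(base, id_value)
        resolved.append(base)
    return resolved


# Hypothetical OpenAPI Document URL, with $id values for C, then B, then A.
chain = resolve_id_chain(
    "https://example.com/openapi/root.yaml",
    ["schemas/c.json", "b.json", "nested/a.json"],
)
```

Each relative `$id` resolves against its parent `$id`, and the outermost one resolves against the URL of the enclosing document.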

Alternatively, you could consider the URI-with-JSON Pointer-fragment in the enclosing schema that points to the enclosed schema as the enclosed schema's retrieval URL, and treat the $id nesting under RFC3986 §5.1.3 instead of §5.1.2. I've never been sure which is more correct. Either way, it also explains why relative $ids resolve against their parent $id or, failing that, the retrieval URL.

This is all new territory for me, so I appreciate any help in clearing up my confusion on these things.

It's exceptionally confusing territory, especially $dynamicRef. In my defense, at least $dynamicRef works as it was intended, unlike its draft 2019-09 predecessor, $recursiveRef. But it still should not have been written to take a URI-reference.

@handrews
Member

handrews commented Sep 22, 2024

@mikekistler you might find the "JSF Part 2: A Processing Model" presentation from my abandoned effort to turn JSON Schema into a truly extensible keyword framework to be helpful. It talks about how to process JSON documents and resources and gives examples of how regular and dynamic references work.

@mikekistler
Contributor Author

The picture is getting clearer, but I'm trying to really make sure I understand.

It seems the crux of this whole matter comes down to RFC 3986 Section 5.1.2:


If no base URI is embedded, the base URI is defined by the representation's retrieval context. For a document that is enclosed within another entity, such as a message or archive, the retrieval context is that entity. Thus, the default base URI of a representation is the base URI of the entity in which the representation is encapsulated.

so I want to make sure I fully understand this. But there are a couple of things I find curious:

  • The term "entity" is not defined in RFC 3986, so I guess it could mean just about anything. It only occurs 11 times in the RFC, most of which are in section 5.1. Examples of an entity in the RFC are a message or archive. It is particularly curious that the term "entity" is used rather than "resource", which is already a very general term:

    the term "resource" is used in a general sense for whatever might be identified by a URI.

    The RFC explicitly cites message and archive as two possible entity types. Does this mean that tools that process URIs must understand and properly process resources encapsulated in messages and archives? What other types of entities must be properly handled?


  • The term "encapsulating" is also not defined in RFC 3986, and appears only 8 times, most of which are in Section 5.1. And "enclosed" occurs exactly once in the RFC, in the paragraph quoted above. It is particularly interesting that the definition of "Fragment" in Section 3.5 uses "primary" and "secondary" resource and not "encapsulating" and "encapsulated" resource.

Maybe I am being too pedantic about these details, but they bother me. In particular, if someone else were to point these things out to me, I don't know how I could explain them.

@ralfhandl
Contributor

ralfhandl commented Sep 23, 2024

  • The term "entity" is not defined in RFC 3986

It was defined by HTTP/1.0 (RFC1945) and still present in HTTP/1.1 (RFC2616). Today it only survives in the term "entity tag (ETag)", which is easier to understand if one knows what an "entity" used to be 😎.

The difference to "message body" is rather slim:

The entity-body is obtained from the message-body by decoding any Transfer-Encoding that might have been applied to ensure safe and proper transfer of the message.

RFC9110 uses the term "content" instead of "entity body".

@mikekistler
Contributor Author

Very curious. If the term "entity" was intentionally used over "resource", then it seems that it was intended to mean the entirety of the body retrieved rather than some subset of it that might be a "resource". Do you read it that way?

@handrews
Member

@mikekistler

the term "resource" is used in a general sense for whatever might be identified by a URI.

The RFC explicitly cites message and archive as two possible entity types.

Do email messages have URIs? Not in general, to my knowledge. Therefore, they are not "resources", but are still something that has to be processed and handled.

Very curious. If the term "entity" was intentionally used over "resource", then it seems that it was intended to mean the entirety of the body retrieved rather than some subset of it that might be a "resource". Do you read it that way?

I think it just means you don't need the encapsulating/enclosing thing to have its own URI, just that it needs to provide context in some sense. For example, a multipart/related HTML page archive, a format intended to be used for an email attachment, uses a per-part Content-Location header to set the base URI (this format could easily be used as a bundling format for OADs, at least if people found multipart to be easy to deal with, which I rather suspect most people don't).

It is particularly interesting that the definition of "Fragment" in Section 3.5 uses "primary" and "secondary" resource and not "encapsulating" and "encapsulated" resource.

Because only primary resources have absolute-URIs and therefore define base URIs. A fragment/secondary resource cannot have a distinct base URI from its primary resource. Fragments, by definition, use the primary resource's URI as the base URI.
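
A small Python illustration of that point, using only the standard library. The URLs are made up; the sketch just shows that the fragment of a base URI plays no part in resolution, so a secondary resource shares its primary resource's base.

```python
from urllib.parse import urldefrag, urljoin

# Hypothetical secondary resource: a primary resource URI plus a fragment.
secondary = "https://example.com/schemas/pet.json#/properties/name"

# Splitting off the fragment recovers the primary resource's URI.
primary, fragment = urldefrag(secondary)

# Resolving against the secondary resource gives the same result as
# resolving against the primary resource: the base's fragment is ignored.
from_secondary = urljoin(secondary, "owner.json")
from_primary = urljoin(primary, "owner.json")
```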

@mikekistler
Contributor Author

I think it just means you don't need the encapsulating/enclosing thing to have its own URI

Hmmm. Well the RFC seems to suggest that it does have a base URI, since it says

Thus, the default base URI of a representation is the base URI of the entity in which the representation is encapsulated.

Does it make sense for something to have a base URI without having a URI? How does that happen?

@handrews
Member

Does it make sense for something to have a base URI without having a URI? How does that happen?

This isn't about the encapsulating entity having a base URI, it's about one providing a base URI. I forgot to link the RFC 2557 for the multipart/related example, where the entity need not have a URI but provides a URI (and base URI) to each part through the Content-Location header.


I realized a few more things:

There is language in 3.1.1 using "embedding" / "embedded" that should probably use "encapsulating" / "encapsulated"

I mentioned being unsure about whether to use RFC3986 §5.1.2 or 5.1.3 in the nested encapsulated JSON Schema case - I think it ends up being the same. They are both about determining the base URI from the retrieval context - 5.1.2 works when there is no direct retrieval URI for the encapsulated resource, but one is provided by the encapsulating entity (e.g. the Content-Location header per-multipart/related part in an email attachment), while 5.1.3 works if there is a retrieval URI.

Since an encapsulated JSON Schema resource (whether encapsulated in JSON Schema or OAS) can be addressed by the encapsulating resource's URI plus a JSON Pointer fragment, you can treat that directly as a retrieval URI under 5.1.3. Or you can look at the encapsulating resource more abstractly and just use its base URI under 5.1.2 - it's a philosophical difference without practical impact because either way, you are using the encapsulating resource's URI/base URI.

But if your encapsulating entity is a multipart/related email attachment or freestanding resource, you don't use its URI (if it has one), you just use the per-part Content-Location header. So in that case, 5.1.2 vs 5.1.3 has a practical difference (and only 5.1.2 works in the email attachment context since there isn't a retrieval URI for 5.1.3 in that case).
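
As a hedged illustration of the multipart/related case, the Python sketch below parses a tiny hand-written message with the standard `email` package and reads the per-part Content-Location header as each part's base URI. The boundary, URLs, and part bodies are invented, and this is not a proposal for an actual OAD bundling format.

```python
from email import message_from_string
from urllib.parse import urljoin

# Hypothetical two-part bundle; the message itself has no URI of its own.
RAW = """\
MIME-Version: 1.0
Content-Type: multipart/related; boundary="oad"

--oad
Content-Type: application/yaml
Content-Location: https://example.com/apis/openapi.yaml

openapi: 3.0.4

--oad
Content-Type: application/yaml
Content-Location: https://example.com/apis/schemas/pet.yaml

type: object

--oad--
"""

message = message_from_string(RAW)

# Each leaf part gets its base URI from its own Content-Location header
# (RFC 3986 section 5.1.2), not from the enclosing message.
locations = [
    part["Content-Location"]
    for part in message.walk()
    if not part.is_multipart()
]

# A relative reference in the first part resolves against that part's base.
target = urljoin(locations[0], "schemas/pet.yaml#/properties/name")
```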

Member

@handrews handrews left a comment

This is getting very, very close! Thanks for all of the continued hard work on this @mikekistler and everyone else involved!

* As a syntactically complete OpenAPI Description document
* As the Object type implied by its parent Object within the document
* The root object of the entry document is interpreted as an OpenAPI Object
* As the Object type implied by its parent Object within the OpenAPI Description
Member

This is still "OpenAPI Document" for the same reasons I stated previously- it is literally about the parent-child JSON/YAML relationship within the Document, and never about any other possible relationship within the Description. That is the entire purpose of this statement, and without that, the distinction between this bullet point and the next doesn't make any sense.

Comment on lines -386 to +383
- | <a name="server-url"></a>url | `string` | **REQUIRED**. A URL to the target host. This URL supports Server Variables and MAY be relative, to indicate that the host location is relative to the location where the OpenAPI document is being served. Variable substitutions will be made when a variable is named in `{`braces`}`. |
+ | <a name="server-url"></a>url | `string` | **REQUIRED**. A URL to the target host. This URL supports Server Variables and MAY be relative, to indicate that the host location is relative to the location where the entry document of the OpenAPI Description is being served. Variable substitutions will be made when a variable is named in `{`braces`}`. |
Member

This ought to follow RFC3986 which would go:

  • RFC3986 §5.1.2 enclosing context (OAS 3.0 has always violated RFC3986 by skipping this)
  • RFC3986 §5.1.3 retrieval URL of the resource, which in this case would be the current document
  • RFC3986 §5.1.4 application-specific default, which could be anything we want
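
A minimal Python sketch of that precedence order applied to a relative Server Object `url`. The function name, parameters, and URLs are hypothetical, and nothing here is mandated by the OAS text under review; it only mirrors the three sources listed above.

```python
from urllib.parse import urljoin


def pick_base_uri(encapsulating=None, retrieval=None, default=None):
    """Return the first available base URI, in RFC 3986 section 5.1 order."""
    for candidate in (encapsulating, retrieval, default):
        if candidate is not None:
            return candidate
    raise ValueError("no base URI available")


# No encapsulating entity here, so the retrieval URL of the document wins
# over the application-specific default.
base = pick_base_uri(
    retrieval="https://example.com/apis/v1/openapi.yaml",
    default="https://localhost/",
)

# A relative server url is then resolved against the chosen base.
server_url = urljoin(base, "../v2")
```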

Do we state anywhere how to locate Server Objects from referenced (non-entry) documents that are OpenAPI Documents and therefore might have an OpenAPI Object (and servers field) in the document root that is not the OAD root (which is always the entry document)?

@@ -2302,7 +2303,7 @@ For computing links and providing instructions to execute them, a [runtime expre

| Field Name | Type | Description |
| ---- | :----: | ---- |
- | <a name="link-operation-ref"></a>operationRef | `string` | A relative or absolute URI reference to an OAS operation. This field is mutually exclusive of the `operationId` field, and MUST point to an [Operation Object](#operation-object). Relative `operationRef` values MAY be used to locate an existing [Operation Object](#operation-object) in the OpenAPI description. |
+ | <a name="link-operation-ref"></a>operationRef | `string` | A relative or absolute URI reference to an OAS operation. This field is mutually exclusive of the `operationId` field, and MUST point to an [Operation Object](#operation-object). Relative `operationRef` values MAY be used to locate an existing [Operation Object](#operation-object) in the OpenAPI Description. |
Member

Does this mean that operationRef can only point to a resource/document that is already part of the OAD due to the action of another keyword? This is relevant to the text early in the spec that states what keywords are allowed to connect parts into an OAD. (but see next comment in the Reference Object)

Contributor Author

I don't read it that way. Though I do find the terminology "A relative or absolute URI reference" a bit strange. I think that URI references are either relative refs or URIs, and URIs may be absolute or not, but I don't think "an absolute URI reference" is a thing. This is the only place this phrase occurs in the spec. I think "URI reference" is clearer and more accurate, so I'll make that change. Also open to additional changes if you think they are needed.

Member

"URI reference" in RFC3986 is the most generic term. Anything vaguely URI-ish (relative, absolute, full) is technically a URI reference (often written "URI-reference").

But you are right that this needs to be changed for a different reason: https://example.com/foo#/components/pathItems/bar/get is a URI-reference that is neither relative (because it has a scheme) nor absolute (because it has a fragment).
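
The three cases can be told apart mechanically. The Python sketch below is a rough classifier based on `urllib.parse.urlsplit`; the `classify` helper and its labels are made up for illustration and only approximate the RFC 3986 grammar (`relative-ref`, `URI`, `absolute-URI`).

```python
from urllib.parse import urlsplit


def classify(reference):
    """Roughly classify a URI-reference using RFC 3986 terminology."""
    if not urlsplit(reference).scheme:
        # No scheme: a relative reference.
        return "relative-ref"
    if "#" in reference:
        # Has a scheme and a fragment: a URI, but not an absolute-URI.
        return "URI"
    # Has a scheme and no fragment.
    return "absolute-URI"


kinds = {
    ref: classify(ref)
    for ref in (
        "#/paths/~1users/get",
        "https://example.com/foo",
        "https://example.com/foo#/components/pathItems/bar/get",
    )
}
```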

@@ -2592,7 +2593,7 @@ description: Pets operations

#### Reference Object

- A simple object to allow referencing other components in the OpenAPI document, internally and externally.
+ A simple object to allow referencing other components in the OpenAPI Description, internally and externally.
Member

This suggests that "in the OAD" means "in the OAD, including documents that are only part of the OAD because of this Reference Object", which would mean that operationRef above, with similar wording, can also pull in otherwise-unconnected documents. Is this how this wording is intended?

Contributor Author

I'm not sure what you mean by "otherwise-unconnected documents" -- I think $ref is a thing that connects documents. But perhaps to directly answer your question -- I think it is intended for $ref to be able to reference any document, whether or not it was somehow "connected" through some other means.

Member

Yes, $ref has to be able to do that. I'm pointing out that I find that the language suggests to me that the document has to somehow already be part of the OAD, but the only way we define to connect things is $ref/operationRef/mapping. For $ref it's (to us) self-evident that there's no other way, but for operationRef and mapping this wording could be read to say that you can only point operationRef and mapping to a document that has already been $ref'd. So this comment is really more about operationRef and mapping.

@@ -123,7 +121,7 @@ In order to preserve the ability to round-trip between YAML and JSON formats, YA

### OpenAPI Description Structure

- An OpenAPI Description (OAD) MAY be made up of a single JSON or YAML document or be divided into multiple, connected parts at the discretion of the author. In the latter case, [Reference Object](#reference-object) and [Path Item Object](#path-item-object) `$ref` keywords, as well as the [Link Object](#link-object) `operationRef` keyword, are used to identify the referenced elements.
+ An OpenAPI Description (OAD) MAY be made up of a single JSON or YAML document or be divided into multiple, connected parts at the discretion of the author. In the latter case, [Reference Object](#reference-object) and [Path Item Object](#path-item-object) `$ref` keywords, as well as the [Link Object](#link-object) `operationRef` keyword, and the URI form of the [Discriminator Object](#discriminator-object) mapping keyword, are used to identify the referenced elements.
Member

Thanks for these changes! You need the `backticks` around mapping but otherwise good.

@@ -2402,7 +2400,7 @@ links:
username: $response.body#/username
```

- or an absolute `operationRef`:
+ or a URI `operationRef`:
Member

This is correct, although the whole thing feels awkward. But not enough to change it- this is good.

Labels: clarification (requests to clarify, but not change, part of the spec), editorial (Wording and stylistic issues)