Formalize how to express null value in xml #3959

Open
ahmednfwela opened this issue Jul 16, 2024 · 1 comment
Labels
media and encoding Issues regarding media type support and how to encode data (outside of query/path params) xml
@ahmednfwela

Since XML has no concept of null, how should we represent and validate a null value, both as an attribute and as an element?

Consider the following 3.0 schema:

person:
  type: object
  required:
    - name
    - attrName
  properties:
    name:
      type: string
      nullable: true
    attrName:
      type: string
      nullable: true
      xml:
        attribute: true

Note that required here prevents us from simply omitting the attribute/element.

I have thought about this, and here are some of the approaches I came up with:

For elements

Approach 1: self closing tags

<person>
  <name />
</person>

Pros: Makes sense to whoever reads it.
Cons: None that I can think of, except that some XML parsers may treat a self-closing tag as equivalent to an empty string and not distinguish between the two, in which case the form does not survive round-tripping.
For example, <name /> may come back as <name></name>.

Approach 2: empty string

<person>
  <name></name>
</person>

Cons: If the property is of type string, a non-null empty string cannot be distinguished from null.
Workaround: Require string values to be wrapped in double quotes, e.g.

  • this is null:
<name></name>
  • this is an empty string:
<name>""</name>
  • this is a valid string:
<name>"hello"</name>
  • this is an invalid string:
<name>hello</name>

Of course, this workaround is very problematic, since most parsers consider "" a valid two-character string.

Approach 3: special marker attribute

<name xsi:nil="true"></name>
<name xsi:nil="true"/>

Pros: Represents nulls consistently without having to inspect the contents of the element. This is also how XML Schema does it.
Cons: Size overhead of repeating xsi:nil="true" everywhere a null is used.
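
To illustrate how a consumer might tell the three cases apart under this approach, here is a minimal Python sketch using the standard library's xml.etree.ElementTree. The helper name read_nullable_string and the <people> wrapper element are hypothetical and exist only for this example. The namespace URI is the standard XML Schema instance namespace, and "true" and "1" are the two lexical forms of a true xs:boolean.

```python
import xml.etree.ElementTree as ET

# Standard XML Schema instance namespace; ElementTree exposes namespaced
# attribute names in "{uri}local" form.
XSI_NS = "http://www.w3.org/2001/XMLSchema-instance"
XSI_NIL = f"{{{XSI_NS}}}nil"


def read_nullable_string(element):
    """Hypothetical helper: None if the element is marked xsi:nil, else its text."""
    if element.get(XSI_NIL) in ("true", "1"):
        return None
    # ElementTree reports an empty element's text as None, so map it to "".
    return element.text or ""


# The <people> wrapper is an assumption made so all three cases fit in one document.
doc = ET.fromstring(
    f'<people xmlns:xsi="{XSI_NS}">'
    '<person><name xsi:nil="true"/></person>'
    '<person><name></name></person>'
    '<person><name>hello</name></person>'
    '</people>'
)

values = [read_nullable_string(p.find("name")) for p in doc.findall("person")]
print(values)
```

Only the marker attribute is consulted to decide nullness, so the empty element in the second <person> stays an empty string rather than being mistaken for null.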

For attributes

Approach 1: empty string

<person attrName="" />

Approach 2: disallow nullable attributes altogether

Make xml.attribute: true and nullable: true mutually exclusive.

@ralfhandl ralfhandl added the xml label Jul 17, 2024
@ralfhandl
Contributor

@ahmednfwela Thanks for reporting this, and for the detailed research and references.

Preliminary analysis for elements

XML 1.0, section 3.1 "Start-Tags, End-Tags, and Empty-Element Tags" states that the element forms <name></name> and <name/> are equivalent and represent an element with no content, aka an empty element, so approaches 1 and 2 are equivalent.

The meaning of "empty" seems to depend on context/implementation; for string-valued elements "empty" means the empty string.
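
The equivalence of the two forms can be observed with one concrete parser. The following is a small sketch using Python's xml.etree.ElementTree. It shows the behaviour of this one library only and is not a statement about every XML implementation.

```python
import xml.etree.ElementTree as ET

# Approach 1 (empty-element tag) and approach 2 (start-tag plus end-tag).
self_closing = ET.fromstring("<person><name /></person>")
start_end = ET.fromstring("<person><name></name></person>")

# After parsing, both elements carry the same (absent) text content.
texts = (self_closing.find("name").text, start_end.find("name").text)

# Serialising gives the same output for both, so the original spelling
# of the empty element is not preserved across a round trip.
serialized = (
    ET.tostring(self_closing, encoding="unicode"),
    ET.tostring(start_end, encoding="unicode"),
)

print(texts[0] == texts[1], serialized[0] == serialized[1])
```

Because the parsed trees are indistinguishable, an application built on this parser has no way to treat one form as null and the other as an empty string.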

So approach 3 (xsi:nil="true") seems to be the way forward.

Preliminary analysis for attributes

XML 1.0, section 3.3.3 "Attribute-Value Normalization" describes an algorithm that MUST be applied before the value of an attribute is passed to the application or checked for validity. This algorithm begins with a normalized value consisting of the empty string, then appends to it. Thus attribute values are always strings, potentially the empty string, and never null.

So approach 2 (disallow nullable attributes) seems to be the way forward.
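
As a rough illustration of why an attribute cannot carry null, this sketch (again Python's xml.etree.ElementTree, chosen only for the example) compares an empty attribute with a missing one. The None returned for the missing attribute is the library's way of signalling absence, which the required keyword in the schema above already rules out.

```python
import xml.etree.ElementTree as ET

with_empty = ET.fromstring('<person attrName="" />')
without_attr = ET.fromstring("<person />")

# A present attribute always yields a string, here the empty string.
empty_value = with_empty.get("attrName")

# An absent attribute yields None, but that signals "not present", not "null".
missing_value = without_attr.get("attrName")

print(repr(empty_value), repr(missing_value))
```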

@ralfhandl ralfhandl added this to the v3.2.0 milestone Jul 25, 2024
@handrews handrews added the media and encoding Issues regarding media type support and how to encode data (outside of query/path params) label Jul 29, 2024