Unable to generate models for meta jsonschema #447

Closed
Skeen opened this issue Jun 11, 2021 · 10 comments · Fixed by #979
Comments

@Skeen

Skeen commented Jun 11, 2021

Describe the bug
I expected to be able to generate Pydantic models for validating JSON Schemas themselves.

To Reproduce
Fetch the json-schema meta-schema draft 7:

wget http://json-schema.org/draft-07/schema

Run datamodel-codegen:

datamodel-codegen --input schema --input-file-type jsonschema

Expected behavior
I expected data models capable of parsing JSON Schemas to be generated.

Version:

  • OS: Ubuntu 20.04.2 LTS
  • Python version: Python 3.8.5
  • datamodel-code-generator version: 0.11.7
@koxudaxi
Owner

@Skeen
Thank you for creating this issue.
I have confirmed the error.
However, I don't yet understand why the error occurs.
I will look into it.

@koxudaxi
Owner

@Skeen

I ran datamodel-code-generator against the schema.
Output:

pydantic.error_wrappers.ValidationError: 6 validation errors for JsonSchemaObject
properties -> default
  value is not a valid dict (type=type_error.dict)
properties -> examples -> items
  value is not a valid list (type=type_error.list)
properties -> examples -> items
  value is not a valid dict (type=type_error.dict)
properties -> const
  value is not a valid dict (type=type_error.dict)
properties -> enum -> items
  value is not a valid list (type=type_error.list)
properties -> enum -> items
  value is not a valid dict (type=type_error.dict)

I checked the details of the schema.

        "examples": {
            "type": "array",
            "items": true
        },

I don't understand why items is set to true. I expected a list of objects.

What do you think about it?

@Skeen
Author

Skeen commented Aug 21, 2021

        "examples": {
            "type": "array",
            "items": true
        },

I don't understand why items is set to true. I expected a list of objects.

What do you think about it?

This is unexpected to me too; I'd have imagined either a type definition or a reference within an object, not just true.

@xkortex

xkortex commented Jun 25, 2022

I was toying around with this issue, and it seems the main problem is that several fields (e.g. additionalItems, additionalProperties, items, maybe some others) are set to true in the jsonschema draft, but JsonSchemaObject isn't really able to take this into account. I think just allowing these fields to be Union[whatever, bool] might get past this snag.

Would be nice if it could self-bootstrap the jsonschema spec itself :)

Was able to make a tiny bit of progress by amending items to allow for bool:

    items: Union[List['JsonSchemaObject'], 'JsonSchemaObject', bool, None]

This required changing parse_list_item, and I am not quite sure what it should return in that case, so I just returned [] when target_items is a bool.

That just leaves the fields in draft-07.json "default": true and "const": true. That'll take a bit more digging.
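To see where such boolean values sit, the following is a minimal sketch that walks a schema and reports every position where a boolean appears in place of a subschema. It uses only the standard library; the function name `boolean_subschemas` and the keyword tuples are hypothetical helpers for illustration, not part of datamodel-code-generator, and the excerpt is hand-written from the fields named in the error output above rather than the full meta-schema.

```python
from typing import Any, Iterator, Tuple

# Draft-07 keywords whose value is a single subschema (illustrative, not exhaustive).
SINGLE = ("items", "additionalItems", "additionalProperties", "contains",
          "propertyNames", "not", "if", "then", "else")
# Keywords whose value maps names to subschemas.
MAPPING = ("properties", "patternProperties", "definitions")
# Keywords whose value may be a list of subschemas.
LISTS = ("allOf", "anyOf", "oneOf", "items")

Path = Tuple[str, ...]


def boolean_subschemas(schema: Any, path: Path = ()) -> Iterator[Tuple[Path, bool]]:
    """Yield (path, value) for each boolean found where a subschema is expected."""
    if isinstance(schema, bool):
        yield path, schema
        return
    if not isinstance(schema, dict):
        return
    for key in SINGLE:
        if key in schema and not isinstance(schema[key], list):
            yield from boolean_subschemas(schema[key], path + (key,))
    for key in MAPPING:
        for name, sub in schema.get(key, {}).items():
            yield from boolean_subschemas(sub, path + (key, name))
    for key in LISTS:
        if isinstance(schema.get(key), list):
            for index, sub in enumerate(schema[key]):
                yield from boolean_subschemas(sub, path + (key, str(index)))


# Hand-written excerpt based on the fields in the reported validation errors.
excerpt = {
    "properties": {
        "default": True,
        "const": True,
        "examples": {"type": "array", "items": True},
        "enum": {"type": "array", "items": True},
    }
}

found = [" -> ".join(path) for path, _ in boolean_subschemas(excerpt)]
for line in found:
    print(line)
```

On this excerpt the reported paths match the locations in the validation errors, which suggests that "default": true and "const": true are also boolean subschemas (entries under properties) rather than a separate kind of problem.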

...

OK, I experimented with it for a while and found the locations below, which seem amenable to a fix. Obviously I'd want to clean these up, but I think I have at least found the pain points. I also need to dig into the spec to determine what the intent of these "foo": true fields is. But this looks tractable!

https://github.com/xkortex/datamodel-code-generator/tree/xkortex/447/handle_true_attributes

I also found a bug in the way NonNegativeIntegerDefault0 is generated:

class NonNegativeInteger(BaseModel):
    __root__: conint(ge=0)


class NonNegativeIntegerDefault0(BaseModel):
    pass

I believe it should look more like this:

class NonNegativeIntegerDefault0(NonNegativeInteger):
    __root__: conint(ge=0) = 0

@xkortex

xkortex commented Jun 25, 2022

Possibly relevant: OAI/OpenAPI-Specification#668 (comment)

So I think we might be able to substitute in the empty object in places where we encounter these true fields.

I was also thinking about the whole self-hosting idea a bit more. In theory, since the json-schema describes itself, we ought to be able to compile the json-schema draft (e.g. draft-07.json) to Python, then use that to parse that very same draft. The output of that parse should compile back to Python. This could be a very useful way to test the correctness of the parser and generator.

In fact, that "self-compiling" python code could act as the main parser in the library. It's much, much shorter, so I wonder why there is so much manual logic in the current JsonSchemaObject. Maybe that's the bootstrapping overhead? :)

@rtbs-dev

rtbs-dev commented Jul 20, 2022

@xkortex I was checking out that thread and it looks like, yep, the jsonschema spec treats items: true and items: {} identically. That's super interesting, and would solve my use case if taken into account.

I also think your idea about the self-describing schema is excellent; maybe worth a separate issue?


Update: this seems to clear things up, per the JSON Schema core draft (draft-handrews-json-schema-02):

[4.3.2](https://datatracker.ietf.org/doc/html/draft-handrews-json-schema-02#section-4.3.2).  Boolean JSON Schemas

   The boolean schema values "true" and "false" are trivial schemas that
   always produce themselves as assertions results, regardless of the
   instance value.  They never produce annotation results.

   These boolean schemas exist to clarify schema author intent and
   facilitate schema processing optimizations.  They behave identically
   to the following schema objects (where "not" is part of the subschema
   application vocabulary defined in this document).

   true:  Always passes validation, as if the empty schema {}

   false:  Always fails validation, as if the schema { "not": {} }

   While the empty schema object is unambiguous, there are many possible
   equivalents to the "false" schema.  Using the boolean values ensures
   that the intent is clear to both human readers and implementations.

So... a "true" just means there's a subschema that can be empty, while a "false" is a subschema that cannot be empty?
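Read against the quoted section, the distinction is about which instances pass rather than whether the subschema may be empty: true accepts every instance and false rejects every instance. A minimal sketch of that equivalence follows; `as_object_schema` and `accepts` are hypothetical names, and `accepts` is a toy validator that understands only boolean schemas, the empty schema, and "not", which is just enough to check the two equivalences the section states.

```python
from typing import Any, Union

Schema = Union[bool, dict]


def as_object_schema(schema: Schema) -> Schema:
    """Rewrite a boolean schema as the object schema the spec calls equivalent."""
    if schema is True:
        return {}
    if schema is False:
        return {"not": {}}
    return schema


def accepts(schema: Schema, instance: Any) -> bool:
    """Toy validator: handles only boolean schemas, {} and {"not": ...}."""
    if isinstance(schema, bool):
        return schema  # true always passes, false always fails
    unsupported = set(schema) - {"not"}
    if unsupported:
        raise NotImplementedError(f"keywords not handled in this sketch: {unsupported}")
    if "not" in schema:
        return not accepts(schema["not"], instance)
    return True  # the empty schema places no constraints


samples = [None, 0, "text", [], {}, [1, 2, 3]]
for instance in samples:
    # true behaves like {} and false behaves like {"not": {}} for every instance
    assert accepts(True, instance) == accepts(as_object_schema(True), instance)
    assert accepts(False, instance) == accepts(as_object_schema(False), instance)
```

Under this reading, "items": true in the meta-schema means array elements of any kind are allowed, which is consistent with substituting the empty object wherever true is encountered.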

@rtbs-dev

rtbs-dev commented Aug 3, 2022

I also believe this would solve #696.

@sebastroy

@Skeen

I ran datamodel-code-generator for the schema output:

pydantic.error_wrappers.ValidationError: 6 validation errors for JsonSchemaObject
properties -> default
  value is not a valid dict (type=type_error.dict)
properties -> examples -> items
  value is not a valid list (type=type_error.list)
properties -> examples -> items
  value is not a valid dict (type=type_error.dict)
properties -> const
  value is not a valid dict (type=type_error.dict)
properties -> enum -> items
  value is not a valid list (type=type_error.list)
properties -> enum -> items
  value is not a valid dict (type=type_error.dict)

I checked the details of the schema.

        "examples": {
            "type": "array",
            "items": true
        },

I don't understand why items is set to true. I expected a list of objects.

What do you think about it?

My understanding of the draft-07 specification is that an array with items set to true is a tuple rather than a list, one where additional items are accepted (with false they would be rejected); see https://json-schema.org/understanding-json-schema/reference/array.html#items under the tuple section.

@koxudaxi
Owner

koxudaxi commented Jan 4, 2023

I'm sorry for the very late reply.
I have released the fix in 0.15.0.
Thank you very much!!

@sebastroy

Thank you for all the good work!
