Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Recursive #29

Merged
merged 7 commits into from
Jul 9, 2024
Merged

Recursive #29

merged 7 commits into from
Jul 9, 2024

Conversation

brettz9
Copy link
Contributor

@brettz9 brettz9 commented Jun 21, 2024

Recursive schemas via JSON References

feat: support recursive schemas

Closes #13.

Note that there is one (hopefully minor) API risk in that in order to modify schemas in place, I insert a z.lazy() as a placeholder and later modify its _def.getter internal function to the desired getter. (This was necessary because I had to wait to iterate the whole structure so that each JSON path could be mapped and the JSON reference successfully resolved.)

@brettz9 brettz9 force-pushed the recursive branch 17 times, most recently from 29bccee to 4a4daad Compare June 21, 2024 21:20
@brettz9 brettz9 marked this pull request as ready for review June 21, 2024 21:22
@brettz9 brettz9 force-pushed the recursive branch 2 times, most recently from 54264b4 to d3becc9 Compare June 23, 2024 07:52
@brettz9 brettz9 mentioned this pull request Jun 23, 2024
@brettz9 brettz9 force-pushed the recursive branch 8 times, most recently from 6bb6316 to fbfa1f9 Compare June 27, 2024 00:58
@brettz9
Copy link
Contributor Author

brettz9 commented Jul 2, 2024

Ok, I've taken a lesson from your code (the zerialize() part) and the JSON pointer substitute, and the changes now lead to this PR passing all prior tests as well as a new test based on your failing case. Care to review, @jadedevin13 ?

@brettz9 I tested it and it's not showing any bugs. Thanks.

Great, thanks. But would you mind checking once more? I wrap references now into a union if there are extra properties like isOptional present (since JSON references shouldn't have any other properties besides $ref on them).

Just maybe update the docs?

I did add a section to the README on use of JSON References and removed the old to-do. I didn't add a full "how to use" section though as the README doesn't really contain such a section on all of its properties.

@brettz9
Copy link
Contributor Author

brettz9 commented Jul 3, 2024

Just a heads up that after this Friday, I may be out of Internet contact for a few days (moving and in a foreign country).

@jadedevin13
Copy link

Hmm...

I'm not sure if in my implementation this is the case or this was because of your recent changes but....

If you add the following in the last test scenario

  expect(mainSchema.shape.profileContact._def.typeName).toEqual(ZodFirstPartyTypeKind.ZodOptional)
  expect(mainSchema.shape.profileContact._def.innerType._def.typeName).toEqual(ZodFirstPartyTypeKind.ZodObject)
  expect(dezer._def.shape().profileContact._def.getter()._def.typeName).toEqual(ZodFirstPartyTypeKind.ZodObject)

Why is the profileContact no longer optional? and why did it become lazy? I can see that there's isOptional in the zerialize but it's lost on the dezerialize even with the json pointer.

          "profileContact": {
            "type": "union",
            "options": [
              {
                "$ref": "#/properties/id"
              }
            ],
            "isOptional": true,
            "description": "{\"json\":{\"type\":\"string\"}}"
          },

And shouldn't it be directly just be a pointer like

          "profileContact": {
            "type": {
                  "$ref": "#/properties/id"
             },
            "isOptional": true,
            "description": "{\"json\":{\"type\":\"string\"}}"
          },

@brettz9 brettz9 force-pushed the recursive branch 3 times, most recently from 02e04e3 to d13582a Compare July 4, 2024 03:06
@brettz9
Copy link
Contributor Author

brettz9 commented Jul 4, 2024

expect(mainSchema.shape.profileContact._def.typeName).toEqual(ZodFirstPartyTypeKind.ZodOptional)

expect(mainSchema.shape.profileContact._def.innerType._def.typeName).toEqual(ZodFirstPartyTypeKind.ZodObject)
expect(dezer._def.shape().profileContact._def.getter()._def.typeName).toEqual(ZodFirstPartyTypeKind.ZodObject)


Why is the profileContact no longer optional? 

I had a bug in my reserialization test, so it wasn't showing this problem, but the test is working now. If you run the tests on this branch, you'll see the reserialized structure. It is mostly ok (with some pointers referencing a different but still equally valid property because of how the serialization may go in a different order). However, at the end of the diffs, there is this:

image

profileContact is pointing to test (which could be ok if test were ok since they should both be the same as id) but test only points to id in its properties, not to the whole id object. If we could fix this problem in my code, I think we'll be ok.

and why did it become lazy?

I used lazy as a placeholder. While I copied your zerialize code, I had recursion problems in our tests when using your dezerialize code (and also had problems when using my version of dezerialize which copied your version but was closer to passing our tests because it kept certain features yours was missing: see my recursive-dezer-alternative branch). My normal recursive branch doesn't have this problem (even though it does have the problem you identified and that is shown by the diff above).

I can see that there's isOptional in the zerialize but it's lost on the dezerialize even with the json pointer.

Yes, see above.

          "profileContact": {
            "type": "union",
            "options": [
              {
                "$ref": "#/properties/id"
              }
            ],
            "isOptional": true,
            "description": "{\"json\":{\"type\":\"string\"}}"
          },

And shouldn't it be directly just be a pointer like

          "profileContact": {
            "type": {
                  "$ref": "#/properties/id"
             },
            "isOptional": true,
            "description": "{\"json\":{\"type\":\"string\"}}"
          },

No, because that would put the contents of id inside of type rather than replacing it. JSON references entirely replace the object they are on when dereferenced (and only that object).

We could avoid the need for a single-item union by ignoring the JSON Reference spec, but that would cause problems if people used an external library to work with something like:

{
  $ref: 'https://example.com/#myProperty',
  isOptional: true
}

...because the tool would replace the whole object.

We could use our own property like ref without the $, but then Zodex JSON files wouldn't work with things like VSC ability to make links out of JSON references.

Theoretically, we should be able to get the dezerialize code to properly reference the whole id object above, and then things will be ok (as that is indeed showing, as expected, as optional). I will see if I can fix this on my own, but it would help if you could also take a look at this recursive branch to see why it is failing here.

If you really think your dezerialize should work better, please at least use my version of your code that is closer to passing the tests, recursive-dezer-alternative, and see why it is failing with the tests.

Really appreciate the assistance, thanks!

@brettz9
Copy link
Contributor Author

brettz9 commented Jul 4, 2024

And if you want to test my recursive branch, you might want to instead try my meta-schema-testing branch because that has other tests which give full coverage.

@brettz9
Copy link
Contributor Author

brettz9 commented Jul 4, 2024

@jadedevin13 : Nevermind, I think I've got a fix. Plan to report back as I am able.

@brettz9 brettz9 force-pushed the recursive branch 2 times, most recently from cce0959 to 05bc001 Compare July 4, 2024 09:27
@brettz9
Copy link
Contributor Author

brettz9 commented Jul 4, 2024

Ok, this should be ready for review again, @jadedevin13 . I'm not sure why it shows the tests as failing as they are not failing locally. Fixed

I disabled the reserialization check because it will add some unnecessary single-item unions--not sure that that can be avoided, and it still performs the same validation.

@jadedevin13
Copy link

jadedevin13 commented Jul 4, 2024

Sorry my bad, I was thinking more of something like this.

  "profileContact": {
    "type": "object",
    "properties": {
      "$ref": "#/properties/id/properties"
    },
    "isOptional": true,
    "description": "{\"json\":{\"type\":\"string\"}}"
  }

i think we should've added a type of lazy in zerializers to aid this and those that are lazy are the only ones who'll have refs

  ZodLazy: (def, opts) => {
    const getter = def.getter();
    return s(getter, opts, getter.isOptional() || getter.isNullable());
  },

@brettz9
Copy link
Contributor Author

brettz9 commented Jul 5, 2024

Sorry my bad, I was thinking more of something like this.

  "profileContact": {
    "type": "object",
    "properties": {
      "$ref": "#/properties/id/properties"
    },
    "isOptional": true,
    "description": "{\"json\":{\"type\":\"string\"}}"
  }

That would be convenient to allow, but when dezerializing, I don't know what to place for the $ref. I don't think one can create some Zod type to go there.

i think we should've added a type of lazy in zerializers to aid this and those that are lazy are the only ones who'll have refs

  ZodLazy: (def, opts) => {
    const getter = def.getter();
    return s(getter, opts, getter.isOptional() || getter.isNullable());
  },

Not sure what you mean here, but in any case, I think we have to add our own lazy placeholders in dezerialize in order to successfully replace all references, and there is no way in the current Zod API that I can tell which allows swapping a placeholder like ZodLazy with something else after the fact.

@brettz9 brettz9 force-pushed the recursive branch 2 times, most recently from bd96acf to e44519b Compare July 5, 2024 00:20
@brettz9
Copy link
Contributor Author

brettz9 commented Jul 6, 2024

@Gregoor : I think this one may be ready for review, given that your concern about the jsonref dependency was addressed (by its removal). If you have time/energy for the browser PR (#28) that would also be great.

Copy link
Collaborator

@Gregoor Gregoor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks you two for duking it out! I trust you two on this one, I just gave it a surface-y skim and it looks mostly good and mergeable to me.

] as const)("zerialize %#", (schema, shape) => {
expect(zerialize(schema)).toEqual(shape);
const zer = zerialize(schema);
// console.log(JSON.stringify(zer, null, 2));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

leftover comment

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, fixed.

README.md Outdated
Comment on lines 97 to 100
to use a library like [`json-refs`](https://github.com/whitlockjc/json-refs)
(with `resolveRefs`) to first resolve such references and then supply the object
to `dezerialize`.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is that still the case?

Copy link
Contributor Author

@brettz9 brettz9 Jul 8, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it's still the case. Btw, I added another JSON refs caveat on the README.

Copy link
Contributor Author

@brettz9 brettz9 Jul 8, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To clarify further...one must use an external JSON refs library to resolve things like:

{"$ref": "https://example.com/some/Zodex/JSON/file.json"}

or

{"$ref": "/a/Zodex/JSON/file/at/root.json"}

but not for internal references like:

{"$ref": "#/somewhere/in/the/document"}

src/zerialize.ts Outdated
Comment on lines 476 to 499
ZodIntersection: (def, opts) => {
opts.currentPath.push("left");
const left = s(def.left, opts);
opts.currentPath.pop();

opts.currentPath.push("right");
const right = s(def.right, opts);
opts.currentPath.pop();

return {
type: "intersection",
left,
right,
};
},

ZodFunction: (def, opts) => {
opts.currentPath.push("args");
const args = s(def.args, opts);
opts.currentPath.pop();

opts.currentPath.push("returns");
const returns = s(def.returns, opts);
opts.currentPath.pop();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd prefer all of theses instances of parameter mutation to be replaced by constructing new.

e.g.:

ZodIntersection: (def, opts) => ({
    type: "intersection",
    left: s(def.left, { ...opts, currentPath: [...opts.currentPath, "left"] }),
    right: s(def.right, {
      ...opts,
      currentPath: [...opts.currentPath, "right"],
    }),
  }),

imo it makes for more readable, less error prone code. I think the mutative style can make sense for performance, but I don't think that's a chief concern here?!

Thinking about it more, I'm kind of surprised this is even working at all, given that (the way I'm reading this code) is that within a zerialize run all fns share the same reference to a currentPath array that's mutated all over the place

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've dropped the mutative style. I think the former style worked because it was meticulous in reverting the changes and each step would eventually return, but in any case this works also.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also added a commit to revert back to the terser style of using arrow functions without return statements. I personally find that style cumbersome in debugging and refactoring, but I added the commit if you really do prefer it.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've dropped the mutative style. I think the former style worked because it was meticulous in reverting the changes and each step would eventually return, but in any case this works also.

It did look meticulous in that way, I did not mean to doubt that.

Yet the reference to the array that was passed in, is also the one that was mutated thereafter. So if anything with the s=>zerializeRefs call were to use or mutate the ref at a later point, wouldn't this cause the spooky distance action that I fear?

I also added a commit to revert back to the terser style of using arrow functions without return statements. I personally find that style cumbersome in debugging and refactoring, but I added the commit if you really do prefer it.

While I do agree that debugging takes more steps, I prefer optimizing for reading on code that is straightforward enough where I don't expect a ton of debugging sessions. So it's a bit of a subjective value judgment.
Though even if, thankfully VSCode added some helpful structural editing commands here: when your cursor is within a terse arrow fn, press Cmd + . (aka "Quick Fix") and hit Enter (on "Add braces to arrow function"). So it's only 3 keypresses to make it more debuggable which I find to be an okay cost

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Re: arrow functions, sweet, thanks for the tip... Re: the mutating array, I'll admit it's hard to get my head around it. And thanks for the review!

@Gregoor
Copy link
Collaborator

Gregoor commented Jul 9, 2024

Thankslots for your work you two!

@Gregoor Gregoor merged commit 9fafb66 into commonbaseapp:main Jul 9, 2024
5 checks passed
@brettz9 brettz9 deleted the recursive branch July 9, 2024 15:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Handle recursive schemas
3 participants