Cannot convert with `from_attributes` when using a rename convention #630

LonelyVikingMichael · 2024-01-16T15:07:59Z

Description

Some example code using a Struct with rename="camel":

from dataclasses import dataclass

from msgspec import Struct, convert


@dataclass
class ThingModel:
    thing_id: str


class Thing(Struct, rename="camel"):
    thing_id: str


tm = ThingModel(thing_id="123")
t = convert(tm, Thing, from_attributes=True)

Raises:

Traceback (most recent call last):
  File "/app/backend/smalltest.py", line 16, in <module>
    t = convert(tm, Thing, from_attributes=True)
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
msgspec.ValidationError: Object missing required field `thingId`

Is there a way around this? For this specific example, we can leverage unpacking the dataclass' asdict() result, but this isn't always straightforward with an ORM instance. I actually discovered this behaviour using a SQLAlchemy model.

The text was updated successfully, but these errors were encountered:

jcrist · 2024-01-17T16:57:53Z

Thanks for opening this.

First, standardizing on terminology. Given your field thing_id above, "thing_id" is the attribute name, "thingId" is the renamed name.

Originally from_attributes was implemented to always use the attribute names rather than the renamed names (so you'd get the behavior you wanted here). I ended up switching it to use the renamed names for "uniformity" in 0f32524. Looking back I'm not sure if this was the best decision. Some random thoughts:

There are 3 kinds of things that may be converted into a struct/dataclass from by convert:

A dict. This is assumed to have come from some serialization protocol (the main point of msgspec). The renamed fields are always used, assuming that the renaming is done to match field naming conventions in that protocol.
A non-dict mapping. The assumption is this is some DB-related record (from this request). Here using the renamed fields may be nice to translate between the DB column names and the struct field names. But since you can alias the field names in a query maybe this should also always use the attribute names? Or maybe it should always use the renamed names since a mapping is dict-like and that's what dicts use?
An object with attributes, with from_attributes=True. The assumption is this is also some ORM-like DB-related record (from this request). Since this is by attribute the attributes must be valid attribute names - perhaps these should always use the attribute names? On the other hand diverging from 2 above which has the same use case may be confusing.

A few options:

Have objects-with-attributes always use the attribute names instead of the renamed names. This would be a breaking change, but we're still 0.* so those are acceptable when needed.
Have objects-with-attributes try the renamed names first then fallback to trying the attribute names before erroring (or maybe the other way around?). Maybe also apply to mappings? (I think no?). Right now I think this is my favorite option.
Add a new option to also accept the attribute names. This would be similar to pydantic's populate_by_name, but would probably be a kwarg to convert rather than an option on the struct type. Would this apply to dicts and mappings as well? I'm not sure. I'm reluctant to add a new option here, but could if needed.
Add a new option to use the attribute names instead of the renamed names. by_name=True or something. This wouldn't work well for nested types if there are realistic use cases where you might want to use the original names for some objects and the renamed names for others. Like the above, I'm reluctant to add a new config option here.

Thoughts?

LonelyVikingMichael · 2024-01-18T08:06:35Z

Hi Jim, thanks for taking the time to provide context.

In my own use cases (mostly REST APIs), I try to have everything internal to my application use snake_case, whereas the public JSON schemas use camelCase. However, this linked issue comment outlines that this isn't always the case.

As for solutions, having convert falling back on attribute names or vice versa would be fine by me, though this might be a problem in applications where performance is critical? There's probably no "right" choice between attribute names first or renamed names first, we can make valid arguments for both.

For what it is worth, Pydantic's by_alias/populate_by_name options have been helpful to me in the past, especially when integrating to third parties where naming is pre-defined, but I respect your stance on yet another config option.

jcrist · 2024-01-22T03:29:49Z

This is fixed by #636. I ended up going for the inference option - in practice the inference is pretty efficient and only results in at most 1 unnecessary getattr call. I opted to go for prioritizing the attribute names over the renamed names here, since in most cases converting an arbitrary object by attribute will want to use the attribute names.

jcrist mentioned this issue Jan 22, 2024

Infer naming convention when converting objects to structs #636

Merged

jcrist closed this as completed in #636 Jan 22, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Cannot convert with `from_attributes` when using a rename convention #630

Cannot convert with `from_attributes` when using a rename convention #630

LonelyVikingMichael commented Jan 16, 2024

jcrist commented Jan 17, 2024 •

edited

Loading

LonelyVikingMichael commented Jan 18, 2024

jcrist commented Jan 22, 2024

Cannot convert with from_attributes when using a rename convention #630

Cannot convert with from_attributes when using a rename convention #630

Comments

LonelyVikingMichael commented Jan 16, 2024

Description

jcrist commented Jan 17, 2024 • edited Loading

LonelyVikingMichael commented Jan 18, 2024

jcrist commented Jan 22, 2024

Cannot convert with `from_attributes` when using a rename convention #630

Cannot convert with `from_attributes` when using a rename convention #630

jcrist commented Jan 17, 2024 •

edited

Loading