Update to supporting Pydantic V2 #203

Merged
81 commits merged on Jan 22, 2024

Commits (81)
a6f3288
Change Pydantic dependency requirement to `~= 2.4`
candleindark Nov 4, 2023
4395ad4
Merge branch 'master' into pydantic-v2
candleindark Nov 14, 2023
1decc23
Migrate to Pydantic 2.0 using bump-pydantic 0.7.0
candleindark Nov 14, 2023
14f742c
Merge branch 'master' into with-bump-pydantic
candleindark Nov 17, 2023
3b4ecf3
Update from `BaseModel.__fields__` to `BaseModel.model_fields`
candleindark Nov 17, 2023
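For context, the V1 `__fields__` mapping of `ModelField` objects becomes `model_fields`, a mapping of `FieldInfo` objects, in Pydantic V2. A minimal sketch of the new access pattern (the model and field names are illustrative, not taken from this PR):

```python
from pydantic import BaseModel, Field


class Example(BaseModel):
    name: str = Field(json_schema_extra={"nskey": "schema"})


# V1: Example.__fields__["name"].field_info.extra["nskey"]
# V2: the metadata sits directly on the FieldInfo object.
for field_name, field_info in Example.model_fields.items():
    print(field_name, field_info.annotation, field_info.json_schema_extra)
```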
b211bfd
Remove `DandiBaseModel.json_dict`
candleindark Nov 21, 2023
b697d76
Remove `DandiBaseModel.unvalidated`
candleindark Nov 21, 2023
e0f75ce
Keep `DandiBaseModel.json_dict()`
candleindark Nov 29, 2023
92b5cb9
Keeping `DandiBaseModel.unvalidated()`
candleindark Nov 29, 2023
739db75
Replace the use of `parse_obj_as()` with `TypeAdapter`
candleindark Nov 30, 2023
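`parse_obj_as()` is deprecated in Pydantic V2 in favor of `TypeAdapter`. A sketch of the replacement pattern (the validated type here is illustrative):

```python
from typing import List

from pydantic import AnyHttpUrl, TypeAdapter

# V1: urls = parse_obj_as(List[AnyHttpUrl], ["https://dandiarchive.org"])
adapter = TypeAdapter(List[AnyHttpUrl])
urls = adapter.validate_python(["https://dandiarchive.org"])
```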
5528c3d
Replace the use of `update_forward_refs()` with `model_rebuild()`
candleindark Dec 1, 2023
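`update_forward_refs()` becomes `model_rebuild()` in V2; a sketch with an illustrative self-referencing model:

```python
from typing import List, Optional

from pydantic import BaseModel


class Node(BaseModel):
    children: Optional[List["Node"]] = None


# V1: Node.update_forward_refs()
Node.model_rebuild()
```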
ca3fff5
Replace the use of `BaseModel.json()` with `BaseModel.model_dump_json()`
candleindark Dec 1, 2023
1d85790
Replace the use of `BaseModel.parse_raw()` with `BaseModel.model_vali…
candleindark Dec 1, 2023
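The two serialization renames above follow the standard V2 pattern; a sketch with an illustrative model:

```python
from pydantic import BaseModel


class Digest(BaseModel):
    algorithm: str
    value: str


d = Digest(algorithm="md5", value="abc123")

json_str = d.model_dump_json()                    # V1: d.json()
roundtrip = Digest.model_validate_json(json_str)  # V1: Digest.parse_raw(json_str)
assert roundtrip == d
```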
5616209
Add additional comments regarding `BaseModel.model_rebuild()`
candleindark Dec 1, 2023
e1adfb1
Migrate the use of `@validator()` to `@field_validator` for `schemaKey`
candleindark Dec 4, 2023
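A sketch of the `@validator` to `@field_validator` migration; the model and the check are illustrative, not the PR's actual `schemaKey` validator:

```python
from pydantic import BaseModel, field_validator


class Thing(BaseModel):
    schemaKey: str

    # V1: @validator("schemaKey")
    @field_validator("schemaKey")
    @classmethod
    def check_schema_key(cls, v: str) -> str:
        if not v:
            raise ValueError("schemaKey must be a non-empty string")
        return v
```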
8c0e159
Specify default of `None` for `PropertyValue.value`
candleindark Dec 4, 2023
c21f918
Migrate from `@validator` to `@field_validator` for `PropertyValue.va…
candleindark Dec 4, 2023
3b5fa3a
Migrate from `@validator` to `@field_validator` for `BareAsset.digest`
candleindark Dec 4, 2023
acef6e1
Migrate from `@validator` to `@field_validator` for `PublishedAsset.d…
candleindark Dec 4, 2023
c5b51bb
Migrate from `@root_validator` to `@model_validator` for `Resource`
candleindark Dec 4, 2023
7fcb9be
Migrate from `@root_validator` to `@model_validator` for `AccessRequi…
candleindark Dec 4, 2023
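Whole-model validation moves from `@root_validator` to `@model_validator` in V2; a sketch of the `mode="after"` form (the model and rule are illustrative, not the PR's validators):

```python
from pydantic import BaseModel, model_validator


class LinkedItem(BaseModel):
    identifier: str = ""
    url: str = ""

    # V1: @root_validator
    @model_validator(mode="after")
    def check_identifier_or_url(self) -> "LinkedItem":
        if not self.identifier and not self.url:
            raise ValueError("either identifier or url must be provided")
        return self
```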
b9604f5
Use `json_schema_extra` to pass extra JSON Schema properties
candleindark Dec 4, 2023
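In V2, arbitrary extra keyword arguments to `Field()` are no longer collected automatically; they move under `json_schema_extra`. A sketch using the repo-style `nskey` key (the model itself is illustrative):

```python
from typing import Optional

from pydantic import BaseModel, Field


class Contact(BaseModel):
    # V1: email: Optional[str] = Field(None, nskey="schema")
    email: Optional[str] = Field(default=None, json_schema_extra={"nskey": "schema"})


# The extra key ends up on the generated JSON Schema for the property.
print(Contact.model_json_schema()["properties"]["email"])
```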
14b0d16
Migrate the use of `BaseModel.schema` and `BaseModel.schema_json` to …
candleindark Dec 5, 2023
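`BaseModel.schema()` and `BaseModel.schema_json()` become `model_json_schema()` in V2; there is no direct `schema_json` equivalent, so the resulting dict is dumped with `json.dumps`. A sketch with an illustrative model:

```python
import json

from pydantic import BaseModel


class Example(BaseModel):
    name: str


schema_dict = Example.model_json_schema()        # V1: Example.schema()
schema_text = json.dumps(schema_dict, indent=2)  # V1: Example.schema_json(indent=2)
```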
fe85089
Migrate the use of `Config` class to `model_config`
candleindark Dec 6, 2023
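The nested `Config` class is replaced by a `model_config` attribute built from `ConfigDict`; a sketch with common options (not necessarily the exact set this repo uses):

```python
from pydantic import BaseModel, ConfigDict


class Strict(BaseModel):
    # V1:
    # class Config:
    #     extra = "forbid"
    #     allow_population_by_field_name = True
    model_config = ConfigDict(extra="forbid", populate_by_name=True)

    name: str
```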
1d8724f
Migrate to the use of `json_schema_extra` in `pydantic.Field`
candleindark Dec 6, 2023
ba04133
Correct typo in kwarg to `Field`
candleindark Dec 6, 2023
97bf307
Migrate to the use of `BaseModel.model_json_schema` in dynamic calls
candleindark Dec 6, 2023
5b2e6af
Replace use of `DandiBaseModel.unvalidated` with `BaseModel.model_con…
candleindark Dec 6, 2023
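`model_construct()` builds an instance without running validation, which is what takes over from the custom `DandiBaseModel.unvalidated()` helper. A sketch with an illustrative model:

```python
from pydantic import BaseModel


class Asset(BaseModel):
    path: str
    size: int


# No validation is performed, so incomplete data is accepted as-is.
draft = Asset.model_construct(path="sub-01/probe.nwb")
```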
6d95f7c
Define `ByteSizeJsonSchema`
candleindark Dec 8, 2023
9668104
Replace use of `ByteSize` with `ByteSizeJsonSchema`
candleindark Dec 8, 2023
57d9600
Update signatures and usage for `post_process_json_schema`
candleindark Dec 8, 2023
16be659
Update Pydantic Metaclass import path
candleindark Dec 8, 2023
b6443ec
Update tests to match new schema path for definitions
candleindark Dec 8, 2023
496d431
Update pydantic ModelMetaclass references in tests
candleindark Dec 8, 2023
a976015
Migrate the use of `BaseModel.__fields__` to `BaseModel.model_fields`
candleindark Dec 8, 2023
64522a6
Adapt to the new representation of a private attribute
candleindark Dec 8, 2023
ecd98ed
Migrate the use of `BaseModel.__fields__` to `BaseModel.model_fields`
candleindark Dec 8, 2023
be176ab
Adapt to the `FieldInfo` the new class for representing a field in `g…
candleindark Dec 8, 2023
6d945a9
Adapt to the `FieldInfo` the new class for representing a field in `t…
candleindark Dec 8, 2023
fc423ae
Adapt to the new representation of private attribute in `test_duplica…
candleindark Dec 8, 2023
d251dc5
Adapt to `FieldInfo` the new class for representing a field in `test_…
candleindark Dec 8, 2023
db55273
Update dandischema/metadata.py
candleindark Dec 8, 2023
439d07b
Update JSON serialization methods in zarr.py
candleindark Dec 8, 2023
fc72189
Remove redundant comment
candleindark Dec 8, 2023
ae9a09b
Provide missing import for `typing.get_origin`
candleindark Dec 8, 2023
0b14166
Rename args in `ByteSizeJsonSchema.__get_pydantic_json_schema__()`
candleindark Dec 8, 2023
7808f87
Replace logic for checking for Pydantic models in `models.py`
candleindark Dec 8, 2023
e095ed7
Eliminate unnecessary JSON dumping and loading in tests
candleindark Dec 8, 2023
50ee928
Merge master with conflict manually resolved
candleindark Dec 9, 2023
82f802f
Use `BaseModel.__get_pydantic_json_schema__()`
candleindark Dec 11, 2023
ab462ec
Add JSON schema to `Pydantic.ByteSize` through annotation
candleindark Dec 11, 2023
0d38117
Add JSON schema to `Pydantic.ByteSize` through annotation
candleindark Dec 11, 2023
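In V2, a custom JSON Schema can be attached to a type through `typing.Annotated`; a sketch of the idea behind these commits (the alias name and schema dict are assumptions, not the PR's code):

```python
from typing import Annotated

from pydantic import BaseModel, ByteSize, WithJsonSchema

# Hypothetical alias; the schema dict here is an assumption.
ByteSizeJsonSchema = Annotated[ByteSize, WithJsonSchema({"type": "integer"})]


class Blob(BaseModel):
    contentSize: ByteSizeJsonSchema


print(Blob.model_json_schema()["properties"]["contentSize"])
```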
2c9d9e3
Implement `TransitionalGenerateJsonSchema`
candleindark Dec 12, 2023
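Pydantic V2 generates JSON Schema through a pluggable `GenerateJsonSchema` class, so a transitional subclass can keep the output closer to the V1-era schemas. A sketch in that spirit — unwrapping the `anyOf: [..., {"type": "null"}]` form that `Optional` fields get by default — not the PR's exact implementation:

```python
from typing import Optional

from pydantic import BaseModel
from pydantic.json_schema import GenerateJsonSchema, JsonSchemaValue
from pydantic_core import core_schema


class TransitionalGenerateJsonSchema(GenerateJsonSchema):
    def nullable_schema(self, schema: core_schema.NullableSchema) -> JsonSchemaValue:
        # Emit only the schema of the inner (non-null) type.
        return self.generate_inner(schema["schema"])


class Example(BaseModel):
    name: Optional[str] = None


print(Example.model_json_schema(schema_generator=TransitionalGenerateJsonSchema))
```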
ce49d6a
Utilize `TransitionalGenerateJsonSchema` in JSON schema generation
candleindark Dec 12, 2023
193b4a8
Update `test_asset_digest`
candleindark Dec 12, 2023
545adb9
Update `test_dantimeta_1`
candleindark Dec 12, 2023
328cefa
Update `test_missing_ok`
candleindark Dec 12, 2023
00f0e91
Update `metadata.validate`
candleindark Dec 12, 2023
a23c521
Disable type check for unpacking dict as kwargs
candleindark Dec 13, 2023
d508645
Convert `AnyHttpURL` to a `str`
candleindark Dec 13, 2023
e9579f3
Check for `field.json_schema_extra` being a dict
candleindark Dec 13, 2023
60c51ef
Cast `field.json_schema_extra["nskey"]` to `str`
candleindark Dec 13, 2023
5733137
`ignore[call-arg]` for calling `BaseModel.model_construct()`
candleindark Dec 13, 2023
bbee1dc
Correct type annotation
candleindark Dec 13, 2023
a0e043c
Correct type annotation
candleindark Dec 13, 2023
9676268
Remove unreachable code block
candleindark Dec 13, 2023
1e5a224
Define `utils.strip_top_level_optional`
candleindark Dec 14, 2023
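A sketch of what a `strip_top_level_optional()` helper could look like — return the inner type of a top-level `Optional[X]`, otherwise the annotation unchanged; this is an assumption about its behavior, not the PR's exact code:

```python
from typing import Any, List, Optional, Union, get_args, get_origin


def strip_top_level_optional(annotation: Any) -> Any:
    """Return X for Optional[X]; otherwise return the annotation unchanged."""
    if get_origin(annotation) is Union:
        args = get_args(annotation)
        non_none = [a for a in args if a is not type(None)]
        if len(args) == 2 and len(non_none) == 1:
            return non_none[0]
    return annotation


assert strip_top_level_optional(Optional[List[str]]) == List[str]
assert strip_top_level_optional(int) is int
```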
41f5efc
Specify tests for `utils.strip_top_level_optional`
candleindark Dec 14, 2023
1a707e6
Adapt `generate_context` to field annotation in Pydantic V2
candleindark Dec 14, 2023
2fcc005
Merge branch 'master' into pydantic-v2
satra Dec 14, 2023
0c97b8e
Revert marking of `@container` using stringification
candleindark Dec 19, 2023
a104fb2
Merge branch 'master' into pydantic-v2
satra Dec 21, 2023
bddffd8
Boost dandi schema version to 0.7.0 (since exported jsonschema does h…
yarikoptic Jan 16, 2024
e36e003
Added placeholder for where to add migration
yarikoptic Jan 16, 2024
54315ae
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Jan 16, 2024
7eb7272
Remove unneeded code in `TransitionalGenerateJsonSchema.nullable_schema`
candleindark Jan 18, 2024
51d4c53
Remove stub for migration to schema 0.7.0
candleindark Jan 19, 2024
b718dc2
Use JSON Schema validator for Draft 2020-12 in `_validate_obj_json`
candleindark Jan 19, 2024
2e97166
Annotate two possible type of validator in `_validate_obj_json`
candleindark Jan 19, 2024
889c775
Change the current version of the generated schema to 0.6.5 instead
candleindark Jan 19, 2024
2d48a09
Merge branch 'master' into pydantic-v2
yarikoptic Jan 22, 2024
13 changes: 11 additions & 2 deletions dandischema/consts.py
@@ -1,5 +1,14 @@
DANDI_SCHEMA_VERSION = "0.6.4"
ALLOWED_INPUT_SCHEMAS = ["0.4.4", "0.5.1", "0.5.2", "0.6.0", "0.6.1", "0.6.2", "0.6.3"]
DANDI_SCHEMA_VERSION = "0.6.5"
ALLOWED_INPUT_SCHEMAS = [
"0.4.4",
"0.5.1",
"0.5.2",
"0.6.0",
"0.6.1",
"0.6.2",
"0.6.3",
"0.6.4",
]

# ATM we allow only for a single target version which is current
# migrate has a guard now for this since it cannot migrate to anything but current
2 changes: 1 addition & 1 deletion dandischema/datacite.py
@@ -219,7 +219,7 @@
else:
continue
elif rel_el.url is not None:
ident_id = rel_el.url
ident_id = str(rel_el.url)

Codecov warning on dandischema/datacite.py#L222: added line was not covered by tests.
ident_tp = "URL"
rel_dict = {
"relatedIdentifier": ident_id.lower(),
13 changes: 10 additions & 3 deletions dandischema/digests/zarr.py
@@ -2,6 +2,7 @@

from functools import total_ordering
import hashlib
import json
import re
from typing import Dict, List, Optional, Tuple

@@ -109,7 +110,10 @@ def aggregate_digest(self, checksums: ZarrChecksums) -> str:
"""Generate an aggregated digest for a list of ZarrChecksums."""
# Use the most compact separators possible
# content = json.dumps([asdict(zarr_md5) for zarr_md5 in checksums], separators=(',', ':'))0
content = checksums.json(**ENCODING_KWARGS) # type: ignore[arg-type]
content = json.dumps(
checksums.model_dump(mode="json"),
**ENCODING_KWARGS, # type: ignore[arg-type]
)
h = hashlib.md5()
h.update(content.encode("utf-8"))
md5 = h.hexdigest()
@@ -125,14 +129,17 @@
def serialize(self, zarr_checksum_listing: ZarrChecksumListing) -> str:
"""Serialize a ZarrChecksumListing into a string."""
# return json.dumps(asdict(zarr_checksum_listing))
return zarr_checksum_listing.json(**ENCODING_KWARGS) # type: ignore[arg-type]
return json.dumps(
zarr_checksum_listing.model_dump(mode="json"),
**ENCODING_KWARGS, # type: ignore[arg-type]
)

def deserialize(self, json_str: str) -> ZarrChecksumListing:
"""Deserialize a string into a ZarrChecksumListing."""
# listing = ZarrChecksumListing(**json.loads(json_str))
# listing.checksums = [ZarrChecksum(**checksum) for checksum in listing.checksums]
# return listing
return ZarrChecksumListing.parse_raw(json_str)
return ZarrChecksumListing.model_validate_json(json_str)

def generate_listing(
self,
87 changes: 70 additions & 17 deletions dandischema/metadata.py
@@ -1,8 +1,10 @@
from copy import deepcopy
from enum import Enum
from functools import lru_cache
from inspect import isclass
import json
from pathlib import Path
from typing import Any, Dict, Iterable, Optional, TypeVar, Union, cast
from typing import Any, Dict, Iterable, Optional, TypeVar, Union, cast, get_args

import jsonschema
import pydantic
@@ -16,7 +18,12 @@
)
from .exceptions import JsonschemaValidationError, PydanticValidationError
from . import models
from .utils import _ensure_newline, version2tuple
from .utils import (
TransitionalGenerateJsonSchema,
_ensure_newline,
strip_top_level_optional,
version2tuple,
)

schema_map = {
"Dandiset": "dandiset.json",
@@ -57,29 +64,56 @@ def generate_context() -> dict:
fields: Dict[str, Any] = {}
for val in dir(models):
klass = getattr(models, val)
if not isinstance(klass, pydantic.main.ModelMetaclass):
if not isclass(klass) or not issubclass(klass, pydantic.BaseModel):
continue
if hasattr(klass, "_ldmeta"):
if "nskey" in klass._ldmeta:
if "nskey" in klass._ldmeta.default:
name = klass.__name__
fields[name] = f'{klass._ldmeta["nskey"]}:{name}'
for name, field in klass.__fields__.items():
fields[name] = f'{klass._ldmeta.default["nskey"]}:{name}'
for name, field in klass.model_fields.items():
if name == "id":
fields[name] = "@id"
elif name == "schemaKey":
fields[name] = "@type"
elif name == "digest":
fields[name] = "@nest"
elif name not in fields:
if "nskey" in field.field_info.extra:
fields[name] = {"@id": field.field_info.extra["nskey"] + ":" + name}
if (
isinstance(field.json_schema_extra, dict)
and "nskey" in field.json_schema_extra
):
fields[name] = {
"@id": cast(str, field.json_schema_extra["nskey"]) + ":" + name
}
else:
fields[name] = {"@id": "dandi:" + name}
if "List" in str(field.outer_type_):

# The annotation without the top-level optional
stripped_annotation = strip_top_level_optional(field.annotation)

# Using stringification to detect present of list in annotation is not
# ideal, but it works for now. A better solution should be used in the
# future.
if "list" in str(stripped_annotation).lower():
fields[name]["@container"] = "@set"

# Handle the case where the type of the element of a list is
# an Enum type
type_args = get_args(stripped_annotation)
if (
len(type_args) == 1
and isclass(type_args[0])
and issubclass(type_args[0], Enum)
):
fields[name]["@type"] = "@id"

if name == "contributor":
fields[name]["@container"] = "@list"
if "enum" in str(field.type_) or name in ["url", "hasMember"]:
if (
isclass(stripped_annotation)
and issubclass(stripped_annotation, Enum)
or name in ["url", "hasMember"]
):
fields[name]["@type"] = "@id"

for item in models.DigestType:
@@ -97,7 +131,14 @@ def publish_model_schemata(releasedir: Union[str, Path]) -> Path:
vdir.mkdir(exist_ok=True, parents=True)
for class_, filename in schema_map.items():
(vdir / filename).write_text(
_ensure_newline(getattr(models, class_).schema_json(indent=2))
_ensure_newline(
json.dumps(
getattr(models, class_).model_json_schema(
schema_generator=TransitionalGenerateJsonSchema
),
indent=2,
)
)
)
(vdir / "context.json").write_text(
_ensure_newline(json.dumps(generate_context(), indent=2))
@@ -106,9 +147,19 @@


def _validate_obj_json(data: dict, schema: dict, missing_ok: bool = False) -> None:
validator = jsonschema.Draft7Validator(
schema, format_checker=jsonschema.Draft7Validator.FORMAT_CHECKER
)
validator: Union[jsonschema.Draft202012Validator, jsonschema.Draft7Validator]

if version2tuple(data["schemaVersion"]) >= version2tuple("0.6.5"):
# schema version 0.7.0 and above is produced with Pydantic V2
# which is compliant with JSON Schema Draft 2020-12
validator = jsonschema.Draft202012Validator(
schema, format_checker=jsonschema.Draft202012Validator.FORMAT_CHECKER
)
else:
validator = jsonschema.Draft7Validator(
schema, format_checker=jsonschema.Draft7Validator.FORMAT_CHECKER
)

error_list = []
for error in sorted(validator.iter_errors(data), key=str):
if missing_ok and "is a required property" in error.message:
@@ -186,7 +237,9 @@ def validate(
if json_validation:
if schema_version == DANDI_SCHEMA_VERSION:
klass = getattr(models, schema_key)
schema = klass.schema()
schema = klass.model_json_schema(
schema_generator=TransitionalGenerateJsonSchema
)
else:
if schema_key not in schema_map:
raise ValueError(
@@ -201,7 +254,7 @@
except pydantic.ValidationError as exc:
messages = []
for el in exc.errors():
if not missing_ok or el["msg"] != "field required":
if not missing_ok or el["type"] != "missing":
messages.append(el)
if messages:
raise PydanticValidationError(messages) # type: ignore[arg-type]
@@ -348,4 +401,4 @@ def aggregate_assets_summary(metadata: Iterable[Dict[str, Any]]) -> dict:
len(stats.pop("tissuesample", [])) + len(stats.pop("slice", []))
) or None
stats["numberOfCells"] = len(stats.pop("cell", [])) or None
return models.AssetsSummary(**stats).json_dict()
return models.AssetsSummary(**stats).model_dump(mode="json", exclude_none=True)