Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PEP 639 followup #17105

Closed
ewdurbin opened this issue Nov 18, 2024 · 7 comments · Fixed by #17106
Closed

PEP 639 followup #17105

ewdurbin opened this issue Nov 18, 2024 · 7 comments · Fixed by #17106
Labels
APIs/feeds bug 🐛 needs discussion a product management/policy issue maintainers and users should discuss

Comments

@ewdurbin
Copy link
Member

I noted that the first upload using Metadata 2.4/PEP 639 came in a few hours ago for retry-toolkit at https://pypi.org/project/retry-toolkit/0.6.1/.

Inspecting the result, License-Expression was successfully stored, but handling of License-File metadata appears to have been mishandled.

Metadata for sdist:

License-File: LICENSE

and whl:

License-File: LICENSE

On closer inspection, I see that it was uploaded using pypa/hatch (which, sidenote, sends a generic httpx user-agent)...

Publishing locally to my dev environment I see that the post body has the key license_file:

Image

pypa/packaging and our implementation talk in license_files.

CCing in @ofek @pradyunsg and @brettcannon for input on what the "correct" POST key should be.

@ewdurbin ewdurbin added APIs/feeds bug 🐛 needs discussion a product management/policy issue maintainers and users should discuss labels Nov 18, 2024
@ofek
Copy link
Contributor

ofek commented Nov 18, 2024

@di
Copy link
Member

di commented Nov 18, 2024

The transition from raw metadata to the Metadata object seems to be correct:

>>> Metadata.from_email(b'Metadata-Version: 2.4\nVersion: 1.0\nName: Throwaway\nLicense-File: Something\nLicense-File: Something Else').license_files
['Something', 'Something Else']

So it would seem that something is wrong in our translation from Metadata object to a Release?

@ofek
Copy link
Contributor

ofek commented Nov 18, 2024

Yeah the path is relative to the project root so Hatchling has the correct behavior indeed https://peps.python.org/pep-0639/#add-license-file-field

@di
Copy link
Member

di commented Nov 18, 2024

Yeah, something's wrong in the way we are parsing this from the form:

diff --git a/tests/unit/forklift/test_metadata.py b/tests/unit/forklift/test_metadata.py
index e9e7a8c81..92335d3ac 100644
--- a/tests/unit/forklift/test_metadata.py
+++ b/tests/unit/forklift/test_metadata.py
@@ -29,15 +29,26 @@ def _assert_invalid_metadata(exc, field):

 class TestParse:
     def test_valid_from_file(self):
-        meta = metadata.parse(b"Metadata-Version: 2.1\nName: foo\nVersion: 1.0\n")
+        meta = metadata.parse(
+            b"Metadata-Version: 2.4\nName: foo\nVersion: 1.0\nLicense-File: Something\nLicense-File: Something Else"
+        )
         assert meta.name == "foo"
         assert meta.version == Version("1.0")
+        assert meta.license_files == [
+            "Something",
+            "Something Else",
+        ]

     def test_valid_from_form(self):
-        data = MultiDict(metadata_version="2.1", name="spam", version="2.0")
+        data = MultiDict(metadata_version="2.4", name="spam", version="2.0")
+        data.extend([("license_file", "Something"), ("license_file", "Something Else")])
         meta = metadata.parse(None, form_data=data)
         assert meta.name == "spam"
         assert meta.version == Version("2.0")
+        assert meta.license_files == [
+            "Something",
+            "Something Else",
+        ]

     def test_invalid_no_data(self):
         with pytest.raises(metadata.NoMetadataError):

Results in:

    def test_valid_from_form(self):
        data = MultiDict(metadata_version="2.4", name="spam", version="2.0")
        data.extend([("license_file", "Something"), ("license_file", "Something Else")])
        meta = metadata.parse(None, form_data=data)
        assert meta.name == "spam"
        assert meta.version == Version("2.0")
>       assert meta.license_files == [
            "Something",
            "Something Else",
        ]
E       AssertionError: assert equals failed
E         None                 [
E                                'Something',
E                                'Something Else',
E                              ]

tests/unit/forklift/test_metadata.py:48: AssertionError

(Note that the version assertion is passing as well?) edit: wrong version

@di
Copy link
Member

di commented Nov 18, 2024

I think we need to add a special case here:

for name in frozenset(data.keys()):
# We have to be lenient in the face of "extra" data, because the data
# value here might contain unrelated form data, so we'll skip thing for
# fields that aren't in our list of values.
raw_name = _FORM_TO_RAW_MAPPING.get(name)
if raw_name is None:
continue
# We use getall() here, even for fields that aren't multiple use,
# because otherwise someone could have e.g. two Name fields, and we
# would just silently ignore it rather than doing something about it.
value = data.getall(name) or []
# An empty string is invalid for all fields, treat it as if it wasn't
# provided in the first place
if value == [""]:
continue
# If this is one of our string fields, then we'll check to see if our
# value is a list of a single item. If it is then we'll assume that
# it was emitted as a single string, and unwrap the str from inside
# the list.
#
# If it's any other kind of data, then we haven't the faintest clue
# what we should parse it as, and we have to just add it to our list
# of unparsed stuff.
if raw_name in _STRING_FIELDS and len(value) == 1:
raw[raw_name] = value[0]
# If this is one of our list of string fields, then we can just assign
# the value, since forms *only* have strings, and our getall() call
# above ensures that this is a list.
elif raw_name in _LIST_FIELDS:
raw[raw_name] = value
# Special Case: Keywords
# The keywords field is implemented in the metadata spec as a str,
# but it conceptually is a list of strings, and is serialized using
# ", ".join(keywords), so we'll do some light data massaging to turn
# this into what it logically is.
elif raw_name == "keywords" and len(value) == 1:
raw[raw_name] = _parse_keywords(value[0])
# Special Case: Project-URL
# The project urls is implemented in the metadata spec as a list of
# specially-formatted strings that represent a key and a value, which
# is fundamentally a mapping, however the email format doesn't support
# mappings in a sane way, so it was crammed into a list of strings
# instead.
#
# We will do a little light data massaging to turn this into a map as
# it logically should be.
elif raw_name == "project_urls":
try:
raw[raw_name] = _parse_project_urls(value)
except KeyError:
unparsed[name] = value
# Nothing that we've done has managed to parse this, so it'll just
# throw it in our unparseable data and move on.
else:
unparsed[name] = value

@ewdurbin
Copy link
Member Author

@di something like this? #17106

@ewdurbin
Copy link
Member Author

Yeah, so looks like hatch is doing the right thing @ofek, I was just easily confused by the pluralization stuff... which is pretty inconsistent on both sides of specifications 😮‍💨

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
APIs/feeds bug 🐛 needs discussion a product management/policy issue maintainers and users should discuss
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants