Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Include BomRef within Component hash calculation #678

Draft
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

wkoot
Copy link

@wkoot wkoot commented Sep 19, 2024

Closes #753

Fixes #540
Fixes #677

@wkoot wkoot requested a review from a team as a code owner September 19, 2024 11:13
.github/workflows/python.yml Outdated Show resolved Hide resolved
.github/workflows/python.yml Outdated Show resolved Hide resolved
@jkowalleck
Copy link
Member

this is considered a breaking change, as it changes existing behavior in a non-backwards-compatible way

json = json_loads(input_json.read())

bom = Bom.from_json(json)
self.assertEqual(4, len(bom.components))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is a disputable assertion.
the two of the components of the original BOM were duplicated in the first place, why keep them?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the correct place for this to be disputed, is it part of the implementation or the spec discussion?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it is something that could be discussed here: https://github.com/CycloneDX/specification/discussions

Copy link
Author

@wkoot wkoot Sep 20, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where in the spec is it explained what the uniqueness criteria for components are?
And is it a bug that the libraries in other programming languages have implemented this differently, or is that also something that's an ongoing discussion?

@jkowalleck
Copy link
Member

this PR claims to solve #677
the original issue was

>>> bom.validate()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "lib\site-packages\cyclonedx\model\bom.py", line 666, in validate
    raise UnknownComponentDependencyException(
cyclonedx.exception.model.UnknownComponentDependencyException: One or more Components have Dependency references to Components/Services that are not known in this BOM. They are: {<BomRef 'test22' id=2111773432208>, <BomRef 'test21' id=2111773432160>}

none of the added tests showcase this broken/fixed behaviour.

@jkowalleck
Copy link
Member

this PR claims to solve #540

the original problem was

Traceback (most recent call last):
  File "/Users/tek30584/programming/cdx_lib_bugs/./duplicate_name_bug.py", line 38, in <module>
    print(JsonV1Dot5(bom).output_as_string(indent=2))
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/tek30584/programming/cdx_lib_bugs/.venv/lib/python3.11/site-packages/cyclonedx/output/json.py", line 82, in output_as_string
    self.generate()
  File "/Users/tek30584/programming/cdx_lib_bugs/.venv/lib/python3.11/site-packages/cyclonedx/output/json.py", line 70, in generate
    bom.validate()
  File "/Users/tek30584/programming/cdx_lib_bugs/.venv/lib/python3.11/site-packages/cyclonedx/model/bom.py", line 600, in validate
    raise UnknownComponentDependencyException(
cyclonedx.exception.model.UnknownComponentDependencyException: One or more Components have Dependency references to Components/Services that are not known in this BOM. They are: {<BomRef 'some-library2'>}

none of the added tests showcase this broken/fixed behaviour.

@jkowalleck jkowalleck marked this pull request as draft September 19, 2024 11:23
@wkoot wkoot force-pushed the wkoot-patch-1 branch 3 times, most recently from 1caaaf5 to 229585e Compare September 19, 2024 11:49
@@ -1783,7 +1783,7 @@ def __hash__(self) -> int:
self.mime_type, self.supplier, self.author, self.publisher,
self.description, self.scope, tuple(self.hashes),
tuple(self.licenses), self.copyright, self.cpe,
self.purl,
self.purl, self.bom_ref.value,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see why you go with self.bom_ref.value instead of self.bom_ref.
BomRef.__hash__ exists and is implemented to account for the __id__ of the "empty" object entity, so that None values are unequal.
Your implementation does not do so, and may cause unexpected behaviour here.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed I had originally only added self.bom_ref, but that would then not deduplicate the Nones. Is there a reason for the BomRef hash to include the id? It seems superfluous when there is a unique identifier. Note also that the code claims a random UUIDv4 should have been assigned when there is no ref, but this is not actually the case

@wkoot
Copy link
Author

wkoot commented Sep 20, 2024

this is considered a breaking change, as it changes existing behavior in a non-backwards-compatible way

Would it be desirable to only include the BomRef in the hash calc when it is set (i.e. not None)?
I then wonder what should be done about duplicate components where one has a ref and the other does not.

@jkowalleck
Copy link
Member

this is considered a breaking change, as it changes existing behavior in a non-backwards-compatible way

Would it be desirable to only include the BomRef in the hash calc when it is set (i.e. not None)? I then wonder what should be done about duplicate components where one has a ref and the other does not.

breaking changes are embraced and not a stopper. it was just a remark.

@jkowalleck
Copy link
Member

jkowalleck commented Jan 15, 2025

@wkoot , I just wanted to ask how things are going along.
Do you need any help with the open discussions/points?

@wkoot
Copy link
Author

wkoot commented Jan 15, 2025

@jkowalleck sadly, I currently don't have any more time allotted for sbom related stuff :(

@jkowalleck
Copy link
Member

I will take this over, to fix #753.
I will work in a branch derived from yours, so all the contributor's attribution should stay intact.

Thanks for kicking this off, @wkoot 👍

@jkowalleck
Copy link
Member

superseded by #754

@jkowalleck jkowalleck closed this Jan 17, 2025
@jkowalleck jkowalleck reopened this Jan 17, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
2 participants