Aliases, and how they are supposed to be used #888
Comments
Hey @nscuro! Thanks for the very detailed issue! Can you explain a bit more about the exact use case you're trying to achieve here with the OSV data? Are you trying to build your own graph representation? For the OSV schema, we actively avoided making an explicit distinction between a group of vulnerabilities (advisory) and a single vulnerability, to keep things simple. In terms of the end result we want to enable, it's the same -- the ability to identify which package versions are affected and which versions to update to. This is how we envision a vulnerability scanner working with our data:
Under this workflow, it seems to make sense to group all of the related vulnerabilities together, so users have the full context on what all the vulnerability sources say, and updates/remediation steps account for all relevant entries in the same group. The fact that some of these are "advisories" should not matter -- having them be split up would have the same effect. If this is an issue of semantics and representation, we can certainly ask our data sources to use
Our use case is not primarily about recommending a fixed version to an end user ("updating to version X will resolve all these issues"); it's more about tracking risk and making it transparent. So knowing which vulnerabilities are the same and which are not does matter to us. We also have a VEX-like use case, where users (or machines) evaluate whether a project is actually affected by a vulnerability and record their decision. Obviously we want to avoid redundant work: a decision should not have to be recorded for GHSA-28r2-q6m8-9hpx and CVE-2022-30323 separately, as they describe the same thing. On the other hand, we don't want the same decision being applied to different vulnerabilities (CVE-2022-30323 vs. CVE-2022-26945), because the exposure, attack vector, impact etc. may differ. Approaching this use case the other way around, if a vendor provided a VEX document stating that their product is not affected by CVE-2022-30323, this should also be applicable to actual aliases like GHSA-28r2-q6m8-9hpx, but not to CVE-2022-26945.
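A minimal sketch of the decision-sharing behaviour described above, assuming the true alias groups are already known. The `DecisionStore` class, its method names and the `"not_affected"` status string are hypothetical and only serve the illustration; they are not part of OSV or of any VEX format.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set


@dataclass
class DecisionStore:
    """Keeps one decision per alias group (hypothetical helper)."""

    # Maps every known ID to a representative ID of its alias group.
    group_of: Dict[str, str] = field(default_factory=dict)
    # Maps a representative ID to the recorded decision.
    decisions: Dict[str, str] = field(default_factory=dict)

    def add_alias_group(self, ids: Set[str]) -> None:
        # Any stable representative works; min() is an arbitrary choice.
        key = min(ids)
        for vuln_id in ids:
            self.group_of[vuln_id] = key

    def record(self, vuln_id: str, decision: str) -> None:
        # IDs without a known group are treated as their own group.
        self.decisions[self.group_of.get(vuln_id, vuln_id)] = decision

    def lookup(self, vuln_id: str) -> Optional[str]:
        return self.decisions.get(self.group_of.get(vuln_id, vuln_id))


store = DecisionStore()
store.add_alias_group({"GHSA-28r2-q6m8-9hpx", "CVE-2022-30323"})
store.record("CVE-2022-30323", "not_affected")

# The decision carries over to the true alias, but not to the other CVE.
ghsa_decision = store.lookup("GHSA-28r2-q6m8-9hpx")
other_decision = store.lookup("CVE-2022-26945")
```

With correct alias groups, `ghsa_decision` holds the recorded decision and `other_decision` is `None`. If the groups were derived from an advisory-style entry that lists both CVEs as aliases, the decision would leak to CVE-2022-26945 as well.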
That'd be great!
Got it, thanks for explaining! Are you thinking of recording VEX on a per-package basis, such that users can transitively determine from the entire dependency graph if they're actually indirectly affected by a vulnerability?
We'll start conversations here with Go, and fix up the Debian ones.
This issue has not had any activity for 60 days and will be automatically closed in two weeks
Commenting to signal that this issue is still relevant. I am glad to see, however, that there is a continuous effort to improve the situation :)
Thanks! Removed the stale tags.
Hey OSV team, thanks for your great work!
We're currently looking at how we can correlate vulnerabilities that describe the same thing.
As per specification, OSV has the `aliases` field for this:

At least in my interpretation, aliasing is a bidirectional relationship that also applies transitively.
If `X` aliases `Y` and `Z`, `Y` should also alias `X`, and `Y` should also alias `Z`. If they all describe the same thing, that should be a valid assumption.

However, in reality, we see that many vulnerability databases (ab-)use the OSV schema to publish advisories. In my understanding, a vulnerability would describe one defect, and that one defect only. Whereas an advisory can potentially refer to multiple vulnerabilities (as in "we patched all these vulnerabilities in version 1.2.3 of our package"). This appears to be a common thing for at least the Go, Rust, and (especially) Debian ecosystems in the OSV database. There are most likely more, but these have been the most obvious candidates to us.
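The bidirectional and transitive reading above amounts to taking connected components of the declared alias links. The following is a minimal sketch of that interpretation only; the function name `alias_groups` and the input shape (a mapping from an entry ID to its declared aliases) are assumptions made for the example, not an OSV API.

```python
from collections import defaultdict
from typing import Dict, List, Set


def alias_groups(declared: Dict[str, List[str]]) -> List[Set[str]]:
    """Group IDs by treating declared aliases as symmetric and transitive."""
    # Bidirectional: every declared alias becomes an undirected edge.
    graph = defaultdict(set)
    for source, targets in declared.items():
        graph[source]  # ensure entries without aliases still appear
        for target in targets:
            graph[source].add(target)
            graph[target].add(source)

    # Transitive: each connected component is one alias group.
    groups: List[Set[str]] = []
    seen: Set[str] = set()
    for start in list(graph):
        if start in seen:
            continue
        group: Set[str] = set()
        stack = [start]
        while stack:
            node = stack.pop()
            if node in group:
                continue
            group.add(node)
            stack.extend(graph[node] - group)
        seen |= group
        groups.append(group)
    return groups


# Only X declares aliases, yet Y and Z end up in the same group as X.
groups = alias_groups({"X": ["Y", "Z"]})
```

Here `groups` contains a single set holding `X`, `Y` and `Z`, which is only a sound result when every declared alias really describes the same defect.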
For example, GO-2022-0586 presumably aliases four CVEs and four GHSAs:
These are four different vulnerabilities, with different CWEs, descriptions and severities. CVEs and GHSAs actually alias each other in pairs of two (`GHSA-28r2-q6m8-9hpx` aliases `CVE-2022-30323`, but not `CVE-2022-26945` etc.):

In cases of advisories like this, the "aliases" are neither bidirectional (`GHSA-28r2-q6m8-9hpx` isn't really the same as `GO-2022-0586`), nor are they fully transitive (`CVE-2022-26945` is not the same as `CVE-2022-30323`). If one were to attempt to find all aliases for `GHSA-28r2-q6m8-9hpx` here, traversing this graph would yield wrong results.

The Debian ecosystem especially has many of these scenarios, where one DLA can refer to loads of CVEs:
I have the feeling that OSV entries of type "advisory" (maybe such a distinction would be good to have?) should instead use the `related` field. Although I imagine this will be hard to enforce, and even harder to apply in an automated fashion.

Am I understanding aliasing in OSV correctly? Is this a data quality issue with the databases that use the OSV schema? Is there anything we can do about it?
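To make the traversal problem described above concrete, here is a minimal sketch that walks alias links from `GHSA-28r2-q6m8-9hpx`. The entries are trimmed to the identifiers named in this issue (the real records list more), and the helper `reachable_aliases` is hypothetical. The second data set assumes `GO-2022-0586` listed the same identifiers under `related` instead of `aliases`.

```python
from collections import defaultdict
from typing import Dict, List, Set

Entries = Dict[str, Dict[str, List[str]]]


def reachable_aliases(start: str, entries: Entries) -> Set[str]:
    """Follow `aliases` links in both directions, ignoring `related`."""
    graph = defaultdict(set)
    for entry_id, fields in entries.items():
        for alias in fields.get("aliases", []):
            graph[entry_id].add(alias)
            graph[alias].add(entry_id)

    seen, stack = {start}, [start]
    while stack:
        for neighbour in graph.get(stack.pop(), ()):
            if neighbour not in seen:
                seen.add(neighbour)
                stack.append(neighbour)
    return seen - {start}


go_links = ["CVE-2022-26945", "CVE-2022-30323", "GHSA-28r2-q6m8-9hpx"]

# Advisory-style entry publishes everything it covers as aliases.
as_published: Entries = {
    "GO-2022-0586": {"aliases": go_links},
    "GHSA-28r2-q6m8-9hpx": {"aliases": ["CVE-2022-30323"]},
}

# Same data, but the advisory uses `related` (assumed alternative).
with_related: Entries = {
    "GO-2022-0586": {"related": go_links},
    "GHSA-28r2-q6m8-9hpx": {"aliases": ["CVE-2022-30323"]},
}

naive = reachable_aliases("GHSA-28r2-q6m8-9hpx", as_published)
fixed = reachable_aliases("GHSA-28r2-q6m8-9hpx", with_related)
```

In this sketch `naive` wrongly includes `CVE-2022-26945` (and `GO-2022-0586` itself), while `fixed` contains only `CVE-2022-30323`.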