feat: create script for updating lists of ecosystems (#303)
This introduces a new script to make it easier to ensure all "lists of
ecosystems" within this codebase remain up to date, including:
- the table in `docs/schema.md`
- the Go constants being introduced in #292
- the JSON schema in `validation/schema.json` (both the pattern and the
  enum being introduced in #296)

To make it a bit easier, I've introduced a top-level `ecosystems.json`
which maps each defined ecosystem to a markdown description, sorted
alphabetically (which the script also ensures). I felt this was easier
than trying to extract the list from markdown or another source, though
it does mean double quotes need to be manually escaped.
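
As a rough illustration of the approach described above (not the actual contents of `scripts/update-ecosystems-lists.py`), the sketch below loads the map, sorts it, renders the markdown table, and swaps it in between the two marker comments used in `docs/schema.md`. The function names are hypothetical, and the case-insensitive sort is an assumption inferred from the order of the generated table (`ConanCenter`, `CRAN`, `crates.io`).

```python
import json

# Marker comments as they appear in docs/schema.md in this commit.
BEGIN = "<!-- begin auto-generated ecosystems list -->"
END = "<!-- end auto-generated ecosystems list -->"


def load_ecosystems(text: str) -> dict[str, str]:
    """Parse the JSON map and return it sorted by ecosystem name.

    Assumption: sorting ignores case, which matches the table order
    in the docs but may differ from the real script.
    """
    ecosystems = json.loads(text)
    return dict(sorted(ecosystems.items(), key=lambda item: item[0].casefold()))


def render_table(ecosystems: dict[str, str]) -> str:
    """Render one markdown table row per ecosystem."""
    lines = ["| Ecosystem | Description |", "|-----------|-------------|"]
    for name, description in ecosystems.items():
        lines.append(f"| `{name}` | {description} |")
    return "\n".join(lines)


def replace_between_markers(document: str, table: str) -> str:
    """Replace everything between the BEGIN and END markers with the table."""
    start = document.index(BEGIN) + len(BEGIN)
    end = document.index(END)
    return document[:start] + "\n\n" + table + "\n\n" + document[end:]
```

Because the replacement is bounded by the markers, running it a second time on its own output leaves the document unchanged, which is what lets a CI job detect stale lists with a plain `git diff`. The trailing "Your ecosystem here." row is not handled by this sketch.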

I went with JSON as it can be read without requiring an external
dependency, though if we require Python 3.11 or newer we could switch to
TOML instead, as those versions ship with `tomllib`.
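
To make the trade-off concrete, here is a small sketch comparing the two formats using only the standard library. `parse_ecosystems` and the sample strings are hypothetical, not part of this change. The TOML sample uses a literal (single-quoted) string, which is one way the double-quote escaping mentioned above could be avoided.

```python
import json
import sys

# The same entry in both formats. JSON needs the inner quotes escaped;
# a TOML literal string (single quotes) does not.
JSON_SAMPLE = r'{"Debian": "the ecosystem string \"Debian:7\""}'
TOML_SAMPLE = "Debian = 'the ecosystem string \"Debian:7\"'"


def parse_ecosystems(text: str, fmt: str = "json") -> dict[str, str]:
    """Parse an ecosystems map from JSON or TOML text."""
    if fmt == "json":
        return json.loads(text)
    if fmt == "toml":
        if sys.version_info < (3, 11):
            raise RuntimeError("tomllib requires Python 3.11 or newer")
        import tomllib  # standard library from Python 3.11 onwards

        return tomllib.loads(text)
    raise ValueError(f"unsupported format: {fmt}")
```

Note that in TOML, keys containing dots or spaces (such as `crates.io` or `Red Hat`) would need to be quoted.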

Example of the workflow output:


![image](https://github.com/user-attachments/assets/aaff0cd4-6387-497f-9869-62ac1b839e58)


![image](https://github.com/user-attachments/assets/057fb2e1-e2ca-4f9b-a704-c116ed69a69f)

---------

Signed-off-by: Gareth Jones <[email protected]>
G-Rath authored Nov 5, 2024
1 parent 29f64c7 commit c767f97
Showing 6 changed files with 236 additions and 27 deletions.
19 changes: 19 additions & 0 deletions .github/workflows/checks.yml
@@ -57,3 +57,22 @@ jobs:
          check-latest: true

      - run: go test ./...

  ecosystem_lists:
    permissions:
      contents: read # to fetch code (actions/checkout)
    name: Check ecosystem lists
    runs-on: ubuntu-latest
    steps:
      - name: Check out code
        uses: actions/checkout@eef61447b9ff4aafe5dcd4e0bbf5d482be7e7871 # v4.2.1
        with:
          persist-credentials: false
      - uses: actions/setup-python@f677139bbe7f9c59b41e40162b753c062f5d49a3 # v5.2.0
        with:
          python-version: 3.13
      - run: python3 ./scripts/update-ecosystems-lists.py
      - run: |
          git diff --name-only \
            | xargs -I '{}' bash -c \
              'echo "::error file={}::This needs to be regenerated by running \`python3 ./scripts/update-ecosystems-lists.py\`" && false'
7 changes: 7 additions & 0 deletions bindings/go/osvschema/constants.go
@@ -8,17 +8,22 @@ const (
	EcosystemAndroid       Ecosystem = "Android"
	EcosystemBioconductor  Ecosystem = "Bioconductor"
	EcosystemBitnami       Ecosystem = "Bitnami"
	EcosystemChainguard    Ecosystem = "Chainguard"
	EcosystemConanCenter   Ecosystem = "ConanCenter"
	EcosystemCRAN          Ecosystem = "CRAN"
	EcosystemCratesIO      Ecosystem = "crates.io"
	EcosystemDebian        Ecosystem = "Debian"
	EcosystemGHC           Ecosystem = "GHC"
	EcosystemGitHubActions Ecosystem = "GitHub Actions"
	EcosystemGo            Ecosystem = "Go"
	EcosystemHackage       Ecosystem = "Hackage"
	EcosystemHex           Ecosystem = "Hex"
	EcosystemLinux         Ecosystem = "Linux"
	EcosystemMageia        Ecosystem = "Mageia"
	EcosystemMaven         Ecosystem = "Maven"
	EcosystemNPM           Ecosystem = "npm"
	EcosystemNuGet         Ecosystem = "NuGet"
	EcosystemOpenSUSE      Ecosystem = "openSUSE"
	EcosystemOSSFuzz       Ecosystem = "OSS-Fuzz"
	EcosystemPackagist     Ecosystem = "Packagist"
	EcosystemPhotonOS      Ecosystem = "Photon OS"
@@ -27,8 +32,10 @@ const (
	EcosystemRedHat        Ecosystem = "Red Hat"
	EcosystemRockyLinux    Ecosystem = "Rocky Linux"
	EcosystemRubyGems      Ecosystem = "RubyGems"
	EcosystemSUSE          Ecosystem = "SUSE"
	EcosystemSwiftURL      Ecosystem = "SwiftURL"
	EcosystemUbuntu        Ecosystem = "Ubuntu"
	EcosystemWolfi         Ecosystem = "Wolfi"
)

type SeverityType string
54 changes: 29 additions & 25 deletions docs/schema.md
@@ -668,7 +668,7 @@ within its ecosystem. The two fields must both be present, because the
The `purl` field is a string following the
[Package URL specification](https://github.com/package-url/purl-spec) that
identifies the package, without the `@version` component.
This field is optional but recommended.
This field is optional but recommended.

Different ecosystems can define the same names; they identify different
packages. For example, these denote different libraries with different sets of
@@ -684,46 +684,50 @@ versions and different potential vulnerabilities:

#### Defined ecosystems

<!-- Please keep this list alphabetically sorted -->
The defined ecosystems are:

| Ecosystem | Description |
| --------- |-----------------|
| `AlmaLinux` | AlmaLinux package ecosystem; the `name` is the name of the source package. The ecosystem string might optionally have a `:<RELEASE>` suffix to scope the package to a particular AlmaLinux release. `<RELEASE>` is a numeric version.
<!-- (re)generate this list using scripts/update-ecosystems-lists.py after changing ecosystems.json -->
<!-- begin auto-generated ecosystems list -->

| Ecosystem | Description |
|-----------|-------------|
| `AlmaLinux` | AlmaLinux package ecosystem; the `name` is the name of the source package. The ecosystem string might optionally have a `:<RELEASE>` suffix to scope the package to a particular AlmaLinux release. `<RELEASE>` is a numeric version. |
| `Alpine` | The Alpine package ecosystem; the `name` is the name of the source package. The ecosystem string must have a `:v<RELEASE-NUMBER>` suffix to scope the package to a particular Alpine release branch (the `v` prefix is required). E.g. `v3.16`. |
| `Android` | The Android ecosystem. Android organizes code using [`repo` tool](https://gerrit.googlesource.com/git-repo/+/HEAD/README.md), which manages multiple git projects under one or more remote git servers, where each project is identified by its name in [repo configuration](https://gerrit.googlesource.com/git-repo/+/HEAD/docs/manifest-format.md#Element-project) (e.g. `platform/frameworks/base`). The `name` field should contain the name of that affected git project/submodule. One exception is when the project contains the Linux kernel source code, in which case `name` field will be `:linux_kernel:`, followed by an optional SoC vendor name e.g. `:linux_kernel:Qualcomm`. The list of recognized SoC vendors is listed in the [Appendix](#android-soc-vendors) |
| `Android` | The Android ecosystem. Android organizes code using [`repo` tool](https://gerrit.googlesource.com/git-repo/+/HEAD/README.md), which manages multiple git projects under one or more remote git servers, where each project is identified by its name in [repo configuration](https://gerrit.googlesource.com/git-repo/+/HEAD/docs/manifest-format.md#Element-project) (e.g. `platform/frameworks/base`). The `name` field should contain the name of that affected git project/submodule. One exception is when the project contains the Linux kernel source code, in which case `name` field will be `:linux_kernel:`, followed by an optional SoC vendor name e.g. `:linux_kernel:Qualcomm`. The list of recognized SoC vendors is listed in the [Appendix](#android-soc-vendors) |
| `Bioconductor` | The biological R package ecosystem. The `name` is an R package name. |
| `Bitnami` | Bitnami package ecosystem; the `name` is the name of the affected component. |
| `Chainguard` | The Chainguard package ecosystem; the `name` is the name of the package. |
| `ConanCenter` | The ConanCenter ecosystem for C and C++; the `name` field is a Conan package name. |
| `ConanCenter` | The ConanCenter ecosystem for C and C++; the `name` field is a Conan package name. |
| `CRAN` | The R package ecosystem. The `name` is an R package name. |
| `crates.io` | The crates.io ecosystem for Rust; the `name` field is a crate name. |
| `Debian` | The Debian package ecosystem; the `name` is the name of the source package. The ecosystem string might optionally have a `:<RELEASE>` suffix to scope the package to a particular Debian release. `<RELEASE>` is a numeric version specified in the [Debian distro-info-data](https://debian.pages.debian.net/distro-info-data/debian.csv). For example, the ecosystem string "Debian:7" refers to the Debian 7 (wheezy) release. |
| `GHC` | The Haskell compiler ecosystem. The `name` field is the name of a component of the GHC compiler ecosystem (e.g., compiler, GHCI, RTS). |
| `crates.io` | The crates.io ecosystem for Rust; the `name` field is a crate name. |
| `Debian` | The Debian package ecosystem; the `name` is the name of the source package. The ecosystem string might optionally have a `:<RELEASE>` suffix to scope the package to a particular Debian release. `<RELEASE>` is a numeric version specified in the [Debian distro-info-data](https://debian.pages.debian.net/distro-info-data/debian.csv). For example, the ecosystem string "Debian:7" refers to the Debian 7 (wheezy) release. |
| `GHC` | The Haskell compiler ecosystem. The `name` field is the name of a component of the GHC compiler ecosystem (e.g., compiler, GHCI, RTS). |
| `GitHub Actions` | The GitHub Actions ecosystem; the `name` field is the action's repository name with owner e.g. `{owner}/{repo}`. |
| `Go` | The Go ecosystem; the `name` field is a Go module path. |
| `Hackage` | The Haskell package ecosystem. The `name` field is a Haskell package name as published on Hackage. |
| `Hex` | The package manager for the Erlang ecosystem; the `name` is a Hex package name. |
| `Go` | The Go ecosystem; the `name` field is a Go module path. |
| `Hackage` | The Haskell package ecosystem. The `name` field is a Haskell package name as published on Hackage. |
| `Hex` | The package manager for the Erlang ecosystem; the `name` is a Hex package name. |
| `Linux` | The Linux kernel. The only supported `name` is `Kernel`. |
| `Mageia` | The Mageia Linux package ecosystem; the `name` is the name of the source package. The ecosystem string must have a `:<RELEASE-NUMBER>` suffix to scope the package to a particular Mageia release. Eg `Mageia:9`. |
| `Maven` | The Maven Java package ecosystem. The `name` field is a Maven package name in the format `groupId:artifactId`. The ecosystem string might optionally have a `:<REMOTE-REPO-URL>` suffix to denote the remote repository URL that best represents the source of truth for this package, without a trailing slash (e.g. `Maven:https://maven.google.com`). If this is omitted, this is assumed to be the Maven Central repository (`https://repo.maven.apache.org/maven2`).
| `npm` | The NPM ecosystem; the `name` field is an NPM package name. |
| `NuGet` | The NuGet package ecosystem. The `name` field is a NuGet package name. |
| `OSS-Fuzz` | For reports from the OSS-Fuzz project that have no more appropriate ecosystem; the `name` field is the name assigned by the OSS-Fuzz project, as recorded in the submitted fuzzing configuration. |
| `openSUSE` | The openSUSE ecosystem; The ecosystem string has a `:<RELEASE>` suffix presenting the marketing name of the openSUSE distribution. `<RELEASE>` matches the value in the `/etc/os-release` `PRETTY_NAME` field. The `name` field is the name of the source RPM and accompanied by a purl. There is an `ecosystem_specific` specific array `binaries` of the associated RPM binary packages in this specific openSUSE distribution. The ECOSYSTEM version ordering is the RPM versioncompare ordering, and the database uses the `introduced` and `fixed` boundaries.|
| `Packagist` | The PHP package manager ecosystem; the `name` is a package name. |
| `Maven` | The Maven Java package ecosystem. The `name` field is a Maven package name in the format `groupId:artifactId`. The ecosystem string might optionally have a `:<REMOTE-REPO-URL>` suffix to denote the remote repository URL that best represents the source of truth for this package, without a trailing slash (e.g. `Maven:https://maven.google.com`). If this is omitted, this is assumed to be the Maven Central repository (`https://repo.maven.apache.org/maven2`). |
| `npm` | The NPM ecosystem; the `name` field is an NPM package name. |
| `NuGet` | The NuGet package ecosystem. The `name` field is a NuGet package name. |
| `openSUSE` | The openSUSE ecosystem; The ecosystem string has a `:<RELEASE>` suffix presenting the marketing name of the openSUSE distribution. `<RELEASE>` matches the value in the `/etc/os-release` `PRETTY_NAME` field. The `name` field is the name of the source RPM and accompanied by a purl. There is an `ecosystem_specific` specific array `binaries` of the associated RPM binary packages in this specific openSUSE distribution. The ECOSYSTEM version ordering is the RPM versioncompare ordering, and the database uses the `introduced` and `fixed` boundaries. |
| `OSS-Fuzz` | For reports from the OSS-Fuzz project that have no more appropriate ecosystem; the `name` field is the name assigned by the OSS-Fuzz project, as recorded in the submitted fuzzing configuration. |
| `Packagist` | The PHP package manager ecosystem; the `name` is a package name. |
| `Photon OS` | The Photon OS package ecosystem; the `name` is the name of the RPM package. The ecosystem string must have a `:<RELEASE-NUMBER>` suffix to scope the package to a particular Photon OS release. Eg `Photon OS:3.0`. |
| `Pub` | The package manager for the Dart ecosystem; the `name` field is a Dart package name. |
| `PyPI` | the Python PyPI ecosystem; the `name` field is a [normalized](https://www.python.org/dev/peps/pep-0503/#normalized-names) PyPI package name. |
| `Red Hat` | The Red Hat package ecosystem; the `name` field is the name of a binary or source RPM. The ecosystem string has a `:<CPE>` suffix to scope the RPM to a specific Red Hat product stream. `<CPE>` is a translation of a Red Hat [Common Platform Enumerations](https://cpe.mitre.org/) (CPE) with the `cpe/:[oa]:(redhat):` prefix removed (for example, `Red Hat:rhel_aus:8.4::appstream` translates to `cpe:/a:redhat:rhel_aus:8.4::appstream`). Red Hat ecosystem identifiers can be used to identify vulnerable RPMs installed on a Red Hat system as explained [here](https://www.redhat.com/en/blog/how-accurately-match-oval-security-data-installed-rpms). |
| `Rocky Linux` | The Rocky Linux package ecosystem; the `name` is the name of the source package. The ecosystem string might optionally have a `:<RELEASE>` suffix to scope the package to a particular Rocky Linux release. `<RELEASE>` is a numeric version.
| `RubyGems` | The RubyGems ecosystem; the `name` field is a gem name. |
| `SUSE` | The SUSE ecosystem; The ecosystem string has a `:<RELEASE>` suffix representing the marketing name of the SUSE product. `<RELEASE>` matches the value in the /etc/os-release `PRETTY_NAME` field. The `name` field is the name of the source RPM and accompanied by a purl. There is a `ecosystem_specific` specific array `binaries` of the associated RPM binary packages in this specific SUSE product. The ECOSYSTEM version ordering is the RPM versioncompare ordering, and the database uses the `introduced` and `fixed` boundaries.|
| `PyPI` | the Python PyPI ecosystem; the `name` field is a [normalized](https://www.python.org/dev/peps/pep-0503/#normalized-names) PyPI package name. |
| `Red Hat` | The Red Hat package ecosystem; the `name` field is the name of a binary or source RPM. The ecosystem string has a `:<CPE>` suffix to scope the RPM to a specific Red Hat product stream. `<CPE>` is a translation of a Red Hat [Common Platform Enumerations](https://cpe.mitre.org/) (CPE) with the `cpe/:[oa]:(redhat):` prefix removed (for example, `Red Hat:rhel_aus:8.4::appstream` translates to `cpe:/a:redhat:rhel_aus:8.4::appstream`). Red Hat ecosystem identifiers can be used to identify vulnerable RPMs installed on a Red Hat system as explained [here](https://www.redhat.com/en/blog/how-accurately-match-oval-security-data-installed-rpms). |
| `Rocky Linux` | The Rocky Linux package ecosystem; the `name` is the name of the source package. The ecosystem string might optionally have a `:<RELEASE>` suffix to scope the package to a particular Rocky Linux release. `<RELEASE>` is a numeric version. |
| `RubyGems` | The RubyGems ecosystem; the `name` field is a gem name. |
| `SUSE` | The SUSE ecosystem; The ecosystem string has a `:<RELEASE>` suffix representing the marketing name of the SUSE product. `<RELEASE>` matches the value in the /etc/os-release `PRETTY_NAME` field. The `name` field is the name of the source RPM and accompanied by a purl. There is a `ecosystem_specific` specific array `binaries` of the associated RPM binary packages in this specific SUSE product. The ECOSYSTEM version ordering is the RPM versioncompare ordering, and the database uses the `introduced` and `fixed` boundaries. |
| `SwiftURL` | The Swift Package Manager ecosystem. The `name` is a Git URL to the source of the package. Versions are Git tags that conform to [SemVer 2.0](https://docs.swift.org/package-manager/PackageDescription/PackageDescription.html#version). |
| `Ubuntu` | The Ubuntu package ecosystem; the `name` field is the name of the source package. The ecosystem string has a `:<RELEASE>` suffix to scope the package to a particular Ubuntu release. `<RELEASE>` is a numeric ("YY.MM") version as specified in [Ubuntu Releases](https://wiki.ubuntu.com/Releases), with a mandatory `:LTS` suffix if the release is marked as LTS. The release version may also be prefixed with `:Pro:` to denote Ubuntu Pro (aka Expanded Security Maintenance (ESM)) updates. For example, the ecosystem string "Ubuntu:22.04:LTS" refers to Ubuntu 22.04 LTS (jammy), while "Ubuntu:Pro:18.04:LTS" refers to fixes that landed in Ubuntu 18.04 LTS (bionic) under Ubuntu Pro/ESM.
| `Ubuntu` | The Ubuntu package ecosystem; the `name` field is the name of the source package. The ecosystem string has a `:<RELEASE>` suffix to scope the package to a particular Ubuntu release. `<RELEASE>` is a numeric ("YY.MM") version as specified in [Ubuntu Releases](https://wiki.ubuntu.com/Releases), with a mandatory `:LTS` suffix if the release is marked as LTS. The release version may also be prefixed with `:Pro:` to denote Ubuntu Pro (aka Expanded Security Maintenance (ESM)) updates. For example, the ecosystem string "Ubuntu:22.04:LTS" refers to Ubuntu 22.04 LTS (jammy), while "Ubuntu:Pro:18.04:LTS" refers to fixes that landed in Ubuntu 18.04 LTS (bionic) under Ubuntu Pro/ESM. |
| `Wolfi` | The Wolfi package ecosystem; the `name` is the name of the package. |
| Your ecosystem here. | [Send us a PR](https://github.com/ossf/osv-schema/compare). |

<!-- end auto-generated ecosystems list -->

It is permitted for a database name (the DB prefix in the `id` field) and an
ecosystem name to be the same, provided they have the same owner who can make
decisions about the meaning of the `ecosystem_specific` field (see below).