feat: use GCS object update time instead of creation time #2454
This should achieve the same outcome and is easier to manipulate (it doesn't require object recreation) and so should make triggering a reimport of specific records a little more efficient.

Tested in Staging:

Before:

```
$ gsutil stat gs://osv-test-cve-osv-conversion/osv-output/CVE-2024-7067.json
gs://osv-test-cve-osv-conversion/osv-output/CVE-2024-7067.json:
    Creation time:    Fri, 26 Jul 2024 16:53:59 GMT
    Update time:      Wed, 31 Jul 2024 03:52:06 GMT
    Storage class:    STANDARD
    Content-Length:   3073
    Content-Type:     application/json
    Metadata:
        goog-reserved-file-mtime:1722394200
    Hash (crc32c):    Lo3Xwg==
    Hash (md5):       rdI8478THJQOVCgblWW9UQ==
    ETag:             COmK0tyVxYcDED4=
    Generation:       1722012839216489
    Metageneration:   62
```

After:

```
$ gsutil stat gs://osv-test-cve-osv-conversion/osv-output/CVE-2024-7067.json
gs://osv-test-cve-osv-conversion/osv-output/CVE-2024-7067.json:
    Creation time:    Fri, 26 Jul 2024 16:53:59 GMT
    Update time:      Tue, 06 Aug 2024 07:07:12 GMT
    Storage class:    STANDARD
    Content-Length:   3073
    Content-Type:     application/json
    Metadata:
        goog-reserved-file-mtime:1722394200
    Hash (crc32c):    Lo3Xwg==
    Hash (md5):       rdI8478THJQOVCgblWW9UQ==
    ETag:             COmK0tyVxYcDED8=
    Generation:       1722012839216489
    Metageneration:   63
```

I have a sneaking suspicion this may facilitate the intentional reimporting of CVEs that get updated when one of their constituent parts changes (the fact that the original creation and update times were divergent surprised me)
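To make the effect of the switch concrete, here is a minimal Python sketch (not the importer's actual code) that parses the two timestamps from the "After" `gsutil stat` output above and compares each against a cutoff. The helper names `parse_stat_times` and `needs_import`, and the `last_run` cutoff of 1 August 2024, are hypothetical and chosen only for illustration.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime

# Trimmed copy of the "After" stat output from the PR description.
STAT_OUTPUT = """\
gs://osv-test-cve-osv-conversion/osv-output/CVE-2024-7067.json:
    Creation time:    Fri, 26 Jul 2024 16:53:59 GMT
    Update time:      Tue, 06 Aug 2024 07:07:12 GMT
    Generation:       1722012839216489
    Metageneration:   63
"""


def parse_stat_times(text):
    """Extract the creation and update timestamps from gsutil stat output."""
    times = {}
    for line in text.splitlines():
        # Split on the first colon only; the timestamp itself contains colons.
        key, sep, value = line.strip().partition(":")
        if sep and key in ("Creation time", "Update time"):
            times[key] = parsedate_to_datetime(value.strip())
    return times


def needs_import(times, last_run, field):
    """Return True if the chosen timestamp is newer than the cutoff."""
    return times[field] > last_run


times = parse_stat_times(STAT_OUTPUT)
# Hypothetical cutoff: pretend the previous import ran on 1 August 2024.
last_run = datetime(2024, 8, 1, tzinfo=timezone.utc)

by_creation = needs_import(times, last_run, "Creation time")
by_update = needs_import(times, last_run, "Update time")
print("selected by creation time:", by_creation)
print("selected by update time:", by_update)
```

Under this assumed cutoff, a creation-time check skips the object while an update-time check selects it, even though the generation is unchanged and only the metageneration moved.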
I recall that we previously encountered an issue where |
Ah, if that's the (unfortunate) culprit (which I dare say it will be) we may need to also migrate that away from |
There is a subtle and unfortunately annoying difference in behaviour for unchanged files: `gcloud storage rsync` still causes a modification time change to the object in GCS, whereas `gsutil rsync` does not.
LGTM, thanks Andrew!
Co-authored-by: Rex P <[email protected]>
This should achieve the same outcome and is easier to manipulate: it doesn't require object recreation, so triggering a reimport of specific records should be a little more efficient (no need to download and reupload the object to reset the creation time).
Tested in Staging:
Before:
After:
I have a sneaking suspicion this may facilitate the intentional reimporting of CVEs that get updated when one of their constituent parts changes (the fact that the original creation and update times were already divergent surprised me)
This also requires `combine-to-osv` to switch back to using `gsutil rsync` instead of `gcloud storage rsync` due to a subtle difference in the treatment of the GCS object modification time for otherwise unchanged objects. (see #2196 for additional context)
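The interaction between the sync tool and update-time-based selection can be modelled with a toy simulation. This is a sketch under stated assumptions, not a reproduction of either tool: `FakeObject`, `sync`, `changed_since`, and the `touch_unchanged` flag are all hypothetical, and the flag merely stands in for the behaviour described in this PR, where one sync tool changes the modification time of unchanged objects and the other does not.

```python
from dataclasses import dataclass, replace
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class FakeObject:
    """Stand-in for a bucket object: content plus its update time."""
    content: bytes
    updated: datetime


def sync(dest, source, now, touch_unchanged):
    """Copy source into dest, optionally bumping the time of unchanged objects."""
    result = {}
    for name, content in source.items():
        old = dest.get(name)
        if old is None or old.content != content:
            # New or modified content always gets a fresh update time.
            result[name] = FakeObject(content, now)
        elif touch_unchanged:
            # Modelled behaviour: unchanged object still has its time bumped.
            result[name] = replace(old, updated=now)
        else:
            result[name] = old
    return result


def changed_since(objects, cutoff):
    """Names of objects an update-time-based importer would pick up."""
    return sorted(n for n, o in objects.items() if o.updated > cutoff)


t0 = datetime(2024, 8, 1, tzinfo=timezone.utc)
t1 = t0 + timedelta(days=1)
dest = {
    "record-a.json": FakeObject(b"same", t0),
    "record-b.json": FakeObject(b"old", t0),
}
source = {"record-a.json": b"same", "record-b.json": b"new"}

quiet = changed_since(sync(dest, source, t1, touch_unchanged=False), t0)
noisy = changed_since(sync(dest, source, t1, touch_unchanged=True), t0)
print(quiet)
print(noisy)
```

In this model, the sync that leaves unchanged objects alone marks only the modified record for reimport, while the one that touches every object makes all records look updated, which would presumably defeat update-time-based selection.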