Describe the feature
This is the same as #8377 and #7011: aws s3 sync should be able to detect files that need synchronization based on the checksum stored in S3, not size/timestamp.
Both issues were closed in favor of #6750, but #6750 only goes halfway: it uploads the checksum as metadata but doesn't take it into account when computing the sync candidates.
Use Case
A prototypical use case looks like this:
1. generate files (e.g., documentation) from a source in CI
2. sync the output to S3.
Because the files are generated in CI, they always have the current timestamp, so all of them will be synced to S3, even though only a few of them may have changed. --size-only is not a viable alternative, as it skips, for example, typo fixes that often don't change the file size.
Proposed Solution
Add a flag --checksum-only or the like. If that flag is present, rather than retrieving file timestamp and size from S3 and comparing them to the local values, retrieve the checksum that was stored during the previous upload using --checksum-algorithm and compare it to the locally computed value.
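To make the proposed comparison concrete, here is a minimal sketch in Python of how the decision could work for SHA256. It is not how aws-cli is implemented; the function names `local_sha256_b64` and `needs_upload` are hypothetical, and the remote checksum is passed in as a plain string so the sketch runs without any AWS access. It assumes the stored value is the base64-encoded SHA-256 digest of the whole object, which is how S3 reports a full-object SHA256 checksum.

```python
import base64
import hashlib
from pathlib import Path
from typing import Optional


def local_sha256_b64(path: Path, chunk_size: int = 1024 * 1024) -> str:
    """Return the base64-encoded SHA-256 digest of a local file.

    The file is read in chunks so large files do not have to fit in memory.
    """
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return base64.b64encode(digest.digest()).decode("ascii")


def needs_upload(path: Path, remote_checksum: Optional[str]) -> bool:
    """Hypothetical sync decision based only on checksums.

    remote_checksum is the value stored in S3 for the object, or None if
    the object does not exist or was uploaded without a checksum. In both
    of those cases the file is treated as a sync candidate.
    """
    if remote_checksum is None:
        return True
    return local_sha256_b64(path) != remote_checksum
```

With boto3, the stored value could be fetched via `head_object(Bucket=..., Key=..., ChecksumMode="ENABLED")` and read from the `ChecksumSHA256` field of the response. One caveat for any real implementation: objects uploaded as multipart may carry a composite checksum (a checksum of the part checksums, with a part-count suffix) rather than a digest of the whole file, so a direct comparison like the one above would not match for them.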
Other Information
Current behavior:
$ aws s3 sync --checksum-algorithm=SHA256 testdata s3://my-bucket
upload: testdata/bar.txt to s3://my-bucket/bar.txt
upload: testdata/foo.txt to s3://my-bucket/foo.txt
upload: testdata/hoge/yq_linux_amd64 to s3://my-bucket/hoge/yq_linux_amd64
$ touch testdata/hoge/yq_linux_amd64
$ aws s3 sync --checksum-algorithm=SHA256 testdata s3://my-bucket
upload: testdata/hoge/yq_linux_amd64 to s3://my-bucket/hoge/yq_linux_amd64
Desired behavior:
$ aws s3 sync --checksum-only --checksum-algorithm=SHA256 testdata s3://my-bucket
upload: testdata/bar.txt to s3://my-bucket/bar.txt
upload: testdata/foo.txt to s3://my-bucket/foo.txt
upload: testdata/hoge/yq_linux_amd64 to s3://my-bucket/hoge/yq_linux_amd64
$ touch testdata/hoge/yq_linux_amd64
$ aws s3 sync --checksum-only --checksum-algorithm=SHA256 testdata s3://my-bucket
(no output)
Acknowledgements
I may be able to implement this feature request
This feature might incur a breaking change
CLI version used
aws-cli/2.21.0
Environment details (OS name and version, etc.)
exe/x86_64.opensuse-tumbleweed.20241107
Hi @tgpfeiffer, thanks for reaching out and for your patience. This feature has been requested many times, and it's definitely one we see value in. #599 is the tracking issue for s3 sync issues, including checksum comparison. I'll be closing this as a duplicate, and we'll be continuing to track this in the other issue. Thank you for this feature request.
This issue is now closed. Comments on closed issues are hard for our team to see.
If you need more assistance, please open a new issue that references this one.