Backup spec.cleanPolicy: Delete does not fully delete objects in versioned S3 buckets #6066

Open
plevart opened this issue Feb 10, 2025 · 0 comments

plevart commented Feb 10, 2025

Bug Report

What version of Kubernetes are you using?

What version of TiDB Operator are you using?
1.6.1

What did you do?
I configured a BackupSchedule with a Backup template that specifies an S3 target and cleanPolicy: Delete.
The Backup CRs are deleted, but the corresponding objects in the S3 bucket are either:

  • only marked as deleted (a delete marker object version is added on top of the existing versions) when versioning is enabled in the bucket
  • deleted, but replaced by a delete marker object version, when versioning is suspended

The behavior of object deletion is described by this document:

https://docs.aws.amazon.com/AmazonS3/latest/API/API_DeleteObject.html

The behavior is the same whether cleanOption.disableBatchConcurrency is true or false. This suggests that TiDB Operator calls DeleteObject (for individual objects) or DeleteObjects (for batches) without a versionId, which fully deletes objects only when bucket versioning is disabled. The problem is that many S3 services (like MinIO) only support versioned buckets, and the user can only choose between versioning: enabled and versioning: suspended. So when I choose versioning: suspended, tidb-operator, by deleting objects without specifying their versionId, actually replaces them with delete markers. In effect, the object hierarchy remains; only the real data is gone.
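To make the difference concrete, here is a toy in-memory model of the delete semantics described in the linked DeleteObject documentation. It is only a sketch: `ToyBucket` and its methods are hypothetical names, it talks to no real S3 service, and it models only the enabled and suspended states.

```python
import itertools
from typing import Dict, List, Optional, Tuple


class ToyBucket:
    """Toy model of S3 delete semantics; not a real client."""

    def __init__(self, versioning: str) -> None:
        self.versioning = versioning  # "enabled" or "suspended"
        # key -> list of (version_id, data); data is None for a delete marker
        self.versions: Dict[str, List[Tuple[str, Optional[bytes]]]] = {}
        self._ids = itertools.count(1)

    def _add(self, key: str, data: Optional[bytes]) -> str:
        stack = self.versions.setdefault(key, [])
        if self.versioning == "enabled":
            version_id = f"v{next(self._ids)}"
        else:
            # Suspended: new writes get the "null" version ID and
            # overwrite any existing "null" version.
            version_id = "null"
            stack[:] = [v for v in stack if v[0] != "null"]
        stack.append((version_id, data))
        return version_id

    def put(self, key: str, data: bytes) -> str:
        return self._add(key, data)

    def delete(self, key: str, version_id: Optional[str] = None) -> None:
        if version_id is None:
            # No versionId: a delete marker is written, the key stays listed.
            self._add(key, None)
            return
        # With versionId: that exact version is removed permanently.
        stack = self.versions.get(key, [])
        stack[:] = [v for v in stack if v[0] != version_id]
        if not stack:
            self.versions.pop(key, None)
```

In this model, `delete("k")` on a suspended bucket leaves `("null", None)` behind, i.e. a delete marker, while `delete("k", "null")` removes the key entirely.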

It would be nice if TiDB Operator supported such scenarios in the following way:

  • when listing objects in some directory (under some prefix), it would request that object versions be returned, so that each returned object is accompanied by a versionId attribute.
  • when deleting objects (either one by one or in bulk), it would specify the versionId(s) in addition to the key(s). This would actually delete every version of a key and leave no delete marker behind.
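A minimal sketch of these two steps in Python. It assumes the response and request shapes of the S3 ListObjectVersions and DeleteObjects APIs as exposed by boto3 (`Versions`, `DeleteMarkers`, `Key`, `VersionId`, `Objects`, `Quiet`). The helper name `build_delete_batches` is hypothetical, and this is not how TiDB Operator (which is written in Go) is implemented.

```python
from typing import Dict, Iterable, Iterator, List

MAX_KEYS_PER_DELETE = 1000  # DeleteObjects accepts at most 1000 keys per request


def build_delete_batches(
    pages: Iterable[dict], batch_size: int = MAX_KEYS_PER_DELETE
) -> Iterator[dict]:
    """Turn ListObjectVersions pages into DeleteObjects payloads.

    Every object version and every delete marker is targeted by
    (Key, VersionId), so nothing is left behind for a key.
    """
    batch: List[Dict[str, str]] = []
    for page in pages:
        entries = page.get("Versions", []) + page.get("DeleteMarkers", [])
        for entry in entries:
            target = {"Key": entry["Key"]}
            # Fall back to a plain key delete if no version ID was returned.
            if entry.get("VersionId"):
                target["VersionId"] = entry["VersionId"]
            batch.append(target)
            if len(batch) == batch_size:
                yield {"Objects": batch, "Quiet": True}
                batch = []
    if batch:
        yield {"Objects": batch, "Quiet": True}
```

With boto3, the pages could come from `client.get_paginator("list_object_versions").paginate(Bucket=bucket, Prefix=prefix)`, and each payload could be passed as `client.delete_objects(Bucket=bucket, Delete=payload)`; `bucket` and `prefix` are placeholders here.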

This logic should also work with buckets where versioning is disabled, since in such buckets listing objects never returns a versionId and deleting without a versionId actually deletes the object. But to play it safe, another attribute in cleanOption might be necessary. For example:

cleanOption:
  deleteVersions: true
