Retrieve TTL of cached documents #2676
If I'm going against the codebase's conventions here, feel free to point it out 👍
thoth/storages/ceph.py
Outdated
except botocore.exceptions.ClientError as exc:
    if exc.response["Error"]["Code"] in ("404", "NoSuchKey"):
        raise NotFoundError("Failed to retrieve object, object {!r} does not exist".format(object_key)) from exc
    raise
Do we really need to catch exceptions at that level?
Since we're re-raising anyway (so the caller has to handle it itself), I would drop that.
thoth/storages/ceph.py
Outdated
def retrieve_document_last_modification_date(self, object_key: str) -> datetime:
    """Retrieve the last modification date of a document from S3."""
    try:
        return self._s3.Object(self.bucket, f"{self.prefix}{object_key}").get()["LastModified"]
Since the previous comment would make this a single line, maybe we could just drop the method and have the caller use ['LastModified'] on the object?
thoth/storages/ceph_cache.py
Outdated
# No timezone is indicated to calculate the present datetime on purpose to get the correct time delta
# regardless of the environment where the data is retrieved from (production or stage)
- # No timezone is indicated to calculate the present datetime on purpose to get the correct time delta
- # regardless of the environment where the data is retrieved from (production or stage)
+ # Uses UTC time to be environment agnostic (no timezone)
I don't think we should point out specific environments; Thoth could be deployed on others.
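To illustrate why working in UTC keeps the delta environment agnostic, here is a minimal sketch. It assumes `LastModified` arrives as a timezone-aware datetime (which is how boto3 typically reports it); the helper name `age_of` is hypothetical and not part of this PR.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def age_of(last_modified: datetime, now: Optional[datetime] = None) -> timedelta:
    """Return how long ago `last_modified` was, comparing aware UTC datetimes."""
    if last_modified.tzinfo is None:
        # Assumption: a naive timestamp is already expressed in UTC.
        last_modified = last_modified.replace(tzinfo=timezone.utc)
    if now is None:
        # An aware "now" in UTC gives the same result on any host timezone.
        now = datetime.now(timezone.utc)
    return now - last_modified
```

Because both operands are timezone-aware, the subtraction does not depend on the local timezone of the machine running the code.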
thoth/storages/ceph.py
Outdated
def retrieve_document_last_modification_date(self, object_key: str) -> datetime:
    """Retrieve the last modification date of a document from S3."""
    return self._s3.Object(self.bucket, f"{self.prefix}{object_key}").get()["LastModified"]
I feel like we should probably not have a method for something so specific.
What about
- def retrieve_document_last_modification_date(self, object_key: str) -> datetime:
-     """Retrieve the last modification date of a document from S3."""
-     return self._s3.Object(self.bucket, f"{self.prefix}{object_key}").get()["LastModified"]
+ def retrieve_document_attr(self, object_key: str, attr: str) -> datetime:
+     """Retrieve the given attribute of a document from S3."""
+     return self._s3.Object(self.bucket, f"{self.prefix}{object_key}").get()[attr]
and calling that with "LastModified"?
Maybe I'm worried over nothing. I'm not sure what the right balance between specializing and generalizing is.
(leaving the method out completely like I initially suggested does not seem practical)
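For reference, the generalized accessor suggested above could look roughly like the sketch below. `FakeS3Object`, `FakeS3Resource` and `Adapter` are hypothetical stand-ins so the example runs without boto3 or a real bucket; they are not the actual thoth-storages classes. The return type is `Any` here because the attribute is no longer necessarily a datetime.

```python
from datetime import datetime, timezone
from typing import Any, Dict, Tuple


class FakeS3Object:
    """Stand-in for a boto3 S3 Object; only get() is modelled."""

    def __init__(self, response: Dict[str, Any]) -> None:
        self._response = response

    def get(self) -> Dict[str, Any]:
        return self._response


class FakeS3Resource:
    """Stand-in for a boto3 S3 resource backed by an in-memory dict."""

    def __init__(self, objects: Dict[Tuple[str, str], Dict[str, Any]]) -> None:
        self._objects = objects

    def Object(self, bucket: str, key: str) -> FakeS3Object:
        return FakeS3Object(self._objects[(bucket, key)])


class Adapter:
    """Hypothetical, trimmed-down adapter exposing the generalized method."""

    def __init__(self, s3: Any, bucket: str, prefix: str) -> None:
        self._s3 = s3
        self.bucket = bucket
        self.prefix = prefix

    def retrieve_document_attr(self, object_key: str, attr: str) -> Any:
        """Retrieve the given attribute of a document from S3."""
        return self._s3.Object(self.bucket, f"{self.prefix}{object_key}").get()[attr]


stamp = datetime(2024, 1, 1, 10, 0, tzinfo=timezone.utc)
s3 = FakeS3Resource({("bucket", "cache/doc-1"): {"LastModified": stamp, "ContentLength": 42}})
adapter = Adapter(s3, bucket="bucket", prefix="cache/")
last_modified = adapter.retrieve_document_attr("doc-1", "LastModified")
```

The caller then passes `"LastModified"` (or any other key of the response) instead of relying on a dedicated method per attribute.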
@VannTen is this ready to be merged?
I think we could generalize `retrieve_document_last_modification_date`
as mentioned in the review, otherwise, the rest looks OK to me.
@VannTen thanks for the reviews, changes added 👍
[APPROVALNOTIFIER] This PR is NOT APPROVED.
This pull-request has been approved by: VannTen
The full list of commands accepted by this bot can be found here.
Related Issues and Dependencies
Related to #2666
This introduces a breaking change
This should yield a new module release
This Pull Request implements
Implement a method to retrieve the time to live of a cached document, considering S3 bucket timezones for different environments.
This implementation is based on the current document caching time of 2 hours.
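As a rough illustration of the time-to-live computation described above, the sketch below derives the remaining TTL from a document's last modification date and the 2-hour caching time. The names `CACHE_LIFETIME` and `remaining_ttl` are hypothetical and the actual method in the PR may differ; `last_modified` is assumed to be timezone-aware.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Caching time stated in the PR description.
CACHE_LIFETIME = timedelta(hours=2)


def remaining_ttl(
    last_modified: datetime,
    now: Optional[datetime] = None,
    lifetime: timedelta = CACHE_LIFETIME,
) -> timedelta:
    """Return the time left before a cached document expires (never negative)."""
    if now is None:
        now = datetime.now(timezone.utc)
    remaining = last_modified + lifetime - now
    # Clamp to zero so an expired document reports no time left.
    return max(remaining, timedelta(0))
```

For example, a document modified 30 minutes ago would have 1 hour 30 minutes left, and one modified 3 hours ago would report a zero TTL.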