-
Notifications
You must be signed in to change notification settings - Fork 369
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Store screenshots as attachments in NDB. #4473
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,131 @@ | ||
# Copyright 2024 Google Inc. | ||
# | ||
# Licensed under the Apache License, Version 2.0 (the "License") | ||
# you may not use this file except in compliance with the License. | ||
# You may obtain a copy of the License at | ||
# | ||
# http://www.apache.org/licenses/LICENSE-2.0 | ||
# | ||
# Unless required by applicable law or agreed to in writing, software | ||
# distributed under the License is distributed on an "AS IS" BASIS, | ||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
# See the License for the specific language governing permissions and | ||
# limitations under the License. | ||
|
||
import logging | ||
from google.appengine.api import images | ||
from google.cloud import ndb # type: ignore | ||
from typing import Tuple | ||
|
||
|
||
RESIZABLE_MIME_TYPES = [ | ||
'image/png', 'image/jpg', 'image/jpeg', 'image/gif', 'image/webp', | ||
] | ||
THUMB_WIDTH = 250 | ||
THUMB_HEIGHT = 200 | ||
MAX_ATTACHMENT_SIZE = 10 * 1024 * 1024 # 10 MB | ||
|
||
|
||
class Attachment(ndb.Model): | ||
"""Attaches files, such as screenshots, to a feature entry.""" | ||
feature_id = ndb.IntegerProperty(required=True) | ||
created_on = ndb.DateTimeProperty(auto_now_add=True) | ||
content = ndb.BlobProperty(required=True) | ||
mime_type = ndb.StringProperty(required=True) | ||
thumbnail = ndb.BlobProperty() | ||
is_deleted = ndb.BooleanProperty(default=False) | ||
|
||
|
||
class UnsupportedMimeType(Exception): | ||
pass | ||
|
||
class AttachmentTooLarge(Exception): | ||
pass | ||
|
||
|
||
_EXTENSION_TO_CTYPE_TABLE = { | ||
# These are image types that we trust the browser to display. | ||
'gif': 'image/gif', | ||
'jpg': 'image/jpeg', | ||
'jpeg': 'image/jpeg', | ||
'png': 'image/png', | ||
'webp': 'image/webp', | ||
'txt': 'text/plain', | ||
} | ||
|
||
|
||
def guess_content_type_from_filename(filename: str) -> str: | ||
"""Guess a file's content type based on the filename extension.""" | ||
ext = filename.split('.')[-1] if ('.' in filename) else '' | ||
ext = ext.lower() | ||
return _EXTENSION_TO_CTYPE_TABLE.get(ext, 'application/octet-stream') | ||
|
||
|
||
def store_attachment(feature_id: int, content: bytes, mime_type: str) -> str: | ||
""""Store some data for an attachment. Return its URI.""" | ||
check_attachment_size(content) | ||
check_attachment_type(mime_type) | ||
logging.info('Storing attachment with %r bytes', len(content)) | ||
|
||
attachment = Attachment( | ||
feature_id=feature_id, | ||
content=content, | ||
mime_type=mime_type) | ||
|
||
if mime_type in RESIZABLE_MIME_TYPES: | ||
# Create and save a thumbnail too. | ||
thumb_content = None | ||
try: | ||
thumb_content = images.resize(content, THUMB_WIDTH, THUMB_HEIGHT) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Question: Will this code maintain the aspect ratio of the original image if the user uploads an image with different proportions than 250x200? For example, if someone uploads a square image, will it be distorted when resized to 250x200? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Nope. It will be somewhat distorted. After I get the backend PRs merged, I'll start experimenting with the frontend and coordination with Yann to see how some sample images will look. My thinking at this point is that all the thumbnails should be the same size so that they look nice when there are multiple, so we might end up with some combination of scaling and cropping. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. A few quick alternatives to consider in case image dimensions change in the future.
If the images already have the same aspect ratio as 250x200, option 2 is the simplest. These approaches will make our code more robust to future changes. |
||
except images.LargeImageError: | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. You may want to handle large files at the handler level so that users can't upload large files into memory in the first place. https://flask.palletsprojects.com/en/3.0.x/patterns/fileuploads/#improving-uploads
Or you can do both. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I meant to put this comment on the check_attachment_size method that currently checks the size. |
||
# Don't log the whole exception because we don't need to see | ||
# this on the Cloud Error Reporting page. | ||
logging.info('Got LargeImageError on image with %d bytes', len(content)) | ||
except Exception as e: | ||
# Do not raise exception for incorrectly formed images. | ||
# See https://bugs.chromium.org/p/monorail/issues/detail?id=597 for more | ||
# detail. | ||
logging.exception(e) | ||
if thumb_content: | ||
attachment.thumbnail = thumb_content | ||
|
||
attachment.put() | ||
return attachment | ||
|
||
|
||
def check_attachment_size(content: bytes): | ||
"""Reject attachments that are too large.""" | ||
if len(content) > MAX_ATTACHMENT_SIZE: | ||
raise AttachmentTooLarge('Attachment size %r is too large' % len(content)) | ||
|
||
|
||
def check_attachment_type(mime_type: str): | ||
"""Reject attachments that are of an unsupported type.""" | ||
if mime_type not in _EXTENSION_TO_CTYPE_TABLE.values(): | ||
raise UnsupportedMimeType( | ||
'Please upload an image with one of the following mime types:\n%s' % | ||
', '.join(_EXTENSION_TO_CTYPE_TABLE.values())) | ||
|
||
|
||
def get_attachment(feature_id: int, attachment_id: int) -> Attachment|None: | ||
"""Return attachment, if feature_id matches, and attachment is not deleted.""" | ||
attachment = Attachment.get_by_id(attachment_id) | ||
if (attachment and | ||
attachment.feature_id == feature_id and | ||
not attachment.is_deleted): | ||
return attachment | ||
|
||
return None | ||
|
||
|
||
def get_attachment_uri(attachment: Attachment) -> str: | ||
"""Return the URI path that will serve tis attachment.""" | ||
uri = '/feature/%r/attachment/%r' % ( | ||
attachment.feature_id, attachment.key.integer_id()) | ||
return uri | ||
|
||
|
||
def mark_attachment_deleted(attachment: Attachment) -> None: | ||
"""Mark an attachment as deleted so that it will no longer be served.""" | ||
attachment.is_deleted = True | ||
attachment.put() |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,120 @@ | ||
# Copyright 2024 Google Inc. | ||
# | ||
# Licensed under the Apache License, Version 2.0 (the "License") | ||
# you may not use this file except in compliance with the License. | ||
# You may obtain a copy of the License at | ||
# | ||
# http://www.apache.org/licenses/LICENSE-2.0 | ||
# | ||
# Unless required by applicable law or agreed to in writing, software | ||
# distributed under the License is distributed on an "AS IS" BASIS, | ||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
# See the License for the specific language governing permissions and | ||
# limitations under the License. | ||
|
||
import testing_config # Must be imported before the module under test. | ||
|
||
from unittest import mock | ||
|
||
from internals import attachments | ||
|
||
|
||
class AttachmentsTests(testing_config.CustomTestCase): | ||
|
||
def setUp(self): | ||
self.feature_id = 12345678 | ||
|
||
def tearDown(self): | ||
for attach in attachments.Attachment.query().fetch(None): | ||
attach.key.delete() | ||
|
||
def test_guess_content_type_from_filename(self): | ||
"""We can guess mime type based on filename extension.""" | ||
guess = attachments.guess_content_type_from_filename | ||
self.assertEqual(guess('screenshot.gif'), 'image/gif') | ||
self.assertEqual(guess('screenshot.jpg'), 'image/jpeg') | ||
self.assertEqual(guess('screenshot.jpeg'), 'image/jpeg') | ||
self.assertEqual(guess('screenshot.png'), 'image/png') | ||
self.assertEqual(guess('screenshot.webp'), 'image/webp') | ||
|
||
self.assertEqual(guess('screen.shot.webp'), 'image/webp') | ||
self.assertEqual(guess('.webp'), 'image/webp') | ||
self.assertEqual(guess('screen shot.webp'), 'image/webp') | ||
|
||
self.assertEqual(guess('screenshot.pdf'), 'application/octet-stream') | ||
self.assertEqual(guess('screenshot gif'), 'application/octet-stream') | ||
self.assertEqual(guess('screenshotgif'), 'application/octet-stream') | ||
self.assertEqual(guess('gif'), 'application/octet-stream') | ||
self.assertEqual(guess(''), 'application/octet-stream') | ||
|
||
def test_store_attachment(self): | ||
"""We can store attachment content.""" | ||
actual = attachments.store_attachment( | ||
self.feature_id, b'test content', 'text/plain') | ||
attach_id = actual.key.integer_id() | ||
retrieved = attachments.Attachment.get_by_id(attach_id) | ||
self.assertEqual(retrieved.feature_id, self.feature_id) | ||
self.assertEqual(retrieved.content, b'test content') | ||
self.assertEqual(retrieved.mime_type, 'text/plain') | ||
|
||
def test_check_attachment_size(self): | ||
"""We can check the size of the attachment content.""" | ||
attachments.check_attachment_size(b'small content') | ||
|
||
large_content = b'very large content ' * 1024 * 1024 | ||
with self.assertRaises(attachments.AttachmentTooLarge): | ||
attachments.check_attachment_size(large_content) | ||
|
||
def test_check_attachment_type(self): | ||
"""We can check the type of the attachment.""" | ||
attachments.check_attachment_type('image/gif') | ||
attachments.check_attachment_type('image/jpeg') | ||
attachments.check_attachment_type('image/png') | ||
attachments.check_attachment_type('image/webp') | ||
|
||
with self.assertRaises(attachments.UnsupportedMimeType): | ||
attachments.check_attachment_type('application/octet-stream') | ||
|
||
with self.assertRaises(attachments.UnsupportedMimeType): | ||
attachments.check_attachment_type('video/mpeg') | ||
|
||
def test_get_attachment__found(self): | ||
"""We can retrive an attachment, with checking for the proper feature.""" | ||
stored = attachments.store_attachment( | ||
self.feature_id, b'test content', 'text/plain') | ||
attach_id = stored.key.integer_id() | ||
|
||
actual = attachments.get_attachment(self.feature_id, attach_id) | ||
|
||
self.assertEqual(actual.feature_id, self.feature_id) | ||
self.assertEqual(actual.content, b'test content') | ||
self.assertEqual(actual.mime_type, 'text/plain') | ||
|
||
def test_get_attachment__not_found(self): | ||
"""We return None when there is no such attachment.""" | ||
# Nothing is stored | ||
actual = attachments.get_attachment(self.feature_id, 99999) | ||
self.assertEqual(actual, None) | ||
|
||
def test_get_attachment__reject_probing(self): | ||
"""We return None when an attempt is made to find all attachments.""" | ||
stored = attachments.store_attachment( | ||
self.feature_id, b'test content', 'text/plain') | ||
attach_id = stored.key.integer_id() | ||
|
||
# The attachment is not part of any other feature. | ||
actual = attachments.get_attachment(self.feature_id + 1, attach_id) | ||
|
||
self.assertEqual(actual, None) | ||
|
||
def test_get_attachment__deleted(self): | ||
"""We return None when attempting to get a deleted attachment.""" | ||
stored = attachments.store_attachment( | ||
self.feature_id, b'test content', 'text/plain') | ||
attachments.mark_attachment_deleted(stored) | ||
|
||
# The attachment was marked deleted. | ||
actual = attachments.get_attachment( | ||
self.feature_id + 1, stored.key.integer_id()) | ||
|
||
self.assertEqual(actual, None) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The images API is part of the legacy bundled services. I know we use existing appengine legacy services but this will be the first time we use the images API. Instead of adding another dependency on the legacy bundled service, we should consider using one of the libraries they mention in their migrating off legacy bundle services document
At first glance, Pillow has a similar API.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good suggestion. I'll try to follow up with a PR to switch to Pillow and use max-aspect.