Avoid reprocessing the same file over and over again #109
Comments
What I miss in your description is the context in which you think (or figured out) that the file will be "reprocessed again".
Regardless of whether I think it's a good idea, adding an event is basically no problem for me, since extensibility is at the heart of TYPO3.
I set the threshold to 50K. I have a file that is 100K in size. Each time I run the command, this line is executed on the same set of files:
    $tempFileInfo = $gifCreator->imageMagickConvert($fileName, $destExtension, '', '', $imParams, '', $options, true);
So the file gets resized with ImageMagick on each scheduler execution, which takes CPU time. Then there is this check:
    } elseif (!$isRotated && filesize($tempFileInfo[3]) >= $originalFileSize - 10240 && $destExtension === $fileExtension) {
        // Conversion leads to same or bigger file (rounded to 10KB to accommodate tiny variations in compression) => skip!
        @unlink($tempFileInfo[3]);
        $tempFileInfo = null;
    }
And the result of the resizing is discarded. What is interesting: it tries to scale to the same width/height on each scheduler execution. The result is always discarded because the file was already resized before.
OK, so the context is that it runs over and over again for the same files when the scheduler task for batch processing is invoked.
Yes. Our editors can upload huge 6000x4000 images, so we need to run the task regularly. Manual execution is not an option because it is a closed system with no console and there are a lot of sites like this. Thus, the scheduler runs daily. It would be good to optimize the resizing process :)
And just so I get it: why are you not resizing on the fly during upload? Are your editors pushing those files via FTP or some external system?
We already have many sites with a lot of files. The problem has existed for quite some time. We could run the job once and rely on on-the-fly resizing for new files. Is this what you suggest? 🤔
Yes, this is what I suggest. In my experience, unless the GFX part is misconfigured and you don't notice quickly enough, or the resizing rules themselves are misconfigured, resizing on the fly while uploading works fine. It makes the upload slightly slower, that's true, but that usually doesn't bother anyone. I typically run the batch processing only once, and again only if I see that the resizing was not properly configured or I had GFX problems and my editors had just uploaded a bunch of huge photos. But I really don't run that task on a daily basis, as upload is only possible through the Backend (or through a custom Frontend plugin, but in that case, if you do it correctly, metadata extraction, resizing and everything else works fine as well).
Thank you! We will do it like this. 🙇 Please feel free to close the ticket (or add an event using this ticket).
If the file passes the threshold check, it will always be reprocessed, but the result will be discarded if the new size is the same or larger. It could save resources if the file were marked in some way to avoid reprocessing. The goal is to avoid running resizes on hundreds of files that still exceed the threshold. Executing ImageMagick is VERY resource-consuming!
I can imagine doing this with file metadata: a new field with a hash that is computed as follows:
When running, get the metadata, compute the hash and compare with the stored hash. If they match, do not attempt to resize it.
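The comparison step described above could be sketched roughly as follows. This is only an illustration in plain PHP: the function names `computeResizeFingerprint()` and `canSkipResize()` are hypothetical, and the hash inputs (file size, modification time, and the configured maximum dimensions) are an assumption, since the exact list of inputs is not given here. Where the stored hash lives (for example a new metadata field) is left open.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: fingerprint of a file plus the resize rules applied to it.
 * The chosen inputs are an assumption, not the list from the proposal.
 */
function computeResizeFingerprint(string $fileName, int $maxWidth, int $maxHeight): string
{
    // Make sure filesize()/filemtime() do not return cached values.
    clearstatcache(true, $fileName);

    return sha1(implode('|', [
        (string)filesize($fileName),
        (string)filemtime($fileName),
        (string)$maxWidth,
        (string)$maxHeight,
    ]));
}

/**
 * Returns true when the stored hash matches the current fingerprint,
 * i.e. neither the file nor the resize rules changed since the last attempt.
 */
function canSkipResize(string $fileName, ?string $storedHash, int $maxWidth, int $maxHeight): bool
{
    if ($storedHash === null) {
        // Never processed before: a resize attempt is needed.
        return false;
    }

    return hash_equals($storedHash, computeResizeFingerprint($fileName, $maxWidth, $maxHeight));
}
```

With something like this, the batch task would store the fingerprint after each resize attempt, including attempts whose result was discarded, and check `canSkipResize()` before calling ImageMagick again.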
Possible question: why not fine-tune the threshold?
Answer: because the threshold is about file size, and file size depends greatly on the content of the image. For example, a 3000x2000px solid white JPEG can be smaller in bytes than a 1000x600px JPEG of the sea or a city.
What do you think?
If you do not think it is a good idea, would you consider at least adding an event at the beginning of ImageResizer::processFile() to let other extensions decide whether the file should be processed or not?
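To make the request more concrete, here is a minimal sketch of what such an event could look like. Everything in it is hypothetical: the class `BeforeFileProcessedEvent`, its methods, and the stand-in function `processFileSketch()` do not exist in the extension. In TYPO3 the event would be dispatched through the PSR-14 event dispatcher; a plain list of callables is used here only to keep the example self-contained.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical event object handed to listeners before a file is processed.
 */
final class BeforeFileProcessedEvent
{
    private string $fileName;
    private bool $skipped = false;

    public function __construct(string $fileName)
    {
        $this->fileName = $fileName;
    }

    public function getFileName(): string
    {
        return $this->fileName;
    }

    /** A listener calls this to veto processing of the file. */
    public function skip(): void
    {
        $this->skipped = true;
    }

    public function isSkipped(): bool
    {
        return $this->skipped;
    }
}

/**
 * Stand-in for the start of ImageResizer::processFile().
 * Returns false when a listener vetoed processing, true otherwise.
 *
 * @param callable[] $listeners each receives the event object
 */
function processFileSketch(string $fileName, array $listeners): bool
{
    $event = new BeforeFileProcessedEvent($fileName);
    foreach ($listeners as $listener) {
        $listener($event);
        if ($event->isSkipped()) {
            // Stop before any expensive ImageMagick call.
            return false;
        }
    }

    // The actual (expensive) resize would run here.
    return true;
}
```

A listener could then veto files it already knows about, for instance by looking up a stored marker for the file name.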