How the image crops are obtained in inference #35

avermilov · 2025-02-18T09:58:21Z

Thank you for the paper and the code! I have one question about how the image crops are obtained during inference. As far as I understand, the goal is to extract `num_crops` (a hyperparameter) evenly spaced crops from the original image. In the code, you do this with two consecutive unfold operations along H and W, and then take `num_crops` evenly spaced indices from the resulting array of crops. However, when I visualize the crops, several of them often cover nearly the same area while other areas are not covered by any crop (crops 1 and 3 are the same area, no crops capture the table). Is this behaviour expected? I understand that a larger `num_crops` partially mitigates it, but the crops will still not be evenly spaced.
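To make the effect concrete, here is a minimal pure-Python sketch of the selection described above. It is not the repository's code: the function names, the 32-pixel step, and the image size in the usage note are assumptions chosen for illustration. It models only the window offsets that a sliding window such as `Tensor.unfold` would produce, flattens the grid row by row, and picks evenly spaced indices from the flattened list.

```python
def unfold_starts(length, crop=224, step=32):
    """Start offsets of a sliding window along one axis (same count as Tensor.unfold)."""
    return list(range(0, length - crop + 1, step))


def select_crops(height, width, num_crops, crop=224, step=32):
    """Return (top, left) offsets picked at evenly spaced indices of the flattened grid."""
    tops = unfold_starts(height, crop, step)
    lefts = unfold_starts(width, crop, step)
    # Row-major flattening: every window of the first row, then the second row, ...
    grid = [(top, left) for top in tops for left in lefts]
    # Even spacing in the flattened index, which is not even spacing in 2D.
    stride = max(len(grid) // num_crops, 1)
    return [grid[i * stride] for i in range(min(num_crops, len(grid)))]


if __name__ == "__main__":
    for top, left in select_crops(448, 672, 15):
        print(top, left)
```

For a hypothetical 448×672 image with 15 crops, the first three picks are at (0, 0), (0, 256) and (32, 32), so the first and third crops are shifted by only 32 pixels in each direction and show almost the same content. An index stride that does not line up with the row length of the grid wraps around like this, which would explain near-duplicate crops.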

zwx8981 · 2025-02-19T09:44:52Z

@avermilov Hi, thanks for your insightful question. Yes, this is the expected behavior. Since CLIP requires a fixed input size of 224×224, the sampling step size could in theory be determined adaptively from each image's resolution. However, this is more of an engineering problem. Moreover, our experiments have shown that the current setting is already sufficient for quality score prediction. For high-resolution images (such as 4K images), the image can first be scaled down while keeping the aspect ratio.
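As a rough sketch of the two ideas mentioned here, the snippet below spreads crop offsets evenly over both axes and computes a downscaled size that keeps the aspect ratio. Neither function comes from the repository: `even_grid_crops`, `downscale_size`, and the `max_side=1024` default are hypothetical names and values used only for illustration, and the rows-by-columns split is one possible heuristic among many.

```python
import math


def even_grid_crops(height, width, num_crops, crop=224):
    """Return up to num_crops (top, left) offsets spread evenly over both axes."""
    max_top, max_left = height - crop, width - crop
    if max_top < 0 or max_left < 0:
        raise ValueError("image is smaller than the crop size")
    # Split num_crops into rows x cols, roughly following the aspect ratio.
    rows = max(1, round(math.sqrt(num_crops * height / width)))
    rows = min(rows, num_crops)
    cols = max(1, num_crops // rows)

    def spread(limit, n):
        # n offsets from 0 to limit inclusive; a single offset is centered.
        if n == 1:
            return [limit // 2]
        return [round(i * limit / (n - 1)) for i in range(n)]

    return [(top, left) for top in spread(max_top, rows) for left in spread(max_left, cols)]


def downscale_size(height, width, max_side=1024):
    """Target (height, width) with the longer side capped at max_side (assumed value)."""
    scale = min(1.0, max_side / max(height, width))
    return round(height * scale), round(width * scale)
```

With this heuristic, a 448×672 image and 15 crops yield a 3×5 grid whose offsets run from (0, 0) to (224, 448), so every region of the image falls inside at least one crop. The number of returned crops can be smaller than `num_crops` when it does not factor into the chosen grid.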

avermilov · 2025-02-20T08:51:29Z

Got it, thank you for the quick response!
