
Support image transformations #6

Open
dreamflasher opened this issue Apr 10, 2019 · 10 comments

@dreamflasher

Training DNNs uses random resize, crop, rotation, etc. for data augmentation. How do you do this with jpeg2dct?
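
For context, jpeg2dct's whole trick is to entropy-decode the JPEG bitstream and return the quantized DCT coefficients without ever running the inverse DCT. A minimal read looks like this (the file path is a placeholder):

```python
from jpeg2dct.numpy import load

# Entropy-decode only: no inverse DCT, no pixel-space image.
dct_y, dct_cb, dct_cr = load('image.jpg')  # placeholder path

# For a 4:2:0 JPEG of size (H, W), dct_y has shape (H/8, W/8, 64) and
# the chroma planes (H/16, W/16, 64): one 64-vector per 8x8 block.
print(dct_y.shape, dct_cb.shape, dct_cr.shape)
```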

@bupticybee

> Training DNNs uses random resize, crop, rotation, etc. for data augmentation. How do you do this with jpeg2dct?

I guess an easy solution is to first resize/crop/rotate the normally decoded JPEG picture and then encode it again.
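
A minimal sketch of that approach (PIL for the pixel-space step; the specific augmentations and JPEG quality are illustrative choices, not fixed by the library):

```python
import io
import random

from PIL import Image
from jpeg2dct.numpy import loads

def augment_then_dct(jpeg_bytes):
    # Decode to pixels, augment, re-encode, then extract coefficients.
    img = Image.open(io.BytesIO(jpeg_bytes)).convert('RGB')
    img = img.rotate(random.uniform(-15, 15))      # random rotation
    img = img.resize((256, 256))                   # resize
    buf = io.BytesIO()
    img.save(buf, format='JPEG', quality=90)       # re-encode (lossy!)
    return loads(buf.getvalue())                   # dct_y, dct_cb, dct_cr
```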

@dreamflasher (Author)

That's unfortunately not a solution. DNN training requires many random transformations, so they can't be precomputed in a meaningful way.

@bupticybee

> That's unfortunately not a solution. DNN training requires many random transformations, so they can't be precomputed in a meaningful way.

I don't see why it can't be done in real time; the pipeline decode normally -> data augmentation -> encode normally -> jpeg2dct shouldn't take a lot of time.

Of course, a more clever way would be to somehow map the data augmentation process to the "input image" after jpeg2dct.
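
Some transformations do map cleanly onto the coefficient arrays. A crop aligned to the 8x8 block grid is plain array slicing, and a horizontal flip is a block-order reversal plus a sign change on odd horizontal frequencies. A hedged sketch (it assumes row-major (u, v) coefficient order within each 64-vector, which should be verified against the library's actual layout):

```python
import numpy as np

def crop_blocks(dct_plane, top, left, height, width):
    # Block-aligned crop: each 64-vector is one 8x8 pixel block, so a
    # crop at multiples of 8 pixels is just slicing. Args are in blocks.
    return dct_plane[top:top + height, left:left + width, :]

def hflip_blocks(dct_plane):
    # Mirror the block grid, then fix each block: flipping an 8x8 block
    # horizontally negates the DCT basis functions with odd horizontal
    # frequency v, i.e. F'(u, v) = (-1)**v * F(u, v).
    flipped = dct_plane[:, ::-1, :].copy()
    v = np.arange(64) % 8        # horizontal frequency per coefficient
    flipped[:, :, v % 2 == 1] *= -1
    return flipped
```

Rotations by arbitrary angles and non-block-aligned resizes have no comparably cheap equivalent, which is the crux of this issue.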

@dreamflasher (Author)

I think you don't understand the point of jpeg2dct… the whole point is to save the time of the decoding/encoding :)

@bupticybee

> I think you don't understand the point of jpeg2dct… the whole point is to save the time of the decoding/encoding :)

I understand the purpose is to save decoding/encoding time at inference. All the extra encoding and decoding due to data augmentation would only happen during training, since data augmentation is only used in training. Therefore the extra time spent on encoding/decoding at inference is zero.

@dreamflasher (Author)

Speeding up training is relevant, and that's what I personally care about.

@bupticybee

> Speeding up training is relevant, and that's what I personally care about.

Well, then there may be some way to map the data augmentation to the "image" after jpeg2dct.

PS: have you done any experiments with jpeg2dct? Is the outcome really as good as the original paper claims?

@kfirgoldberg

> > Speeding up training is relevant, and that's what I personally care about.
>
> Well, then there may be some way to map the data augmentation to the "image" after jpeg2dct.
>
> PS: have you done any experiments with jpeg2dct? Is the outcome really as good as the original paper claims?

Hi,
I recently started experimenting with this library, and I could not get close to reproducing the results reported in the original paper (I also opened an issue asking for a training script or a trained model). Have you made any progress with this since posting the comment?

@bupticybee

> > > Speeding up training is relevant, and that's what I personally care about.
> >
> > Well, then there may be some way to map the data augmentation to the "image" after jpeg2dct.
> >
> > PS: have you done any experiments with jpeg2dct? Is the outcome really as good as the original paper claims?
>
> Hi,
> I recently started experimenting with this library, and I could not get close to reproducing the results reported in the original paper (I also opened an issue asking for a training script or a trained model). Have you made any progress with this since posting the comment?

No, sorry. Zero progress

@saitarslanboun

> Speeding up training is relevant, and that's what I personally care about.

Dear @dreamflasher, encoding/decoding is not the part of the process that speeds up inference. The DCT is already a compressed representation. To reduce a raw image to an equivalently sized representation, you would need a fair number of convolutional layers, which is computationally heavy. The main idea of this work is to feed the already-compressed representation into the network, skipping the 1st and 2nd blocks of ResNet, which contain many convolutional layers.
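
In code, that idea looks roughly like the following sketch (my construction for illustration, not the paper's exact architecture: it feeds only the luma coefficients and uses a plain 1x1 projection where the paper explores more elaborate input transforms):

```python
import torch
import torch.nn as nn
from torchvision.models import resnet50

class DCTResNet(nn.Module):
    """dct_y has spatial stride 8 (one position per 8x8 block), which for
    a 224x224 image is 28x28 -- the same resolution as the input to
    ResNet-50's layer3. So we project the 64 DCT channels to the 512
    channels layer3 expects and skip conv1/layer1/layer2 entirely."""

    def __init__(self, num_classes=1000):
        super().__init__()
        backbone = resnet50()
        self.proj = nn.Sequential(          # placeholder input transform
            nn.Conv2d(64, 512, kernel_size=1, bias=False),
            nn.BatchNorm2d(512),
            nn.ReLU(inplace=True),
        )
        self.layer3 = backbone.layer3       # reused later stages
        self.layer4 = backbone.layer4
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Linear(2048, num_classes)

    def forward(self, dct_y):               # (N, 64, 28, 28), channels first
        x = self.proj(dct_y.float())
        x = self.layer4(self.layer3(x))
        return self.fc(self.pool(x).flatten(1))
```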
