Fork implementing multi-region spatial control #46
Comments
Nice work on translating my old code! I think that …
Yeah, I agree, I think there might be an issue with how I set up the …
PyTorch has a few differences with Torch7, due to the Autograd feature and because of Python. I suspect that the issue lies with something in the MaskedStyleLoss function. This example may help figure out what is causing the issue: https://discuss.pytorch.org/t/runtimeerror-trying-to-backward-through-the-graph-a-second-time-but-the-buffers-have-already-been-freed-specify-retain-graph-true-when-calling-backward-the-first-time/6795/29
Ok, I fixed the problem with the graph, I just had to detach the masked Gram matrix before operating with it. But the problem with the second style not having any effect persists. One clue is that there seems to be a large magnitude difference between the MaskedStyleLoss values for the two styles. Will keep investigating.
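For anyone hitting the same retain_graph error: the fix boils down to detaching the masked Gram matrix that is stored as the style target, so the target is a constant rather than part of the autograd graph. A minimal sketch of the idea (module and attribute names here are illustrative, not the exact code from my fork):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def gram_matrix(x):
    # x: (1, C, H, W) feature map -> (C, C) Gram matrix
    _, c, h, w = x.size()
    flat = x.view(c, h * w)
    return torch.mm(flat, flat.t()) / (c * h * w)

class MaskedStyleLoss(nn.Module):
    def __init__(self, strength):
        super().__init__()
        self.strength = strength
        self.target = None
        self.mask = None      # (1, 1, H, W) mask resized to this layer's resolution
        self.mode = 'none'    # 'capture' during the style pass, 'loss' during optimization
        self.loss = 0

    def forward(self, x):
        g = gram_matrix(x * self.mask)   # spatially restrict the features
        if self.mode == 'capture':
            self.target = g.detach()     # detach: the target must not require grad
        elif self.mode == 'loss':
            self.loss = self.strength * F.mse_loss(g, self.target)
        return x
```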
I've fixed the bug with the second style and now everything works properly! See result. Need to add a little bit of documentation to the README, and I can also send a PR to you here if you'd like. To my eye, I think it still needs a bit of work. In the paper the authors describe some nuances to the generation of masks beyond simple bilinear scaling. I am also trying to figure out how to make non-discrete, continuous masks to allow transitioning between styles, but I'm finding this isn't as straightforward as I thought it would be!
@genekogan Looks good! I'm not sure about the licensing situation for code translated from the Lua segmentation code. That could conflict with the neural-style-pt license, so it may be better to list it in the wiki, like how the original was linked in the neural-style wiki. I also wonder if we can simplify the code and improve how it looks/works? Python is a lot more powerful than Lua, and opens up possibilities for improving the code.
Sure, I am fine with listing it on the wiki instead. Yeah, I'd definitely like to improve the code. One thing I'm currently struggling with is blending or transitioning between masks by making them continuous instead of discrete. I've implemented this in a separate branch but it produces poor results in the boundary areas. I wrote about this in more detail in this issue. I'd be curious if you have any ideas.
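Roughly what I mean by continuous masks (a sketch under my own naming, not the branch's actual code): blur the hard 0/1 region masks into soft per-pixel weights, renormalize them so the weights across all styles sum to 1 at every pixel, and use those weights when masking each style's features.

```python
import torch
import torch.nn.functional as F

def soften_masks(masks, kernel_size=51, sigma=20.0):
    """Blur hard 0/1 masks into continuous weights that sum to 1 per pixel.

    masks: (num_styles, 1, H, W) tensor of binary region masks.
    """
    # build a Gaussian blur kernel by hand (hypothetical helper, not repo code)
    coords = torch.arange(kernel_size, dtype=torch.float32) - kernel_size // 2
    g = torch.exp(-coords ** 2 / (2 * sigma ** 2))
    kernel = g[:, None] * g[None, :]
    kernel = (kernel / kernel.sum()).view(1, 1, kernel_size, kernel_size)

    soft = F.conv2d(masks, kernel, padding=kernel_size // 2)
    # renormalize so the per-pixel weights over all styles sum to 1
    return soft / soft.sum(dim=0, keepdim=True).clamp(min=1e-8)
```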
@genekogan Are you making sure that the TV weight is set to 0 in your experiments?
Yes, setting tv_weight to 0 has not really helped. I also just started a new branch which replaces Gram loss with @pierre-wilmot's histogram loss as described here. I'm getting interesting results with it, but the big gap in the middle remains. I'm pretty stumped. I might start trying more hacky approaches.
@genekogan I was actually recently looking into histogram loss myself, after seeing the results from https://arxiv.org/pdf/1701.08893.pdf. It was used in deep-painterly-harmonization, and it seems like a better idea than performing histogram matching before/after style transfer. deep-painterly-harmonization seems to implement the histogram loss as a type of layer alongside content and style loss layers. I'm not sure what's causing the gap in the middle with your code. I haven't come across the issue myself before, so I have no idea what could be going wrong.
Also, on a bit of an unrelated note, have you tried to get gradient normalization from the original Lua/Torch7 code working in PyTorch? I did figure out that it's more like gradient scaling: #26 (comment), but I'm beginning to think that it's not possible in PyTorch without a ton of hacky workarounds. |
The histogram approach is getting interesting aesthetic results, and seems to work well in combination with normal Gram losses. Pierre also uses Adam to optimize it instead of L-BFGS, which didn't work well in the original neural-style, but maybe could if the hyper-parameters are fine-tuned just right. Yeah, I'm stumped on the gray region. I don't think there's a bug in the code... I think maybe it's just the expected behavior when you try to spatially mix gradients. I'm still researching alternatives. I have not tried implementing normalized gradients. My recollection from the original neural-style was that it did not produce dramatic differences, but maybe I am not aware of cases where it might be useful? |
@genekogan I do recall seeing some issues with gradients when using masks that had very small/thin regions surrounded by other regions. Maybe something like that could be being exaggerated by your code? Gradient normalization in neural-style worked extremely well with higher content and style weight values. (ex: jcjohnson/neural-style#240 (comment), though I'd suggest values closer to cw 50-500, sw 4000-8000) I've also seen users on Reddit talking about how it made heavily stylized faces look better. |
Torch7 had really bad default parameter values for the Adam optimizer, which is why neural-style had a parameter for the learning rate. PyTorch's Adam optimizer seems to use better default parameter values, though I haven't played around with different values for it (the values are really similar, if not the same as the ones I used in modified neural-style versions). Do you think the histogram results work better as their own separate layers, or as part of the style layers (like in your code, I think)?
Yes, I have it in the same layers, which is how Pierre did it. I don't know any reason why it might do better in different layers. I do need to find better values for the strength coefficients, as the histogram loss at the values it has now overwhelms the other loss terms. Pierre wrote in his paper that the best results come from using both histogram and Gram loss together. |
I implemented histogram loss as its own layer type alongside content and style layers. The code can be found here: https://gist.github.com/ProGamerGov/30e95ac9ff42f3e09288ae07dc012a76
Histogram loss example output on the left, control test (no histogram loss) on the right. There are more examples in the comments of the gist.
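To illustrate the layer structure only (the actual gist uses Pierre's CUDA histogram matching kernel; this sketch swaps in a much simpler approximation that matches sorted per-channel activations, i.e. quantiles, instead of true histograms):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class HistogramLoss(nn.Module):
    """Very rough stand-in for a histogram matching loss layer: per-channel
    sorted activations (quantiles) are matched instead of true histograms.
    The real gist uses Pierre's CUDA histogram matching code."""

    def __init__(self, strength):
        super().__init__()
        self.strength = strength
        self.target = None     # sorted style activations, shape (C, N_style)
        self.mode = 'none'     # 'capture' for the style pass, 'loss' afterwards
        self.loss = 0

    def forward(self, x):
        _, c, h, w = x.size()
        sorted_vals, _ = torch.sort(x.view(c, -1), dim=1)
        if self.mode == 'capture':
            self.target = sorted_vals.detach()
        elif self.mode == 'loss':
            # resample the stored style quantiles to the current spatial size
            target = F.interpolate(self.target.unsqueeze(0), size=sorted_vals.size(1),
                                   mode='linear', align_corners=False).squeeze(0)
            self.loss = self.strength * F.mse_loss(sorted_vals, target)
        return x
```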
Super nice, I commented further in the gist. |
Another note about the transitional blending problem. I e-mailed Leon Gatys about it and he suggested that, since covariance loss seems to reduce the smudging effect more than Gram loss, I try using covariance loss on the lower layers (where the differences between the styles are greatest) and Gram loss on the higher layers to preserve better style reconstruction. Going to try that next.
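For context, the two statistics differ only in mean-centering: the covariance matrix is the Gram matrix of mean-subtracted features. A quick sketch (not the exact repo code):

```python
import torch

def gram_matrix(feat):
    # feat: (C, H, W) -> (C, C), uncentered second moments
    c, h, w = feat.size()
    flat = feat.view(c, h * w)
    return torch.mm(flat, flat.t()) / (h * w)

def covariance_matrix(feat):
    # same as the Gram matrix, but with the per-channel mean removed first
    c, h, w = feat.size()
    flat = feat.view(c, h * w)
    flat = flat - flat.mean(dim=1, keepdim=True)
    return torch.mm(flat, flat.t()) / (h * w)
```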
@genekogan I replied to your comment in the gist regarding weights. Someone also already implemented covariance loss in neural-style-pt here: #11, so that should help with covariance loss part of your plan. |
It looks like there may be an issue with larger image sizes when testing the histogram layers. I don't know enough about C++ to decode it.
It looks like maybe it originates in …
I just updated my histogram_loss branch with your Histogram loss module, and made it support masking in the same way the Style Loss does. I also added a covariance loss option for the normal StyleLoss module. So far with limited tests, histogram loss does not seem to do much to fix the blending problem. I'm going to do some tests combining the style loss parameters and see if I can either improve that issue somehow or at least get the general style transfer to look nicer. |
@genekogan Has covariance loss made any difference with the lower layers? For the histogram size problem, we could potentially try to recreate the … It also looks like Pierre downscales tensors before running them through the …
I think in that block, he is actually just doing a 3-part multiscale generation. Capture style at 1/4 scale, generate, upsample it 2x, then capture style at 1/2, generate on top of that, upsample 2x, capture style at 1, generate one last time at 4x original resolution. |
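In pseudocode-ish form, I read it roughly like this (a sketch; `stylize()` is a hypothetical stand-in for a full style-transfer run at the given resolution, not a function from Pierre's code):

```python
import torch.nn.functional as F

def multiscale_stylize(content, style, stylize):
    """Coarse-to-fine generation: stylize at 1/4, 1/2, then full scale.

    stylize(init, content, style) is a hypothetical helper that runs a full
    style-transfer optimization at the resolution of `content`, starting
    from the image `init`.
    """
    result = None
    for scale in (0.25, 0.5, 1.0):
        c = F.interpolate(content, scale_factor=scale, mode='bilinear', align_corners=False)
        s = F.interpolate(style, scale_factor=scale, mode='bilinear', align_corners=False)
        if result is None:
            init = c.clone()                       # start from the content image
        else:
            # upsample the previous result 2x to the new working resolution
            init = F.interpolate(result, size=c.shape[-2:], mode='bilinear',
                                 align_corners=False)
        result = stylize(init, c, s)
    return result
```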
@genekogan I think you're right. As for the …, A & B are equal to each other according to Python:
Edit: This is exactly what is happening to us. Resizing the height and width of tensors for the histogram loss layer did not seem to resolve the issue. |
@genekogan I translated my linear-color-transfer.py to PyTorch: https://gist.github.com/ProGamerGov/684c0953395e66db6ac5fe09d6723a5b The code expects both inputs to have the same size, and it does not change the BGR images to RGB or un-normalize them (though neither of those things seems to influence the output). Hopefully we can use it to create some sort of histogram matching loss function and replace the bugged CUDA code?
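For reference, the core of that kind of linear color transfer is just matching the per-channel mean and covariance of the two images. A simplified sketch of the Cholesky variant of the idea (not the gist itself):

```python
import torch

def match_color(target, source, eps=1e-5):
    """Recolor `target` so its per-channel mean and covariance match `source`.

    target, source: (3, H, W) image tensors. Simplified Cholesky-based sketch.
    """
    t = target.view(3, -1)
    s = source.view(3, -1)
    mu_t, mu_s = t.mean(dim=1, keepdim=True), s.mean(dim=1, keepdim=True)
    eye = eps * torch.eye(3, device=target.device)
    cov_t = torch.mm(t - mu_t, (t - mu_t).t()) / t.size(1) + eye
    cov_s = torch.mm(s - mu_s, (s - mu_s).t()) / s.size(1) + eye
    chol_t = torch.linalg.cholesky(cov_t)   # cov_t = L_t L_t^T
    chol_s = torch.linalg.cholesky(cov_s)
    # whiten the target's colors, then re-color them with the source statistics
    matched = torch.mm(chol_s, torch.mm(torch.inverse(chol_t), t - mu_t)) + mu_s
    return matched.view_as(target)
```

Whitening the target's colors with its own Cholesky factor and re-coloring with the source's factor gives the target the source's color statistics exactly.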
I got linear-color-transfer fully working inside neural-style-pt: https://gist.github.com/ProGamerGov/923b1679b243911e71f9bef4a4bda65a The histogram class is used to perform the standard histogram matching that's normally done via linear-color-transfer, and it's also used by the histogram loss function. The histogram loss doesn't work well yet, and quickly becomes … The …
Using these histogram parameters: …
Histogram loss & histogram matching preprocessing example output on the left, control test (no histogram loss) & histogram matching preprocessing on the right. And the histogram loss output without histogram matching preprocessing:
So, oddly enough this code replicates the results from using the CUDA histogram matching code extremely well:
With …
So, MSELoss() is implemented as: … Both images had … I previously used this code in neural_style_deepdream.py to implement simultaneous style transfer and DeepDream:
So, it looks like my code in my above comment is essentially a DeepDream layer (not a histogram matching loss layer), and those DeepDream hallucinations provide detail for the style transfer process to latch onto in bland regions like the sky in the example input image. @genekogan I wonder if this could be used as a possible solution to your blending problem?
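In other words, the layer ends up behaving like the classic DeepDream objective: instead of matching a target statistic, it rewards large activations. A minimal sketch of that kind of layer (illustrative, not my exact code):

```python
import torch
import torch.nn as nn

class DeepDreamLoss(nn.Module):
    """Sketch of a DeepDream-style layer: instead of matching a target
    statistic, it rewards large activations, so backprop amplifies whatever
    the layer already responds to."""

    def __init__(self, strength):
        super().__init__()
        self.strength = strength
        self.loss = 0

    def forward(self, x):
        # negative mean activation: minimizing this maximizes the activations
        self.loss = -self.strength * x.mean()
        return x
```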
@ProGamerGov your results look amazing! But I'm unable to reproduce it... Using your code directly from https://gist.github.com/ProGamerGov/923b1679b243911e71f9bef4a4bda65a, with:
I get some numerical instability: … When I switch over the …
It works, but I only see: … Not sure what I might be doing wrong. It shows the histogram losses alright:
I'm not sure if perhaps I duplicated your code incorrectly or not. But nevertheless it looks really promising. Regarding the deepdream idea, I'll have to try that separately to see if it solves the problem. Additionally I'd love to integrate this into my masked style transfer to see how the histogram loss works in tandem with that, once I can replicate the results you are getting. |
@genekogan I only had Cholesky errors with relu1_1, and didn't experience any other issues. In addition to the histogram parameters, I've been mostly using these other parameters as a default:
Any other parameters are just the neural-style-pt default values. As for the DeepDream/Mean code, I accidentally included …
I made a neural-style-pt gist that implements the mean loss layer type: https://gist.github.com/ProGamerGov/9c2aa72f21f0f22c64d0a6ee7294cf3c |
On a bit of an unrelated note, I tried to implement a tiled gradient calculation in an attempt to lower GPU usage. On the left is the result without trying to hide the tile borders, and on the right is the result of randomly shifting the image before tiling: If the tile coordinates don't cover the entire image, you get this effect: Sadly, it doesn't seem like my code reduces memory usage yet. I was loosely following the official TensorFlow DeepDream guide and implemented a few DeepDream-related functions for things like rolling/jitter and resizing. The code can be found here: https://gist.github.com/ProGamerGov/e64fcb309274c2946f5a9a679ed45669/ae37552a77c4b67c0eb021a6d7237868ecb464e4 Somehow VaKonS was able to modify Justin Johnson's neural-style in a way that uses tiling to create larger images. Edit: I think overlapping tiles is a better solution than random shifting.
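The two ingredients look roughly like this (a sketch with my own naming, not the gist's exact code): randomly roll the image so the tile seams move each iteration, and cut the image into overlapping tiles so the seams can be blended away.

```python
import torch

def random_roll(img):
    """Randomly shift the image so tile seams land in different places each
    iteration (the jitter idea from the TensorFlow DeepDream guide)."""
    h_shift = int(torch.randint(0, img.size(2), (1,)))
    w_shift = int(torch.randint(0, img.size(3), (1,)))
    return torch.roll(img, shifts=(h_shift, w_shift), dims=(2, 3)), (h_shift, w_shift)

def split_into_tiles(img, tile_size=512, overlap=64):
    """Cut a (1, C, H, W) image into overlapping tiles; per-tile gradients can
    later be pasted back and blended in the overlap regions."""
    _, _, h, w = img.size()
    step = tile_size - overlap
    tiles, coords = [], []
    for top in range(0, h, step):
        for left in range(0, w, step):
            bottom = min(top + tile_size, h)
            right = min(left + tile_size, w)
            tiles.append(img[:, :, top:bottom, left:right])
            coords.append((top, left))
    return tiles, coords
```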
The CUDA histogram matching code now works with images larger than 512px: https://gist.github.com/ProGamerGov/30e95ac9ff42f3e09288ae07dc012a76, now that the bug has been fixed: pierre-wilmot/NeuralTextureSynthesis#1 I have also managed to implement tiling that is similar to VaKonS' neural-style modifications: https://gist.github.com/ProGamerGov/e64fcb309274c2946f5a9a679ed45669, though it currently doesn't work with more than 2x2 tiles. I have also constructed a standalone DeepDream project using neural-style-pt as a base: https://gist.github.com/ProGamerGov/a416cc21a9ce454fdc160ad846410237 |
First of all, this is a great repo! It seems a bit faster and more memory-efficient than the original Lua-based neural-style.
I've made a fork of this repo trying to add masked style transfer as described by Gatys, going off of the gist you wrote for the Lua version.
I've almost got it working, but my implementation is suffering from two bugs. The first is that, testing with two style images and segmentations, my implementation seems to get gradients only for the first mask, not the second.
So for example, the following command:
python neural_style.py -backend cudnn -style_image examples/inputs/cubist.jpg,examples/inputs/starry_night.jpg -style_seg examples/segments/cubist.png,examples/segments/starry_night.png -content_seg examples/segments/monalisa.png -color_codes white,black
produces the following output:
where the first style (cubist) and its corresponding segmentation get good gradients and work within the mask provided, but the second mask (starry night) gets little or no gradient signal.
By simply swapping the order of the style images, as in:
python neural_style.py -backend cudnn -style_image examples/inputs/starry_night.jpg,examples/inputs/cubist.jpg -style_seg examples/segments/starry_night.png,examples/segments/cubist.png -content_seg examples/segments/monalisa.png -color_codes white,black
I get the opposite effect: only the starry night style works, and the cubist style does not appear in its mask.
I have been trying to debug this by checking the masks, and everything seems right to me; I can't figure out the problem. This is almost a PyTorch mirror of what you made in your gist, which does appear to work fine. I'm not sure if there's some typo I'm missing or something deeper.
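For context, the masking setup in my fork does roughly the following (illustrative sketch, not the exact fork code): build one binary mask per color code from the content segmentation image, then resize each mask down to every style-loss layer's spatial resolution before applying it to that layer's features.

```python
import torch
import torch.nn.functional as F

COLOR_CODES = {'white': (1.0, 1.0, 1.0), 'black': (0.0, 0.0, 0.0)}

def extract_masks(seg_img, color_names, tol=0.1):
    """seg_img: (3, H, W) segmentation image with values in [0, 1].
    Returns one (H, W) float mask per requested color code."""
    masks = []
    for name in color_names:
        color = torch.tensor(COLOR_CODES[name]).view(3, 1, 1)
        masks.append((seg_img - color).abs().max(dim=0)[0].lt(tol).float())
    return masks

def resize_mask(mask, feat):
    """Downsample an (H, W) mask to the spatial size of a (1, C, h, w) feature map."""
    return F.interpolate(mask[None, None], size=feat.shape[-2:], mode='bilinear',
                         align_corners=False)[0, 0]
```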
Additionally, calling loss.backward() without keeping the gradients with retain_graph=True produces a runtime error (RuntimeError: Trying to backward through the graph a second time, but the buffers have already been freed. Specify retain_graph=True when calling backward the first time.), which makes me think I set up the graph wrong.
If you are able to see what I'm doing wrong so that we can fix it, I'd love to see this implemented in PyTorch. I think it would be a really nice addition to the repo.