
Recomposite color correct improvements (broken out of Small Improvements) #597

Open
willhsmit opened this issue Feb 17, 2025 · 10 comments
Labels: Feature New feature or request

Comments

@willhsmit
Contributor

Feature Idea

Opening a separate feature thread for the 'Recomposite color correct' for flux fill feature mentioned in #550. Original description:

"recomposite color correct" to make flux fill work well -- line up pixels, select those with mask=0, ensure there's more than a minimum pixel count, and then measure what color correction is needed, take the average, and apply it to the full image -- see https://discord.com/channels/1243166023859961988/1243185862234210389/1327390619068530729 for my attempt at this. Unfortunately flux vae is dumdum inconsistent so even that's not good enough. probably need a full multiaxis gradient for the color shift. Might be a smarter way to do it? idk

The Discord link has code that compares the old and new pixels outside the mask, computes the average difference between them in HSV, and applies that difference uniformly to the pixels inside the mask. The Discord thread also suggests some areas for improvement:
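As a concrete sketch of that uniform approach in Python with NumPy — function and parameter names here are illustrative, not the actual SwarmUI code, and the images are assumed to already be converted to (H, W, 3) HSV arrays with values in [0, 1]:

```python
import numpy as np

def uniform_hsv_correction(old_hsv, new_hsv, mask, min_pixels=64):
    """Measure the mean HSV difference between old and new pixels
    outside the mask, then apply that single offset to the whole
    new image. `mask` is (H, W) bool, True = inpainted region.
    Hypothetical sketch, not SwarmUI's actual implementation."""
    outside = ~mask
    if outside.sum() < min_pixels:
        return new_hsv  # too few reference pixels to trust the estimate
    diff = (old_hsv[outside] - new_hsv[outside]).mean(axis=0)
    corrected = new_hsv + diff  # same offset applied everywhere
    # S and V must stay in [0, 1]; hue wraps around the circle
    corrected[..., 1:] = np.clip(corrected[..., 1:], 0.0, 1.0)
    corrected[..., 0] %= 1.0
    return corrected
```

With a uniform shift like this, any spatial inconsistency in the VAE's color drift (the problem quoted below) remains uncorrected by construction.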

problem is, the color shift from the flux vae is inconsistent. I tried to do a global color correct, but if you fix the left, it'll just be wrong a different direction on the right. Maybe a, like, multi-dimension gradient from all edges would do nicely? ie do the shift estimate 4 or 8 times, for each different direction, and then smoothblend between them. I think the real solution is just throw up your hands and use a blurred mask, or don't recomposite at all :(

Other

No response

willhsmit added the Feature label Feb 17, 2025
@willhsmit
Contributor Author

First thing, I ported mcmonkey's original code from the discord into a test fork (https://github.com/willhsmit/SwarmUI) to test it out.

Here's an example of it working well on an image where the fill boundary is mostly the same color — you can see an ugly seam around the 'sample' text in Before that more or less goes away with correction.

Before:

Image

After:

Image

And an example of it working less well where the fill area has a mixture of white, purple, and black.

Before:

Image

After:

Image

@willhsmit
Contributor Author

willhsmit commented Feb 17, 2025

The originally proposed solution to this was some version of a gradient: modeling the diff not as a uniform value but as a function over space, blending the estimates from the different sides of the mask.

I think this could get hairy for a couple reasons:

  • The masked area could have a pretty complex non-convex shape and there are a lot of edge cases.
  • The colors along the boundaries can also have a pretty complicated structure if the mask edge is drawn along anything with structure.
  • Even if we get the correction along the boundaries to look seamless, we might not be correcting new objects in the center of the image correctly. For example, if we're lightening the white along the boundary to be the right shade of white, we might end up unnaturally lightening or desaturating a dark object in the center of the mask.

One thing that occurs to me is that the VAE for flux fill might be producing mismatched colors in a consistent way. I did some spot checks around a boundary, and it looks like:

  • The VAE tends to make dark objects lighter and light objects darker
  • The VAE tends to lower saturation, especially for highly saturated objects
  • Hue doesn't change much

Effect 1 in particular made me wonder whether, instead of modelling the diff as a uniform offset in HSV, we might get better results from modelling it as a linear function of the source pixel HSV.
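A minimal sketch of that linear model, assuming NumPy and per-channel arrays flattened over the unmasked reference pixels (names are illustrative, not the fork's actual code):

```python
import numpy as np

def fit_linear_correction(src, dst):
    """Fit dst ≈ a * src + b for one HSV channel by least squares.
    `src` holds new-image values and `dst` the original-image values
    at the same unmasked positions. Hypothetical sketch."""
    a, b = np.polyfit(src, dst, 1)
    return float(a), float(b)

def apply_linear_correction(channel, a, b):
    """Apply the fitted per-channel correction, clamping to [0, 1]."""
    return np.clip(a * channel + b, 0.0, 1.0)
```

Unlike the uniform offset, a slope below 1 here compresses the channel toward mid-values, which is what lets this model express "the VAE makes dark objects lighter and light objects darker" as a single fitted function.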

I did an experiment with a linear color correction in my fork. It's pretty similar to the uniform correction on the mask with a uniform boundary color (the dark letters in the center of the image are a little darker than uniform correction):

Image

On the mixed boundary color, it does a better job than uniform correction on the white and black areas of the image, although the seam is still pretty obvious on the purple part of the image.

Image

I think one issue to improve here might be that for very dark pixels, where V is low, H and S vary pretty wildly. Some approaches there might be:

  • Weight dark pixels less, at least for H and S
  • Try doing something in RGB. In the current implementation I'm not sure that works, because three independent linear functions on R, G, and B don't represent "saturation is consistently a bit low" well, but a 3x3 matrix on RGB might be able to.
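The 3x3-matrix idea from the second bullet could be sketched as a least-squares affine fit in RGB — hypothetical code, with an added bias term so a constant offset is fitted alongside the matrix:

```python
import numpy as np

def fit_rgb_matrix(src_rgb, dst_rgb):
    """Least-squares affine map dst ≈ src @ M + c in RGB, where M is a
    3x3 matrix so channels can mix (unlike three independent per-channel
    fits). Inputs are (N, 3) arrays of corresponding unmasked pixels.
    A sketch of the idea above, not a tested implementation."""
    A = np.hstack([src_rgb, np.ones((len(src_rgb), 1))])  # bias column
    coef, *_ = np.linalg.lstsq(A, dst_rgb, rcond=None)
    M, c = coef[:3], coef[3]
    return M, c
```

Because M's off-diagonal terms couple the channels, a correction like "desaturate slightly" (which moves all three channels toward their mean) is representable here even though it is not representable by three separate per-channel lines.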

It might also be that the way the Flux VAE affects colors is consistent enough that, instead of estimating the color correction function at runtime, the code could ship with a preset correction function. That would make correction less dependent on how much variation is in the masked area, would make it easier to use a function more complicated than a linear one, and would open the possibility of correcting non-inpainted images such as pure t2i generations.

@willhsmit
Contributor Author

The linear color correction method produced pretty strong modifications to hue. I tried defaulting hue to a flat linear correction, and then to no correction at all, and I think leaving hue out of the correction process produces the best results.

I haven't tested 'Uniform correction to SV, no correction to H' but 'Linear correction to SV, no correction to H' looks pretty smooth:

Image

@willhsmit
Contributor Author

Some vague ideas for further experiments within the linear function include:

  • Maybe clamping the slope of the linear function to a narrow range like (0.95,1.05), with the intent of limiting the ability of dark pixels (where small rgb changes produce wild saturation swings) to raise the saturation adjustment too much.
  • Maybe filtering or downweighting dark pixels in the computation of the saturation function.
  • Exploring a matrix where S and V can be terms in each other's correction.
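The first two bullets could be sketched together as a weighted fit with a clamped slope — all parameter values below are made-up knobs, not tested defaults:

```python
import numpy as np

def fit_clamped_weighted(src, dst, value, slope_range=(0.95, 1.05)):
    """Weighted linear fit where dark pixels (low V) count for less,
    with the fitted slope clamped to a narrow range so noisy dark
    pixels can't swing the correction too far. Hypothetical sketch."""
    w = np.clip(value, 0.05, None)        # downweight dark pixels
    a, _ = np.polyfit(src, dst, 1, w=w)   # weighted least-squares line
    a = float(np.clip(a, *slope_range))   # limit how far the slope moves
    b = float(np.average(dst - a * src, weights=w))  # refit intercept
    return a, b
```

Refitting the intercept after clamping matters: otherwise a clamped slope would be paired with an intercept optimized for the unclamped one, shifting the whole channel.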

@willhsmit
Contributor Author

willhsmit commented Feb 19, 2025

I made some tests where:

  • The original image being infilled is already generated by flux. I think the correction is still useful there; there are seams at the mask join from the VAE.

Uncorrected: (The mask region is from the neck down to a bit above the elbow)

Image

Linear:

Image

  • The image is a t2i with a segment, not an inpaint (a cartoon image of a woman with blonde hair standing in the middle of a wide open street. She is smiling and waving at the camera, with beautiful sunlight glinting through her blonde hair. <segment:face and hair> extremely detailed close up cartoon image of a woman with shiny blonde hair). The effect is pretty hard to spot here.

Uncorrected:

Image

Linear:

Image

@mcmonkey4eva
Member

The original image being infilled is already generated by flux. I think the correction is still useful there; there are seams at the mask join from the VAE.

Yes, for reference, this is actually the original reason for this topic: the Flux VAE does not color-match itself; it has global inconsistency.

If you drop mask blur to 0 you can see it much more clearly.

@willhsmit
Contributor Author

I did an experiment with trying to run hue through diff = (((diff + 0.5) % 1.0) - 0.5) to check for cases where, say, source is 0.01 and dest is 0.99. This did reduce the size of the hue correction, but it still looks worse in tests than leaving hue uncorrected.
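For reference, that wraparound trick as a small function (same formula as in the comment, expressed as the shortest signed difference on the hue circle):

```python
def hue_diff(src, dst):
    """Shortest signed difference between two hues on the [0, 1) circle.
    E.g. src=0.01, dst=0.99 yields -0.02 rather than +0.98."""
    return (((dst - src) + 0.5) % 1.0) - 0.5
```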

@mcmonkey4eva
Member

Yeah leaving hue uncorrected is probably the right move. It's mostly value that needs adjustment I think, at least for Flux VAE issues, and maybe a small touch of saturation

@mcmonkey4eva
Member

Pulled the current state of the PR. In a quick and dirty test, it seems to still not quite be good enough to fix Flux being silly.

Tried "a photo of a woman smiling in the middle of a street and waving at the camera" on Flux Dev, seed 1, then manually input a hard circle (no blur) over the face with "a beautiful woman's face, ultra HD".

Uncorrected

Image

linear

Image

uniform

Image

In all cases, there's a visible "dark circle" where the selection was made.

Just letting Mask Blur run still is the best way to combat coloration mixups in inpaints :(

(Note, for the record, my setup here is intentionally unoptimal: a hard manual mask with blur off is basically the worst-case scenario. A blur or a generated segment mask will hide the color mismatch a decent bit; I'm intentionally forcing the mismatch to show, to test whether this code can do much to fix it.)

@willhsmit
Contributor Author

Yes. As a practical matter, I think Mask Shrink Grow makes a big difference with or without blur, because the unchanged pixels it uses for reference are physically closer, and usually more similar, to the ones at the seam.

But there are definitely cases where each of the correction methods doesn't help much, or makes one part of the seam better and another worse.

When I do a similar test to yours, the linear method fits slopes below 1 for both the S and V corrections on the full image, which is a sign it's probably making things worse instead of better; the slopes go back above 1 with a Mask Shrink Grow tight on the face, or one that's wider but still mainly foreground rather than background.

I guess one way the algorithm could adjust for this would be to use not all of the thresholded pixels, but only the thresholded pixels near unthresholded ones, i.e. focusing on examples around the seam. The danger is that focusing too heavily on the seam could give a bad color correction for new content that isn't near the seam; for example, if the woman's head were completely surrounded by white, the algorithm might match the white boundary perfectly at the expense of removing a lot of contrast in the head.
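That seam-focused selection could be sketched with a simple NumPy-only binary dilation — `radius` here is a hypothetical knob, and the function names are illustrative:

```python
import numpy as np

def seam_reference_band(mask, radius=8):
    """Select only the unmasked pixels within `radius` 4-connected steps
    of the masked region, so the correction is fitted on reference
    pixels near the seam. `mask` is (H, W) bool, True = inpainted."""
    grown = mask.copy()
    for _ in range(radius):                 # simple binary dilation
        nxt = grown.copy()
        nxt[1:, :] |= grown[:-1, :]
        nxt[:-1, :] |= grown[1:, :]
        nxt[:, 1:] |= grown[:, :-1]
        nxt[:, :-1] |= grown[:, 1:]
        grown = nxt
    return grown & ~mask                    # band just outside the mask
```

The returned boolean band would then replace the all-unmasked-pixels selection in the fitting step, trading global coverage for locality at the seam, with exactly the center-content risk described above.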
