From 7b88eb95cfdb167749b9b76a87bb6de668698003 Mon Sep 17 00:00:00 2001
From: Ming Li
Date: Thu, 11 Apr 2024 15:50:44 -0400
Subject: [PATCH] initial commit

---
 .DS_Store | Bin 0 -> 6148 bytes
 index.html | 80 ++++++++++++++++++++++++++++--
 static/.DS_Store | Bin 0 -> 6148 bytes
 static/images/Controllability.png | Bin 0 -> 604772 bytes
 static/images/fid_and_clip.png | Bin 0 -> 542349 bytes
 static/images/seg_training.png | Bin 0 -> 142866 bytes
 6 files changed, 76 insertions(+), 4 deletions(-)
 create mode 100644 .DS_Store
 create mode 100644 static/.DS_Store
 create mode 100644 static/images/Controllability.png
 create mode 100644 static/images/fid_and_clip.png
 create mode 100644 static/images/seg_training.png

diff --git a/.DS_Store b/.DS_Store
new file mode 100644
index 0000000000000000000000000000000000000000..a8ea52efbf4ef09d7fe27644c00f6be2ea344514
GIT binary patch
literal 6148
zcmeHK%}T>S5Z-NTn^J@v6nb3nTClcKDPBUYFJMFuDm5`hgE3p0)Er77XMG``#OHBl
zcXJ2^youNunEhtwXE*yn_J=XXC#&FyF`F@Ffg*A=Dg@n?p_)lXzm`$R#8Ohyh}N
z7}!1r%voS{wr}aQVq$<8_<;f39|S0(V=&jKwhn0U`i$`gA_~~}mOvB+9fP?>h=6ci
z3aCrDd17!~4t`5MxXsU-%8fieSi-L>)jKZjqY
z@{wOIp&l_n4E!?&cw_2LC$K1Uwtib4p0xt>9ux)hax_4|E?okkgZoHV1$A7Y4S9~i
WTqBNxepL=g7Xd{GHN?O#Fz^M>F-$K2

literal 0
HcmV?d00001

diff --git a/index.html b/index.html
index 3aab700..5581d0d 100644
--- a/index.html
+++ b/index.html
@@ -93,7 +93,7 @@

Improving Conditional Controls
with - + @@ -119,7 +119,7 @@

Improving Conditional Controls
with

- (a) Given the same input image condition and text prompt, (b) the extracted conditions of our generated images are more consistent with the inputs, (c,d) while other methods fail to achieve accurate controllable generation. SSIM scores measure the similarity between all input edge conditions and the extracted edge conditions. All the line edges are extracted by the same line detection model used by ControlNet
+ (a) Given the same input image condition and text prompt, (b) the extracted conditions of our generated images are more consistent with the inputs, (c,d) while other methods fail to achieve accurate controllable generation. SSIM scores measure the similarity between all input edge conditions and the extracted edge conditions. All the line edges are extracted by the same line detection model used by ControlNet.

@@ -144,7 +144,7 @@

Abstract

- +

Cycle Consistency in Conditional Generation

@@ -163,7 +163,79 @@

+
+
+

Comparison with Existing Efforts

+
+
+ Interpolate start reference image. +

+

+ (a) Existing methods achieve implicit controllability by introducing image-based conditional control \( c_v \) into the denoising process of diffusion models, guided by a latent-space denoising loss. (b) We use a discriminative reward model \( D \) to explicitly optimize the controllability of the diffusion model \( G \) via a pixel-level cycle consistency loss. +
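The cycle in (b) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the project's implementation: `toy_generator` and `toy_reward_model` are hypothetical stand-ins for \( G \) and \( D \), and mean squared error is an assumed choice for the pixel-level loss.

```python
from typing import Callable, List

Image = List[List[float]]  # a tiny single-channel "image" as nested lists


def pixel_consistency_loss(c_v: Image, c_v_hat: Image) -> float:
    """Mean squared error between the input condition and the re-extracted one."""
    total, count = 0.0, 0
    for row_in, row_out in zip(c_v, c_v_hat):
        for a, b in zip(row_in, row_out):
            total += (a - b) ** 2
            count += 1
    return total / count


def cycle_consistency_loss(
    c_v: Image,
    generator: Callable[[Image], Image],
    reward_model: Callable[[Image], Image],
) -> float:
    """Condition -> generated image -> extracted condition -> pixel-level loss."""
    generated = generator(c_v)          # stand-in for G
    c_v_hat = reward_model(generated)   # stand-in for D
    return pixel_consistency_loss(c_v, c_v_hat)


# Hypothetical toy models: the reward model exactly inverts the faithful generator.
def toy_generator(c_v: Image) -> Image:
    return [[0.5 * p for p in row] for row in c_v]


def lossy_generator(c_v: Image) -> Image:
    return [[0.25 * p for p in row] for row in c_v]


def toy_reward_model(image: Image) -> Image:
    return [[2.0 * p for p in row] for row in image]


condition = [[0.0, 1.0], [1.0, 0.0]]
faithful_loss = cycle_consistency_loss(condition, toy_generator, toy_reward_model)
lossy_loss = cycle_consistency_loss(condition, lossy_generator, toy_reward_model)
```

A generator that preserves the condition yields zero loss, while one that drifts from it is penalized directly in pixel space, which is the signal the explicit optimization relies on.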

+

+
+
+
+
+ +
+
+

Efficient Reward Strategy

+
+
+ Interpolate start reference image. +

+

+ (a) Pipeline of the default reward fine-tuning strategy. Reward fine-tuning requires sampling all the way to the full image, so gradients must be kept for every timestep, and the required memory exceeds what current GPUs provide. (b) Pipeline of our efficient reward strategy. We add a small amount of noise \( \epsilon_t \) (\( t \leq t_{thre} \)) to disturb the consistency between input images and conditions; the single-step denoised image can then be used directly for efficient reward fine-tuning. +
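The single-step idea in (b) can be illustrated with the standard DDPM relations. This is a rough sketch under assumptions not stated in the caption: the forward process \( x_t = \sqrt{\bar\alpha_t}\,x_0 + \sqrt{1-\bar\alpha_t}\,\epsilon \), the usual one-step estimate of \( x_0 \) from predicted noise, and a hypothetical value for `alpha_bar`. The noise predictor is replaced by an oracle that returns the true noise.

```python
import math
import random
from typing import List


def add_noise(x0: List[float], eps: List[float], alpha_bar: float) -> List[float]:
    """Forward diffusion at one timestep: scale the image and mix in noise."""
    return [
        math.sqrt(alpha_bar) * x + math.sqrt(1.0 - alpha_bar) * e
        for x, e in zip(x0, eps)
    ]


def predict_x0(x_t: List[float], eps_pred: List[float], alpha_bar: float) -> List[float]:
    """One-step estimate of the clean image from the noisy one and predicted noise."""
    return [
        (x - math.sqrt(1.0 - alpha_bar) * e) / math.sqrt(alpha_bar)
        for x, e in zip(x_t, eps_pred)
    ]


rng = random.Random(0)
x0 = [0.2, -0.7, 1.0, 0.0]            # flattened toy "image"
eps = [rng.gauss(0.0, 1.0) for _ in x0]
alpha_bar = 0.9                        # hypothetical small-noise level (t <= t_thre)

x_t = add_noise(x0, eps, alpha_bar)
x0_hat = predict_x0(x_t, eps, alpha_bar)  # oracle predictor: eps_pred == eps
```

With small noise, one denoising step already yields an image estimate that a reward model could score, so gradients need to flow through only that single step instead of the full sampling chain.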

+

+
+
+
+
+ + +
+
+

Better controllability without sacrificing FID and CLIP-Score

+
+
+ + Interpolate start reference image. + + Interpolate start reference image. +
+
+ +
+
+ + +
+
+

Facilitate Segmentation Task with Generated Images

+ +
+
+ + Interpolate start reference image. +
+
+ +
+