Commit

Fix teaser.jpg (now ctrl-x.jpg) not displaying on GitHub pages problem
Jordan Lin committed Jun 11, 2024
1 parent 33b7a1e commit e179965
Showing 2 changed files with 1 addition and 17 deletions.
File renamed without changes
18 changes: 1 addition & 17 deletions docs/index.html
@@ -49,7 +49,7 @@
<a href="#" target="_blank">[Code (coming soon!)]</a>
</div>
<div class="teaser">
-<img src="assets/teaser.jpg" width="85%">
+<img src="assets/ctrl-x.jpg" width="85%">
</div>
</div>
<!-- === Home Section Ends === -->
@@ -60,22 +60,6 @@
<div class="title">Overview</div>
<div class="body">
We present <b>Ctrl-X</b>, a simple <i>training-free</i> and <i>guidance-free</i> framework for text-to-image (T2I) generation with structure and appearance control. Given user-provided structure and appearance images, Ctrl-X designs feedforward structure control to enable structure alignment with the structure image and semantic-aware appearance transfer to facilitate the appearance transfer from the appearance image. Ctrl-X supports novel structure control with arbitrary condition images of any modality, is significantly faster than prior training-free appearance transfer methods, and provides instant plug-and-play to any T2I and text-to-video (T2V) diffusion model.
-<!-- Recent controllable generation approaches such as FreeControl and Diffusion
-Self-guidance bring fine-grained spatial and appearance control to text-to-
-image (T2I) diffusion models without training auxiliary modules. However, these
-methods optimize the latent embedding for each type of score function with longer
-diffusion steps, making the generation process time-consuming and limiting their
-flexibility and use. This work presents Ctrl-X, a simple framework for T2I diffusion
-controlling structure and appearance without additional training or guidance. Ctrl-X
-designs feed-forward structure control to enable the structure alignment with a
-structure image and semantic-aware appearance transfer to facilitate the appearance
-transfer from a user-input image. Extensive qualitative and quantitative experiments
-illustrate the superior performance of Ctrl-X on various condition inputs and model
-checkpoints. In particular, Ctrl-X supports novel structure and appearance control
-with arbitrary condition images of any modality, exhibits superior image quality and
-appearance transfer compared to existing works, and provides instant plug-and-play
-to any T2I and text-to-video (T2V) diffusion model. -->
-
<table width="100%" style="margin: 20pt 0; text-align: center;">
<tr>
<td><img src="assets/pipeline.jpg" width="85%"></td>