batch inputs for compute_clip_text_embedding #263
Conversation
@limiteinductive just added corresponding changes + unit tests for SDXL

Force-pushed from 0164a98 to 2a03655
src/refiners/foundationals/latent_diffusion/stable_diffusion_1/model.py (outdated, resolved)
@piercus no, you can ignore it IMO. It was added mainly for documentation purposes, in the context of IP-Adapter with multiple image prompts (see #218). In that context, you want to create a longer sequence of tokens (e.g. double the size for two images).
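To illustrate the distinction (a minimal sketch with made-up shapes, not the library's actual API): stacking embeddings along the batch dimension yields independent conditionings, while concatenating along the sequence dimension yields one longer conditioning, as wanted for multiple image prompts.

```python
import torch

# Hypothetical shapes: two image prompts, each encoded to 4 tokens of dim 8.
tokens_a = torch.randn(1, 4, 8)
tokens_b = torch.randn(1, 4, 8)

# Batch-dim stacking: two independent conditionings (batch_size=2).
batched = torch.cat([tokens_a, tokens_b], dim=0)

# Sequence-dim concatenation: a single, twice-as-long token sequence.
longer = torch.cat([tokens_a, tokens_b], dim=1)

print(batched.shape, longer.shape)  # torch.Size([2, 4, 8]) torch.Size([1, 8, 8])
```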
@deltheil lgtm
Some comments/suggestions, please take a look
src/refiners/foundationals/latent_diffusion/stable_diffusion_1/model.py (outdated, resolved)
src/refiners/foundationals/latent_diffusion/stable_diffusion_xl/model.py (outdated, resolved)
src/refiners/foundationals/latent_diffusion/stable_diffusion_xl/text_encoder.py (outdated, resolved)
src/refiners/foundationals/latent_diffusion/stable_diffusion_xl/text_encoder.py (outdated, resolved)
tests/foundationals/latent_diffusion/test_sdxl_double_encoder.py (outdated, resolved)
The batch stability on GPU suffers from the effect below.

**Source: Torch double Conv2d**

In the example below, a pure PyTorch double Conv2d leads to a ~2e-6 L1 norm difference between batched and unbatched outputs:

```python
import torch
from torch.nn import Conv2d

device = "cuda:0"

def distance(x: torch.Tensor, y: torch.Tensor) -> float:
    return torch.max((x - y).abs()).item()

with torch.no_grad():
    torch.cuda.manual_seed_all(0)
    x_b2 = torch.randn(2, 4, 32, 32).to(device)
    conv2d_1 = Conv2d(in_channels=4, out_channels=320, kernel_size=3, padding=1, device=device)
    conv2d_2 = Conv2d(in_channels=320, out_channels=640, kernel_size=3, padding=1, device=device)
    output_b2 = conv2d_2(conv2d_1(x_b2))
    output_b1 = conv2d_2(conv2d_1(x_b2[0:1]))
    print(distance(output_b2[0], output_b1[0]))
```

**Amplification**

This batch discrepancy is then amplified by the following layers.

Details of the analysis can be found in https://gist.github.com/piercus/07d03f258907542d312c0c735445e793

**Result**

As a result, the batch stability with
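The root cause is ordinary floating-point behavior: addition is not associative, so kernels that reduce in a different order (as batched vs. unbatched GPU convolutions may) can produce slightly different results from identical inputs. A minimal pure-Python illustration:

```python
# Floating-point addition is not associative: summing the same three
# numbers in a different order gives different results.
a, b, c = 1e16, -1e16, 1.0

left = (a + b) + c   # 0.0 + 1.0 -> 1.0
right = a + (b + c)  # the 1.0 is absorbed when added to -1e16 first -> 0.0

print(left, right)  # 1.0 0.0
```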
@deltheil comments on finegrain-ai#263: torch.allclose with tolerance of 5e-3, compact code following @limiteinductive's suggestion

Co-authored-by: Cédric Deltheil <[email protected]>
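A tolerance-based comparison along those lines might look like this (an illustrative sketch, not the PR's actual test code; the `assert_batch_stable` helper and tensor shapes are made up, only the 5e-3 tolerance comes from the commit message above):

```python
import torch

def assert_batch_stable(x: torch.Tensor, y: torch.Tensor, atol: float = 5e-3) -> None:
    # Compare with an absolute tolerance instead of exact equality, since
    # batched and unbatched GPU runs can differ by small numerical amounts.
    assert torch.allclose(x, y, atol=atol), f"max abs diff: {(x - y).abs().max().item()}"

x = torch.ones(2, 77, 768)
assert_batch_stable(x, x + 1e-4)  # passes: 1e-4 is within the 5e-3 tolerance
```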
Force-pushed from d3432ad to b3d53f1
See final nits. Please squash the extra commit afterwards. Thanks!
@deltheil please review
Co-authored-by: Cédric Deltheil <[email protected]>
Context

I need this for #165, to do GPU-efficient evaluation (batch_size > 1).

This is an extension of #213.

This is also a baby step in the context of #255 @limiteinductive

One question

In `compute_clip_image_embedding` there is a `concat_batches` bool; I'm torn on whether we should also add it to `compute_clip_text_embedding` or not. Two contradictory points of view: