
batch inputs for compute_clip_text_embedding #263

Merged: 1 commit merged into finegrain-ai:main from batch-sd on Feb 21, 2024

Conversation

@piercus (Collaborator) commented Feb 7, 2024

Context

I need this for #165, to do GPU-efficient evaluation (batch_size > 1); a sketch of what batching means here follows below. This extends #213, and it is also a baby step in the context of #255 @limiteinductive.
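As an aside, a minimal pure-torch sketch of what a batched text embedding looks like once assembled for classifier-free guidance; the shapes (77 tokens, dim 768, the usual SD 1.5 CLIP text encoder ones) are assumptions here, not taken from this PR:

```python
import torch

batch_size = 2  # two prompts evaluated in a single forward pass

# One CLIP text embedding per prompt and per negative prompt
# (assumed SD 1.5 shapes: 77 tokens, embedding dim 768).
cond = torch.randn(batch_size, 77, 768)
uncond = torch.randn(batch_size, 77, 768)

# For classifier-free guidance, the negative and positive halves are stacked
# along the batch axis, so the UNet runs once on 2 * batch_size items.
clip_text_embedding = torch.cat([uncond, cond], dim=0)
print(clip_text_embedding.shape)  # torch.Size([4, 77, 768])
```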

One question

In compute_clip_image_embedding there is a concat_batches bool; I'm torn on whether we should also put it in compute_clip_text_embedding or not. There are two contradictory points of view on this.

@piercus (Collaborator, Author) commented Feb 7, 2024

@limiteinductive I just added the corresponding changes + unit tests for SDXL.

@piercus piercus force-pushed the batch-sd branch 2 times, most recently from 0164a98 to 2a03655 Compare February 8, 2024 16:43
@deltheil (Member) commented:

> In compute_clip_image_embedding there is a concat_batches bool; I'm torn on whether we should also put it in compute_clip_text_embedding or not.

@piercus no, you can ignore it IMO. It was added mainly for documentation purposes, in the context of IP-Adapter with multiple image prompts (see #218). In that context, you want to create a longer sequence of tokens (e.g. double the size for two images); see the sketch below.
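A minimal pure-torch illustration of that difference (the shapes are hypothetical, loosely modeled on a CLIP image encoder output, and are not taken from the refiners API):

```python
import torch

# Two image-prompt embeddings, one per image (hypothetical shapes).
emb_a = torch.randn(1, 257, 1280)
emb_b = torch.randn(1, 257, 1280)

# concat_batches=True style: concatenate along the token (sequence) axis,
# producing one longer conditioning sequence for a single sample.
longer_sequence = torch.cat([emb_a, emb_b], dim=1)  # shape (1, 514, 1280)

# Plain batching instead stacks along the batch axis: one conditioning per sample.
batched = torch.cat([emb_a, emb_b], dim=0)  # shape (2, 257, 1280)
```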

@limiteinductive (Contributor) left a comment:

@deltheil lgtm

@deltheil (Member) commented:

> @deltheil lgtm

Thanks! @piercus will do the final round shortly, stay tuned.

@deltheil (Member) left a comment:

Some comments/suggestions, please take a look.

piercus added a commit to piercus/refiners that referenced this pull request Feb 20, 2024
@piercus (Collaborator, Author) commented Feb 21, 2024

Batch stability on GPU suffers from the effect below.

Source: Torch double Conv2d

In the example below, a pure PyTorch double Conv2d already leads to a 2e-6 max absolute difference between the batched and single-sample outputs.

```python
import torch
from torch.nn import Conv2d

device = "cuda:0"

def distance(x: torch.Tensor, y: torch.Tensor) -> float:
    # Max absolute difference between two tensors.
    return torch.max((x - y).abs()).item()

with torch.no_grad():
    torch.cuda.manual_seed_all(0)
    x_b2 = torch.randn(2, 4, 32, 32).to(device)  # batch of 2 inputs
    conv2d_1 = Conv2d(in_channels=4, out_channels=320, kernel_size=3, padding=1, device=device)
    conv2d_2 = Conv2d(in_channels=320, out_channels=640, kernel_size=3, padding=1, device=device)

    output_b2 = conv2d_2(conv2d_1(x_b2))       # run on the full batch
    output_b1 = conv2d_2(conv2d_1(x_b2[0:1]))  # run on the first sample alone

    print(distance(output_b2[0], output_b1[0]))
```

This outputs 2e-06.

Amplification

This batch discrepancy is then amplified by the subsequent layers:

| Level                | Error `max((x - y).abs())` |
|----------------------|----------------------------|
| Double Conv          | 2e-6                       |
| Conv + ResidualBlock | 5e-5                       |
| DownBlocks           | 2e-3                       |
| Unet                 | 1e-3                       |
| SD                   | 2e-3                       |

Details of the analysis can be found in https://gist.github.com/piercus/07d03f258907542d312c0c735445e793

Result

As a result, the batch-stability check with torch.allclose is only performed with a tolerance of 5e-3.
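A minimal sketch of that kind of check (the tensors below are stand-ins for the batched and single-sample SD outputs, not the actual test fixtures):

```python
import torch

# Stand-ins for the batched (b2) and single-sample (b1) outputs.
output_b2 = torch.randn(2, 3, 512, 512)
output_b1 = output_b2[0:1] + 1e-4  # simulate a small GPU batch discrepancy

# Loose absolute tolerance accounting for the amplified Conv2d discrepancy.
assert torch.allclose(output_b2[0:1], output_b1, atol=5e-3)
```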

piercus added a commit to piercus/refiners that referenced this pull request Feb 21, 2024
@deltheil comments on finegrain-ai#263, torch.allclose with tolerance of 5e-3, compact code following @limiteinductive suggestion
Co-authored-by: Cédric Deltheil <[email protected]>
@piercus piercus force-pushed the batch-sd branch 3 times, most recently from d3432ad to b3d53f1 Compare February 21, 2024 12:16
@piercus piercus requested a review from deltheil February 21, 2024 12:17
@deltheil previously approved these changes Feb 21, 2024

@deltheil (Member) left a comment:

See final nits. Please squash the extra commit afterwards. Thanks!

(Two review threads on tests/e2e/test_diffusion.py, since outdated and resolved.)
@piercus (Collaborator, Author) commented Feb 21, 2024

@deltheil please review.

@piercus piercus requested a review from deltheil February 21, 2024 13:58
@deltheil deltheil added run-ci Run CI and removed run-ci Run CI labels Feb 21, 2024
@deltheil deltheil added run-ci Run CI and removed run-ci Run CI labels Feb 21, 2024
@deltheil deltheil merged commit d199cd4 into finegrain-ai:main Feb 21, 2024
2 checks passed
deltheil added a commit that referenced this pull request Feb 21, 2024
deltheil added a commit that referenced this pull request Feb 21, 2024
rodSiry pushed a commit to rodSiry/refiners that referenced this pull request Feb 28, 2024