Support img2img and inpaint in lpw-xl #6114

a-r-r-o-w · 2023-12-09T15:32:51Z

What does this PR do?

Adds support for img2img and inpainting in the long prompt weighting (LPW) XL community pipeline.

Before submitting

This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
Did you read the contributor guideline?
Did you read our philosophy doc (important for complex PRs)?
Was this discussed/approved via a GitHub issue or the forum? Please add a link to it if that's the case.
Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
Did you write any new necessary tests?

Who can review?

a-r-r-o-w · 2023-12-09T15:34:08Z

Long prompt weighting has been amazing to use, but the XL version of the pipeline does not yet support img2img and inpaint, whereas the normal version does (here). I believe LPW is widely used, so this could be a good addition. From my testing so far, it doesn't seem to disrupt any of the existing functionality. Will add some examples here soon.

a-r-r-o-w · 2023-12-16T15:59:10Z

This PR adds two new methods to the LPW SDXL pipeline class, .img2img() and .inpaint(), similar to those in the LPW SD1.5 pipeline linked above.
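For readers unfamiliar with the weighting syntax used in the prompts in this comment, here is a minimal sketch of how explicit `(text:weight)` spans can be pulled out of a prompt string. It is a simplification for illustration only and not the parser the pipeline uses: the helper name `split_explicit_weights` is hypothetical, nested parentheses are not handled, and bare parentheses such as `(glowing aura)` are simply left in the text with a weight of 1.0.

```python
import re
from typing import List, Tuple

# Matches only the explicit "(text:weight)" form, without nested parentheses.
_EXPLICIT_WEIGHT = re.compile(r"\(([^():]+):\s*([0-9]*\.?[0-9]+)\)")


def split_explicit_weights(prompt: str) -> List[Tuple[str, float]]:
    """Split a prompt into (text, weight) pieces (hypothetical helper).

    Text outside an explicit "(text:weight)" span gets weight 1.0.
    """
    pieces: List[Tuple[str, float]] = []
    cursor = 0
    for match in _EXPLICIT_WEIGHT.finditer(prompt):
        # Unweighted text between the previous match and this one.
        if match.start() > cursor:
            pieces.append((prompt[cursor:match.start()], 1.0))
        pieces.append((match.group(1), float(match.group(2))))
        cursor = match.end()
    # Trailing unweighted text after the last match.
    if cursor < len(prompt):
        pieces.append((prompt[cursor:], 1.0))
    return pieces


for text, weight in split_explicit_weights("a (mystical:1.5) creature, (snowflakes:1.2)"):
    print(repr(text), weight)
```

A real weighting implementation then scales the text embeddings of each piece by its weight; that step is omitted here.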

Code

import torch
from PIL import Image

from diffusers.pipelines import DiffusionPipeline

model_id = "a-r-r-o-w/dreamshaper-xl-turbo"
pipe = DiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16, variant="fp16", custom_pipeline="lpw_stable_diffusion_xl").to("cuda")

params = [
    ("an (extremely detailed cinematic digital painting masterpiece) (3/4 body shot) of one giant snowman in the distance absorbing the cool winter air wishing for a Christmas miracle, <lora:ChristmasWintery:0.8> ChristmasWinteryl surrounded by magical Christmas trees in a Christmas forest, bright glowing eyes, (Style-GravityMagic:1.1), (twisting swirls of snow in the background), (glowing aura), (energy), (snowing), (snowflakes:1.2), (windy), (moody color tones), (otherworldly:1.2), (elaborate textures:1.2), (soft lighting:1.2), medium depth of field, (8k UHD, Unreal Engine, highest quality)", 1024, 768),
    ("Photo of a beautiful greek goddess, RAW, ((red hair with extra long wavy), ((portrait)), (Detailed face: 1. 2)), (Detailed facial features)), fair-skinned, slim body, greek temple environment, detailed expressions, reflections, shot with a 50mm lens, f/2. 8, HDR, (8k), (movie lighting) (dramatic lighting) , (sharp focus), complex elements", 1024, 768),
    ("(full body, (dynamic pose)), muscular male android humanoid, flowing dark thick cloth cloak, dark metallic armour Burning Orange , neon parts, advanced civilization, towering structures, skyscrapers, immersed neon lights, fully detailed environment, cyberpunk, (rim lighting, studio lighting, distant moonlight, (night:1.1), bloom), (cinematic, best quality, masterpiece, ultra HD textures, highly detailed, hyper-realistic, intricate detail, 8k, photorealistic, concept art, matte painting, autodesk maya, vray render, ray tracing, hdr), (dslr, full frame, 16mm focal length, f/8 aperture, dynamic perspective, dynamic angle, wide field of view, deep depth of field) techwear, urbansamurai 3d, realistic, cyborg in a cyberhelmet head", 1024, 1024),
]
negative_prompt = '(nsfw:1.5),((3d, cartoon, anime, sketches)), (worst quality:2), (low quality:2), (normal quality:2), lowres, normal quality, ((monochrome)), ((grayscale)), bad anatomy, out of view, cut off, ugly, deformed, mutated, ((young)), EasyNegative, paintings, sketches, (worst quality:2), (low quality:2), (normal quality:2), lowres, normal quality, ((monochrome)), ((grayscale)), skin spots, acnes, skin blemishes, age spot, glans,extra fingers,fewer fingers,, "(ugly eyes, deformed iris, deformed pupils, fused lips and teeth:1.2), (un-detailed skin, semi-realistic, cgi, 3d, render, sketch, cartoon, drawing, anime:1.2), text, close up, cropped, out of frame, worst quality, low quality, jpeg artifacts, ugly, duplicate, morbid, mutilated, extra fingers, mutated hands, poorly drawn hands, poorly drawn face, mutation, deformed, blurry, dehydrated, bad anatomy, bad proportions, extra limbs, cloned face, disfigured, gross proportions, malformed limbs, missing arms, missing legs, extra arms, extra legs, fused fingers, too many fingers, long neck"'

t2i_images = []
for param in params:
    prompt, height, width = param
    images = pipe.text2img(
        prompt=prompt,
        negative_prompt=negative_prompt,
        height=height,
        width=width,
        guidance_scale=3,
        num_inference_steps=8,
        num_images_per_prompt=1,
    ).images
    t2i_images.append(images[0])

from typing import List

def image_grid(images: List[Image.Image], rows: int, cols: int) -> Image.Image:
    if len(images) > rows * cols:
        raise ValueError(
            f"Number of images ({len(images)}) exceeds grid size ({rows}x{cols})."
        )
    w, h = images[0].size
    grid = Image.new("RGB", size=(cols * w, rows * h))
    for i, image in enumerate(images):
        grid.paste(image, box=(i % cols * w, i // cols * h))
    return grid

i2i_images = []
changes = ["(origami:1.5), artistic, paper art", "(nebulae, blackholes, quasars, stars, galatic, multiversal:1.6)", "(oil painting:1.5), artisitc, ((painting made by a 10 year old child))"]
for index in range(3):
    for param, img in zip(params, t2i_images):
        prompt, height, width = param
        # Prepend the style change and keep only the first 15 words of the original prompt.
        prompt = changes[index] + ', ' + ' '.join(prompt.split()[:15])
        images = pipe.img2img(
            prompt=prompt,
            negative_prompt=negative_prompt,
            image=img,
            height=height,
            width=width,
            guidance_scale=4,
            num_inference_steps=12,
            num_images_per_prompt=1,
            strength=0.8,
        ).images
        i2i_images.append(images[0])

grid_i2i = image_grid(i2i_images, 3, 3)

from diffusers.utils import load_image
image = load_image("image.png")
mask = load_image("mask.png")

inpaint_images = []
for i in range(6):
    images = pipe.inpaint(
        prompt="Digital illustration in of a samurai warrior in a duel against a (mystical:1.5) creature, (probably alien)",
        negative_prompt=negative_prompt,
        image=image,
        mask_image=mask,
        # height and width still hold the values from the last img2img iteration (1024 x 1024).
        height=height,
        width=width,
        guidance_scale=2,
        num_inference_steps=20,
        num_images_per_prompt=1,
        strength=(i + 1) * 0.15,
    ).images
    inpaint_images.append(images[0])

grid_inpaint = image_grid(inpaint_images, 2, 3)
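The inpaint loop above sweeps `strength` from 0.15 to 0.90. As a rough sketch of what that parameter does, assuming this pipeline follows the usual diffusers img2img convention of running only the last `int(num_inference_steps * strength)` scheduler steps (an assumption, not something verified against this PR's code), the number of denoising steps actually executed can be estimated like this. The function name `effective_denoising_steps` is hypothetical.

```python
def effective_denoising_steps(num_inference_steps: int, strength: float) -> int:
    """Estimate how many scheduler steps run for a given strength.

    Assumes the common img2img convention: the input image is noised to an
    intermediate timestep and only the remaining steps are denoised.
    """
    if not 0.0 <= strength <= 1.0:
        raise ValueError(f"strength must be in [0, 1], got {strength}")
    # int() truncates, so a product that lands just under an integer rounds down.
    return min(int(num_inference_steps * strength), num_inference_steps)


# Same sweep as the inpaint loop: 20 scheduler steps, strength from 0.15 to 0.90.
for i in range(6):
    strength = (i + 1) * 0.15
    steps = effective_denoising_steps(20, strength)
    print(f"strength={strength:.2f} -> about {steps} of 20 steps")
```

Under this assumption, low strengths leave the unmasked input mostly intact because very few steps run, while a strength of 1.0 runs the full schedule.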

Results

All results were generated with the above code.

Text to Image

Image to Image

Inpaint
Image		Mask

a-r-r-o-w · 2023-12-16T16:01:52Z

@patrickvonplaten @sayakpaul Gentle ping for review. Do let me know if any changes are required, thanks. I'll update the community README to showcase use of img2img and inpaint methods soon.

xhinkerzhu

Super cool, I am good with the new changes

xhinker · 2023-12-16T20:09:49Z

I tested out your code. So far so good, and the results are stunningly good. Thank you @a-r-r-o-w

xhinker · 2023-12-16T20:12:44Z

Long prompt weighting has been amazing to use, but the XL version of the pipeline does not yet support img2img and inpaint, whereas the normal version does (here). I believe LPW is widely used, so this could be a good addition. From my testing so far, it doesn't seem to disrupt any of the existing functionality. Will add some examples here soon.

I like your samples. It would help with testing if more sample code were shared.

a-r-r-o-w · 2023-12-16T21:41:34Z

I like your samples. It would help with testing if more sample code were shared.

Hey @xhinker, thank you for the review and kind words! I've attached the code I used for generation in the same comment as the results (here). Is this what you mean?

xhinker · 2023-12-16T21:43:37Z

I like your samples. It would help with testing if more sample code were shared.

Hey @xhinker, thank you for the review and kind words! I've attached the code I used for generation in the same comment as the results (here). Is this what you mean?

Those are great samples. I have run them and all is good; I just thought you wanted to share more sample code 😊

a-r-r-o-w · 2023-12-16T21:52:50Z

I have this ready-to-use Colab notebook. Sorry, but I'm having a little trouble understanding what you mean by sharing more sample code 😅 As far as testing the pipeline is concerned, I've run some exhaustive experiments playing around with different generation parameters.

Some more results based on PartiPrompts

augmented with prompt weighting

xhinker · 2023-12-16T22:02:48Z

Will add some examples here soon.

You already shared the samples after this, so never mind; I thought you would share some more examples besides these.
With text2img, img2img, and inpaint all in one SDXL pipeline with weighted long prompts, I see huge potential for other work.

I wonder whether IP Adapter could also be added.

a-r-r-o-w · 2023-12-16T22:09:05Z

Updated this comment with more examples 😉 I actually have thousands of generations from your original LPW pipeline alone when trying to run benchmarks comparing against normal SDXL pipeline to understand how exactly prompt weighting affects quality of generations. I will soon upload as a huggingface dataset maybe 👀

Sure, we can work together on adding IP Adapter in another PR maybe?

xhinker · 2023-12-16T22:12:44Z

Updated this comment with more examples 😉 I actually have thousands of generations from your original LPW pipeline alone when trying to run benchmarks comparing against normal SDXL pipeline to understand how exactly prompt weighting affects quality of generations. I will soon upload as a huggingface dataset maybe 👀

Sure, we can work together on adding IP Adapter in another PR maybe?

I am occupied with other things right now, but I'm happy to work with you on the IP Adapter integration in another PR.

sayakpaul · 2023-12-17T02:43:01Z

@a-r-r-o-w sorry for the delay here. From the comments, I see that the results are quite nice. It would be good to mention this in the corresponding section of the README, along with some results.

I think we should then be good to merge.

HuggingFaceDocBuilderDev · 2023-12-17T02:49:33Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

a-r-r-o-w · 2023-12-17T18:46:00Z

@a-r-r-o-w sorry for the delay here. From the comments, I see that the results are quite nice. It would be good to mention this in the corresponding section of the README, along with some results.

I think we should then be good to merge.

Thanks! Not sure how I should link the generated images in the README. From other PRs, I've noticed that we have to open a PR to the HF documentation image hosting repo. Should I do that as well with some of these generations? For now, I've added a link to this PR in the README for someone wanting to see more results.

patrickvonplaten · 2023-12-18T18:19:14Z

Cool, good job!

* add img2img and inpaint support to lpw-xl
* update community README

---------

Co-authored-by: Sayak Paul <[email protected]>

a-r-r-o-w added 2 commits December 8, 2023 02:47

add img2img and inpaint support to lpw-xl

13a8c8c

Merge branch 'main' into lpwxl-img2img-inpaint

81601cb

a-r-r-o-w marked this pull request as ready for review December 16, 2023 15:59

xhinkerzhu approved these changes Dec 16, 2023


xhinker approved these changes Dec 16, 2023


Merge branch 'main' into lpwxl-img2img-inpaint

0e07077

a-r-r-o-w added 2 commits December 18, 2023 00:10

update community README

373b0b2

Merge branch 'main' into lpwxl-img2img-inpaint

423edbd

patrickvonplaten merged commit 67b3d32 into huggingface:main Dec 18, 2023
14 checks passed

a-r-r-o-w deleted the lpwxl-img2img-inpaint branch December 18, 2023 19:04

donhardman pushed a commit to donhardman/diffusers that referenced this pull request Dec 29, 2023

Support img2img and inpaint in lpw-xl (huggingface#6114)

b7f633a

* add img2img and inpaint support to lpw-xl
* update community README

---------

Co-authored-by: Sayak Paul <[email protected]>
