Support img2img and inpaint in lpw-xl #6114
Conversation
Long prompt weighting has been amazing to use, but the XL version of the pipeline does not yet support img2img and inpainting, whereas the normal version does (here). LPW is likely used a lot, so this could be a good addition. From my testing so far, it doesn't seem to disrupt any existing functionality. I will add some examples here soon.
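For readers unfamiliar with the `(text:weight)` syntax that appears throughout the prompts below, here is a simplified, purely illustrative sketch of how such annotations can be split out of a prompt. The real parser in the LPW community pipeline handles many more cases (nesting, escapes, `[text]` for down-weighting), so treat this as a toy model only:

```python
import re

def parse_weights(prompt: str):
    """Split a prompt into (text, weight) chunks.

    Only handles the flat ``(text:weight)`` form; everything else gets
    weight 1.0. Simplified illustration -- the real LPW parser also
    supports nesting, ``(text)`` as an implicit boost, and ``[text]``
    for down-weighting.
    """
    pattern = re.compile(r"\(([^():]+):([\d.]+)\)")
    pieces = []
    pos = 0
    for match in pattern.finditer(prompt):
        # plain text before the weighted chunk gets the default weight 1.0
        plain = prompt[pos:match.start()].strip(" ,")
        if plain:
            pieces.append((plain, 1.0))
        pieces.append((match.group(1).strip(), float(match.group(2))))
        pos = match.end()
    tail = prompt[pos:].strip(" ,")
    if tail:
        pieces.append((tail, 1.0))
    return pieces

print(parse_weights("a portrait, (sharp focus:1.3), soft light"))
# [('a portrait', 1.0), ('sharp focus', 1.3), ('soft light', 1.0)]
```

The per-chunk weights are then applied to the corresponding token embeddings, which is what lets `(snowflakes:1.2)` pull the generation toward that concept.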
The pipeline adds two new methods to the LPW SDXL class:

```python
import torch
from typing import List
from PIL import Image
from diffusers.pipelines import DiffusionPipeline
from diffusers.utils import load_image

model_id = "a-r-r-o-w/dreamshaper-xl-turbo"
pipe = DiffusionPipeline.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    variant="fp16",
    custom_pipeline="lpw_stable_diffusion_xl",
).to("cuda")

params = [
    ("an (extremely detailed cinematic digital painting masterpiece) (3/4 body shot) of one giant snowman in the distance absorbing the cool winter air wishing for a Christmas miracle, <lora:ChristmasWintery:0.8> ChristmasWinteryl surrounded by magical Christmas trees in a Christmas forest, bright glowing eyes, (Style-GravityMagic:1.1), (twisting swirls of snow in the background), (glowing aura), (energy), (snowing), (snowflakes:1.2), (windy), (moody color tones), (otherworldly:1.2), (elaborate textures:1.2), (soft lighting:1.2), medium depth of field, (8k UHD, Unreal Engine, highest quality)", 1024, 768),
    ("Photo of a beautiful greek goddess, RAW, ((red hair with extra long wavy), ((portrait)), (Detailed face: 1. 2)), (Detailed facial features)), fair-skinned, slim body, greek temple environment, detailed expressions, reflections, shot with a 50mm lens, f/2. 8, HDR, (8k), (movie lighting) (dramatic lighting) , (sharp focus), complex elements", 1024, 768),
    ("(full body, (dynamic pose)), muscular male android humanoid, flowing dark thick cloth cloak, dark metallic armour Burning Orange , neon parts, advanced civilization, towering structures, skyscrapers, immersed neon lights, fully detailed environment, cyberpunk, (rim lighting, studio lighting, distant moonlight, (night:1.1), bloom), (cinematic, best quality, masterpiece, ultra HD textures, highly detailed, hyper-realistic, intricate detail, 8k, photorealistic, concept art, matte painting, autodesk maya, vray render, ray tracing, hdr), (dslr, full frame, 16mm focal length, f/8 aperture, dynamic perspective, dynamic angle, wide field of view, deep depth of field) techwear, urbansamurai 3d, realistic, cyborg in a cyberhelmet head", 1024, 1024),
]
negative_prompt = '(nsfw:1.5),((3d, cartoon, anime, sketches)), (worst quality:2), (low quality:2), (normal quality:2), lowres, normal quality, ((monochrome)), ((grayscale)), bad anatomy, out of view, cut off, ugly, deformed, mutated, ((young)), EasyNegative, paintings, sketches, (worst quality:2), (low quality:2), (normal quality:2), lowres, normal quality, ((monochrome)), ((grayscale)), skin spots, acnes, skin blemishes, age spot, glans,extra fingers,fewer fingers,, "(ugly eyes, deformed iris, deformed pupils, fused lips and teeth:1.2), (un-detailed skin, semi-realistic, cgi, 3d, render, sketch, cartoon, drawing, anime:1.2), text, close up, cropped, out of frame, worst quality, low quality, jpeg artifacts, ugly, duplicate, morbid, mutilated, extra fingers, mutated hands, poorly drawn hands, poorly drawn face, mutation, deformed, blurry, dehydrated, bad anatomy, bad proportions, extra limbs, cloned face, disfigured, gross proportions, malformed limbs, missing arms, missing legs, extra arms, extra legs, fused fingers, too many fingers, long neck"'

def image_grid(images: List[Image.Image], rows: int, cols: int) -> Image.Image:
    if len(images) > rows * cols:
        raise ValueError(
            f"Number of images ({len(images)}) exceeds grid size ({rows}x{cols})."
        )
    w, h = images[0].size
    grid = Image.new("RGB", size=(cols * w, rows * h))
    for i, image in enumerate(images):
        grid.paste(image, box=(i % cols * w, i // cols * h))
    return grid

# Text-to-image
t2i_images = []
for param in params:
    prompt, height, width = param
    images = pipe.text2img(
        prompt=prompt,
        negative_prompt=negative_prompt,
        height=height,
        width=width,
        guidance_scale=3,
        num_inference_steps=8,
        num_images_per_prompt=1,
    ).images
    t2i_images.append(images[0])

# Image-to-image: restyle each text-to-image result three times
i2i_images = []
changes = ["(origami:1.5), artistic, paper art", "(nebulae, blackholes, quasars, stars, galatic, multiversal:1.6)", "(oil painting:1.5), artisitc, ((painting made by a 10 year old child))"]
for index in range(3):
    for param, img in zip(params, t2i_images):
        prompt, height, width = param
        # prepend the style change and keep only the first 15 words of the original prompt
        prompt = changes[index] + ', ' + ' '.join(prompt.split()[:15])
        images = pipe.img2img(
            prompt=prompt,
            negative_prompt=negative_prompt,
            image=img,
            height=height,
            width=width,
            guidance_scale=4,
            num_inference_steps=12,
            num_images_per_prompt=1,
            strength=0.8,
        ).images
        i2i_images.append(images[0])
grid_i2i = image_grid(i2i_images, 3, 3)

# Inpainting: sweep strength from 0.15 to 0.90
image = load_image("image.png")
mask = load_image("mask.png")
inpaint_images = []
for i in range(6):
    images = pipe.inpaint(
        prompt="Digital illustration of a samurai warrior in a duel against a (mystical:1.5) creature, (probably alien)",
        negative_prompt=negative_prompt,
        image=image,
        mask_image=mask,
        height=height,  # reuses the last height/width from the loop above
        width=width,
        guidance_scale=2,
        num_inference_steps=20,
        num_images_per_prompt=1,
        strength=(i + 1) * 0.15,
    ).images
    inpaint_images.append(images[0])
grid_inpaint = image_grid(inpaint_images, 2, 3)
```
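As a standalone sanity check (no model download needed, assuming only Pillow is installed), the `image_grid` helper can be exercised with solid-color placeholder images in place of pipeline outputs:

```python
from typing import List
from PIL import Image

def image_grid(images: List[Image.Image], rows: int, cols: int) -> Image.Image:
    # same helper as in the snippet, reproduced so this runs standalone
    if len(images) > rows * cols:
        raise ValueError(
            f"Number of images ({len(images)}) exceeds grid size ({rows}x{cols})."
        )
    w, h = images[0].size
    grid = Image.new("RGB", size=(cols * w, rows * h))
    for i, image in enumerate(images):
        # row-major placement: column index wraps every `cols` images
        grid.paste(image, box=(i % cols * w, i // cols * h))
    return grid

# Six 64x64 solid-color placeholders stand in for generated images.
colors = ["red", "green", "blue", "white", "black", "gray"]
placeholders = [Image.new("RGB", (64, 64), color) for color in colors]
grid = image_grid(placeholders, rows=2, cols=3)
print(grid.size)  # (192, 128): 3 columns * 64 wide, 2 rows * 64 tall
```

This confirms the row-major layout used for `grid_i2i` (3x3) and `grid_inpaint` (2x3) above: image `i` lands at column `i % cols`, row `i // cols`.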
@patrickvonplaten @sayakpaul Gentle ping for review. Let me know if any changes are required, thanks. I'll update the community README soon to showcase the img2img and inpaint methods.
Super cool, I am good with the new changes.
I tested your code, and so far so good; the results are stunningly good. Thank you @a-r-r-o-w
I like your samples; it would help with testing if more sample code were shared.
Those are great samples. I have run them and all is good; I thought you might want to share more sample code 😊
I have this ready-to-use Colab notebook. Sorry, but I'm having a little trouble understanding what you mean by sharing more sample code 😅 As far as testing the pipeline is concerned, I've run some exhaustive experiments playing around with different generation parameters. |
You already shared the samples after this; never mind, I thought you would share some more examples besides these. I'm also wondering if IP-Adapter could be added.
Updated this comment with more examples 😉 I actually have thousands of generations from your original LPW pipeline alone, from benchmarks comparing it against the normal SDXL pipeline to understand exactly how prompt weighting affects generation quality. I may upload them as a Hugging Face dataset soon 👀 Sure, we could work together on adding IP-Adapter in another PR?
I am occupied with other things now, but happy to work with you on the IP-Adapter integration in another PR.
@a-r-r-o-w sorry for the delay here. From the comments, I see that the results are quite nice. It would be good to mention this in the corresponding section of the README with some results. I think then we should be good to merge.
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update. |
Thanks! Not sure how I should link the generated images in the README. From other PRs, I've noticed that we have to open a PR to the HF documentation image hosting repo. Should I do that as well with some of these generations? For now, I've added a link to this PR in the README for someone wanting to see more results. |
Cool, good job! |
* add img2img and inpaint support to lpw-xl
* update community README

Co-authored-by: Sayak Paul <[email protected]>
What does this PR do?
Adds support for img2img and inpainting in the long prompt weighting (LPW) SDXL community pipeline.
Before submitting
See the documentation guidelines and the tips on formatting docstrings.
Who can review?
@patrickvonplaten @sayakpaul @xhinker