
change batch_size #26

Open
MDD-0928 opened this issue Sep 21, 2024 · 3 comments


@MDD-0928

Dear author:

DiffAttack is such a novel and meaningful work for the community. Thanks for your contribution!

I would like to ask whether I can change the batch_size when using the code, e.g., set the batch_size to 32 in order to generate 32 images at one time.

Also, which line of code in diff_attack_latent.py should I modify?
@WindVChen
Owner

Hi @MDD-0928 ,

The most straightforward way to implement batch parallelism is to use multiprocessing to handle the image batches. You can modify the line below:

```python
for ind, image_path in enumerate(all_images):
```

to split the processed images across different processes. This should be fairly simple and won't require changes to the core code in diff_attack_latent.py.
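For illustration, here is a minimal sketch of that split, assuming a hypothetical worker function attack_single_image that wraps the body of the existing loop (the worker name and the example image list are placeholders, not names from the repository):

```python
from multiprocessing import Pool

def attack_single_image(image_path):
    # Placeholder for the per-image attack logic currently inside the
    # "for ind, image_path in enumerate(all_images):" loop.
    print(f"Attacking {image_path}")

if __name__ == "__main__":
    all_images = ["img_0.png", "img_1.png", "img_2.png", "img_3.png"]
    # Pool.map distributes the image list across the worker processes,
    # so the images are effectively split between them.
    with Pool(processes=2) as pool:
        pool.map(attack_single_image, all_images)
```

Keep in mind that each process loads its own copy of the model, so memory use scales with the number of processes.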

Also, please make sure you have enough computing resources, as processing a single image currently requires about 16GB of memory.

Hope this helps!

@MDD-0928
Author

Thanks for your reply!

I would like to know why you define the prompt by

```python
prompt = [imagenet_label.refined_Label[label.item()] + " " + target_prompt] * 2
```

Why does the prompt need the "* 2"? And why does the text_encoder first take prompt[0] as input in line 294 and then take prompt as input in line 341?

@WindVChen
Owner

Hi @MDD-0928 ,

Sorry for the delayed reply.

The first prompt[0] is used to obtain the optimized uncond_embeddings for an empty prompt (i.e., ""), so we only need one latent for the calculation, corresponding to a single prompt text [text].

On the other hand, the second prompt, which contains two identical texts [text, text], is used for calculating the structure loss (refer to Section 3.4 of our paper). In this case, we require two latents: one is the original latent, and the other is the optimized latent. That's why we use a prompt with two text embeddings instead of prompt[0].
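To make the batch dimension concrete, here is a minimal sketch (not the repository's code; the CLIP checkpoint and variable names are illustrative stand-ins for the text encoder bundled with the Stable Diffusion checkpoint) showing how duplicating the prompt yields one text embedding per latent:

```python
from transformers import CLIPTokenizer, CLIPTextModel

tokenizer = CLIPTokenizer.from_pretrained("openai/clip-vit-large-patch14")
text_encoder = CLIPTextModel.from_pretrained("openai/clip-vit-large-patch14")

text = "goldfish a photo"  # stand-in for refined_Label[...] + " " + target_prompt
prompt = [text] * 2        # [text, text]

# prompt[0]: a single text -> batch size 1, paired with one latent
# (used when optimizing the uncond_embeddings).
ids_single = tokenizer([prompt[0]], padding="max_length",
                       max_length=tokenizer.model_max_length,
                       return_tensors="pt").input_ids
print(text_encoder(ids_single)[0].shape)  # torch.Size([1, 77, 768])

# prompt: two identical texts -> batch size 2, paired with two latents
# (original latent + optimized latent, for the structure loss).
ids_double = tokenizer(prompt, padding="max_length",
                       max_length=tokenizer.model_max_length,
                       return_tensors="pt").input_ids
print(text_encoder(ids_double)[0].shape)  # torch.Size([2, 77, 768])
```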

Hope this clears things up!
