Question about the text perturbation. #21
Hello, I'm really interested in your work! However, I have some questions about the adversarial attack with text perturbation. In Table 5, the adversarial attack with perturbation on the text alone still works, although it is not the most effective. Could you provide the reproduction code for this part? I'm curious about how such results are achieved, because it seems that text encoders (such as CLIP) don't inherently include semantics about adversarial noise. I would greatly appreciate your reply!

Comments
Hi @caosip, sorry for the late reply; I've been a bit busy lately. Regarding the code for the adversarial text perturbation in Appendix C, I'll try to locate it, but please don't get your hopes up too much: it's been a while, and I may no longer have it due to some file mismanagement. :( However, reproducing the results should be straightforward with some modifications to the code available in this repository. The mechanisms behind the text perturbation are explained in Appendix C. The key point is to make the text embeddings optimizable (for example, by enabling gradients on them, as in the sketch below). As for the comment about "text encoders (such as CLIP) not inherently including semantics about adversarial noise", I'm not entirely sure what is meant by "semantics about adversarial noise". My understanding is that since the prompt text guides the generation direction of the diffusion model, perturbing the prompt text can bias the generation direction and create images that deceive the classifier. Hope this helps!
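In case the original script never resurfaces, here is a minimal sketch of that idea. It is not the authors' actual implementation: `generate_image` stands in for a differentiable diffusion sampling step conditioned on the prompt embeddings, and `target_classifier` for the victim classifier; both names are hypothetical placeholders rather than functions from this repository.

```python
# Minimal sketch: optimize the prompt embeddings so that the generated image
# fools a downstream classifier (untargeted attack).
import torch

def text_perturbation_attack(text_embeddings, generate_image, target_classifier,
                             true_label, steps=10, lr=1e-2):
    # Detach the prompt embeddings from the text encoder and enable gradients,
    # so the optimizer can update them directly.
    adv_embeddings = text_embeddings.detach().clone().requires_grad_(True)
    optimizer = torch.optim.Adam([adv_embeddings], lr=lr)

    for _ in range(steps):
        optimizer.zero_grad()
        # Differentiable generation conditioned on the (perturbed) embeddings;
        # assumed to return a batch of one image.
        image = generate_image(adv_embeddings)
        logits = target_classifier(image)
        # Maximize the classifier's loss on the true label by minimizing its negative.
        loss = -torch.nn.functional.cross_entropy(
            logits, torch.tensor([true_label], device=logits.device))
        loss.backward()
        optimizer.step()

    return adv_embeddings.detach()
```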
Thank you @WindVChen, I initially thought the text perturbation was performed at the token level. I tried some discrete optimization methods, such as Greedy Coordinate Gradient-based search, to find suitable prompts for generating adversarial examples, but they worked poorly. That is why I suspected the text encoder may not have learned a representation of "adversarial noise." It seems that you are instead conducting adversarial attacks on the word embeddings; I'll try this following your paper :) Once again, thanks for your reply!
Yes, the discrete nature of prompt text offers a much more limited search space for adversarial perturbations, which can result in poorer performance. If you want to perturb discrete text, one approach is to find perturbations in the continuous embedding space and then select the nearest hard prompt; you might find this work useful as a reference. Even so, it might still fail to generate effective attacks.
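For what it's worth, a rough sketch of that projection step could look like the following. The names are assumptions for illustration only: `adv_embeddings` would be the optimized continuous prompt embeddings, and `token_embedding_matrix` the text encoder's token embedding table.

```python
import torch

def project_to_hard_prompt(adv_embeddings, token_embedding_matrix):
    """Map each continuous embedding to the nearest vocabulary token (cosine similarity).

    adv_embeddings:         (seq_len, dim) optimized prompt embeddings
    token_embedding_matrix: (vocab_size, dim) token embedding table of the text encoder
    Returns the token ids of the nearest hard prompt.
    """
    adv = torch.nn.functional.normalize(adv_embeddings, dim=-1)
    vocab = torch.nn.functional.normalize(token_embedding_matrix, dim=-1)
    similarity = adv @ vocab.T              # (seq_len, vocab_size)
    return similarity.argmax(dim=-1)        # nearest token id per position
```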