
Question about the text perturbation. #21

Open
caosip opened this issue Jul 23, 2024 · 3 comments

Comments

@caosip

caosip commented Jul 23, 2024

Hello, I'm really interested in your work! However, I have some questions about the adversarial attack with text perturbation. In Table 5, the adversarial attack with perturbation on the text alone also works, although it is not the most effective. Could you provide the reproduction code for this part? I'm curious how such results are achieved, because it seems that text encoders (such as CLIP) don't inherently include semantics about adversarial noise. I would greatly appreciate your reply!

@WindVChen
Owner

Hi @caosip,

Sorry for the late reply; I've been a bit busy lately.

Regarding the code for the adversarial text perturbation in Appendix C, I'll try to locate it, but please don't get your hopes up too much. It's been a while, and I may not have it anymore due to some file mismanagement. :(

However, reproducing the results should be straightforward by applying some modifications to the code available in this repository. The mechanisms behind the text perturbation are explained in Appendix C. The key point is to make the text embeddings optimizable, using something like nn.Parameter to backpropagate gradients and update the embeddings. If you need any specific details for your reproduction, feel free to ask.
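To make that concrete, here is a minimal sketch of the idea (not the original code; `pipe`, `classifier`, and `generate_image` are placeholders, with `pipe` assumed to be a Hugging Face Stable Diffusion pipeline and `generate_image` standing in for a generation step through which gradients can flow, as discussed in the paper):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def attack_text_embeddings(pipe, classifier, prompt, true_label, steps=50, lr=1e-3):
    """Optimize the prompt's text embeddings so the generated image fools `classifier`.

    Sketch only: `pipe` is assumed to be a Stable Diffusion pipeline and
    `classifier` a differentiable image classifier; `generate_image` is a
    placeholder for a differentiable generation step.
    """
    tokens = pipe.tokenizer(prompt, padding="max_length",
                            max_length=pipe.tokenizer.model_max_length,
                            return_tensors="pt")
    with torch.no_grad():
        embeds = pipe.text_encoder(tokens.input_ids.to(pipe.device))[0]

    # Key step: wrap the (normally fixed) text embeddings in nn.Parameter
    # so they receive gradients and can be updated directly.
    embeds = nn.Parameter(embeds.clone())
    optimizer = torch.optim.Adam([embeds], lr=lr)

    for _ in range(steps):
        image = generate_image(pipe, prompt_embeds=embeds)  # placeholder, must be differentiable
        loss = -F.cross_entropy(classifier(image), true_label)  # push away from the true class
        optimizer.zero_grad()
        loss.backward()  # gradients flow back into the text embeddings
        optimizer.step()

    return embeds.detach()
```

The only essential piece is the `nn.Parameter` wrap: the diffusion model, text encoder, and classifier all stay frozen, and only the embeddings themselves are updated.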

As for the comment about "text encoders (such as CLIP) not inherently including semantics about adversarial noise", I'm not entirely sure what is meant by "semantics about adversarial noise". My understanding is that since the prompt text guides the generation direction of the diffusion model, perturbing the prompt text can bias the generation direction and create images that might deceive the classifier.

Hope this helps!

@caosip
Author

caosip commented Jul 25, 2024

Thank you @WindVChen,

I initially thought that the text perturbation was performed at the token level. I tried some discrete optimization methods, such as Greedy Coordinate Gradient-based search, to find suitable prompts for generating adversarial examples, but they performed poorly. I therefore suspected the text encoder may not have learned a representation of "adversarial noise."

It seems that you are instead conducting the adversarial attack on the word embeddings. I'll try this approach following your paper :)

Once again, thanks for your reply!

@WindVChen
Owner

WindVChen commented Jul 25, 2024

Yes, the discrete nature of prompt text offers a more limited search space for adversarial perturbations, which can result in poorer performance. If you want to perturb the discrete text directly, one approach could be to find perturbations in the continuous embeddings and then select the nearest hard prompt. You might find this work useful as a reference. That said, it might still fail to produce effective attacks.
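For reference, a minimal sketch of that projection step (illustrative only; it assumes the perturbation was optimized at the input token-embedding level, and `embedding_table` is the text encoder's token embedding matrix, e.g. `text_encoder.get_input_embeddings().weight` in Hugging Face transformers):

```python
import torch
import torch.nn.functional as F

def nearest_hard_prompt(cont_embeds, embedding_table, tokenizer):
    """Snap optimized continuous embeddings (seq_len, dim) to the nearest
    vocabulary tokens by cosine similarity, yielding a discrete prompt."""
    cont = F.normalize(cont_embeds, dim=-1)        # (seq_len, dim)
    table = F.normalize(embedding_table, dim=-1)   # (vocab_size, dim)
    token_ids = (cont @ table.T).argmax(dim=-1)    # nearest token per position
    return tokenizer.decode(token_ids, skip_special_tokens=True)
```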
