
Question about the text perturbation. #21

Open
caosip opened this issue Jul 23, 2024 · 3 comments

Comments

@caosip

caosip commented Jul 23, 2024

Hello, I'm really interested in your work! However, I have some questions about the adversarial attack with text perturbation. In Table 5, the adversarial attack with perturbation on the text alone also works, although it is not the most effective. Could you provide the reproduction code for this part? I'm curious how such results are achieved, because it seems that text encoders (such as CLIP) don't inherently include semantics about adversarial noise. I would greatly appreciate your reply!

@WindVChen
Owner

Hi @caosip,

Sorry for the late reply; I've been a bit busy lately.

Regarding the code for the adversarial text perturbation in Appendix C, I'll try to locate it, but please don't get your hopes up too much. It's been a while, and I may not have it anymore due to some file mismanagement. :(

However, reproducing the results should be straightforward by applying some modifications to the code available in this repository. The mechanisms behind the text perturbation are explained in Appendix C. The key point is to make the text embeddings optimizable, using something like nn.Parameter to backpropagate gradients and update the embeddings. If you need any specific details for your reproduction, feel free to ask.
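To make that concrete, here is a minimal sketch of the idea (not the original code; `pipe`, `classifier`, and `generate_image` are placeholders, with `pipe` assumed to be a Hugging Face Stable Diffusion pipeline and `generate_image` standing in for a generation step through which gradients can flow, as discussed in the paper):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def attack_text_embeddings(pipe, classifier, prompt, true_label, steps=50, lr=1e-3):
    """Optimize the prompt's text embeddings so the generated image fools `classifier`.

    Sketch only: `pipe` is assumed to be a Stable Diffusion pipeline and
    `classifier` a differentiable image classifier; `generate_image` is a
    placeholder for a differentiable generation step.
    """
    tokens = pipe.tokenizer(prompt, padding="max_length",
                            max_length=pipe.tokenizer.model_max_length,
                            return_tensors="pt")
    with torch.no_grad():
        embeds = pipe.text_encoder(tokens.input_ids.to(pipe.device))[0]

    # Key step: wrap the (normally fixed) text embeddings in nn.Parameter
    # so they receive gradients and can be updated directly.
    embeds = nn.Parameter(embeds.clone())
    optimizer = torch.optim.Adam([embeds], lr=lr)

    for _ in range(steps):
        image = generate_image(pipe, prompt_embeds=embeds)  # placeholder, must be differentiable
        loss = -F.cross_entropy(classifier(image), true_label)  # push away from the true class
        optimizer.zero_grad()
        loss.backward()  # gradients flow back into the text embeddings
        optimizer.step()

    return embeds.detach()
```

The only essential piece is the `nn.Parameter` wrap: the diffusion model, text encoder, and classifier all stay frozen, and only the embeddings themselves are updated.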

As for the comment about "text encoders (such as CLIP) not inherently including semantics about adversarial noise", I'm not entirely sure what is meant by "semantics about adversarial noise". My understanding is that since the prompt text guides the generation direction of the diffusion model, perturbing the prompt text can bias the generation direction and create images that might deceive the classifier.

Hope this helps!

@caosip
Author

caosip commented Jul 25, 2024

Thank you @WindVChen,

I initially thought that the text perturbation was performed at the token level. I tried some discrete optimization methods, such as Greedy Coordinate Gradient-based search, to find suitable prompts for generating adversarial examples, but they performed poorly. I therefore suspected the text encoder may not have learned a representation of "adversarial noise."

It seems that you are instead conducting the adversarial attack on the word embeddings. I'll try this approach following your paper :)

Once again, thanks for your reply!

@WindVChen
Owner

WindVChen commented Jul 25, 2024

Yes, the discrete nature of prompt text offers a more limited search space for adversarial perturbations, which can result in poorer performance. If you want to perturb the discrete text directly, one approach could be to find perturbations in the continuous embeddings and then select the nearest hard prompt. You might find this work useful as a reference. That said, it might still fail to produce effective attacks.
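For reference, a minimal sketch of that projection step (illustrative only; it assumes the perturbation was optimized at the input token-embedding level, and `embedding_table` is the text encoder's token embedding matrix, e.g. `text_encoder.get_input_embeddings().weight` in Hugging Face transformers):

```python
import torch
import torch.nn.functional as F

def nearest_hard_prompt(cont_embeds, embedding_table, tokenizer):
    """Snap optimized continuous embeddings (seq_len, dim) to the nearest
    vocabulary tokens by cosine similarity, yielding a discrete prompt."""
    cont = F.normalize(cont_embeds, dim=-1)        # (seq_len, dim)
    table = F.normalize(embedding_table, dim=-1)   # (vocab_size, dim)
    token_ids = (cont @ table.T).argmax(dim=-1)    # nearest token per position
    return tokenizer.decode(token_ids, skip_special_tokens=True)
```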
