A Paper List for Localized Adversarial Patch Research

Changelog

07/2022: wrote two blog posts on adversarial patches and certifiably robust image classification (post1, post2)
03/2022: released the leaderboard for certified defenses for image classification against adversarial patches!!
11/2021: added explanations of different defense terminologies. added a few more recent papers.
08/2021: released the paper list!

What is the localized adversarial patch attack?

Unlike classic adversarial examples, which are constrained to have a small L_p norm distance to the normal examples, a localized adversarial patch attacker can arbitrarily modify the pixel values within a small region.

The attack algorithm is similar to that of the classic L_p adversarial example attack: you define a loss function and then optimize your perturbation to attain the attack objective. The only differences are that 1) you can only optimize over pixels within a small region, but 2) within that region, the pixel values can be arbitrary as long as they are valid pixels.
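As a rough illustration of these two constraints, the sketch below runs projected gradient steps against a toy linear classifier. The model, the image values, the patch location, the step size, and the names `logit` and `patch_attack` are all assumptions made for this example; real attacks optimize against a neural network with automatic differentiation and often add transformations for physical robustness.

```python
def logit(weights, bias, image):
    """Score of a toy linear classifier; a positive score means class 1."""
    return sum(w * p for w, p in zip(weights, image)) + bias


def patch_attack(weights, image, patch_idx, steps=50, lr=0.5):
    """Lower the logit by changing only the pixels listed in patch_idx.

    Pixels outside patch_idx are never touched (constraint 1); pixels inside
    may take any value in the valid range [0, 1] (constraint 2).
    """
    adv = list(image)
    for _ in range(steps):
        for i in patch_idx:
            # For a linear model, d(logit)/d(pixel i) is simply weights[i].
            stepped = adv[i] - lr * weights[i]
            # Project back onto the valid pixel range.
            adv[i] = min(1.0, max(0.0, stepped))
    return adv


n = 64  # a flattened 8x8 grayscale image (hypothetical)
weights = [((i * 7) % 11 - 5) / 5 for i in range(n)]
image = [0.5] * n
bias = 0.5 - logit(weights, 0.0, image)  # makes the clean logit about +0.5
patch_idx = list(range(16))  # attacker controls the first two rows
adv = patch_attack(weights, image, patch_idx)

print("clean logit:", logit(weights, bias, image))
print("patched logit:", logit(weights, bias, adv))
```

In this toy setup the 16 attacker-controlled pixels are enough to flip the sign of the logit, while the remaining 48 pixels stay exactly as they were.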

Example of localized adversarial patch attack (image from Brown et al.):

What makes this attack interesting?

It can be realized in the physical world!

Since all perturbations are within a small region, we can print the patch and attach it to objects in the physical world. This type of attack poses a real-world threat to ML systems!

Note: not all existing physically-realizable attacks are in the category of patch attacks, but the localized patch attack is (one of) the simplest and the most popular physical attacks.

About this paper list

Focus

Test-time attacks/defenses (do not consider localized backdoor triggers)
2D computer vision tasks (e.g., image classification, object detection, image segmentation)
Localized attacks (do not consider other physical attacks that are more "global", e.g., some stop sign attacks which require changing the entire stop sign background)

Organization

I first categorize the papers based on the task: image classification vs object detection (and semantic segmentation, and other tasks)
I next group papers for attacks vs defenses.
I tried to organize each group of papers chronologically. I consider two timestamps: the time when the preprint became available (e.g., on arXiv) and the time when a (published) paper was submitted for peer review.

I am actively developing this paper list (I haven't added notes for all papers). If you want to contribute to the paper list, add your paper, correct any of my comments, or share any of your suggestions, feel free to reach out :)

Defense Terminology

Empirically Robust Defenses vs Provably/Certifiably Robust Defenses

There are two categories of defenses that have different robustness guarantees.

To evaluate the robustness of an empirical defense, we use concrete attack algorithms to attack the defense. The evaluated robustness does not come with a formal security guarantee: it might be compromised by smarter attackers in the future.

To evaluate a certified defense, we need to develop a robustness certification procedure to determine whether the defense has certifiable/provable robustness for a given input against a given threat model. We need to formally prove that the certification results will hold for any attack within the threat model, including ones that have full knowledge of the defense algorithm, setup, etc.

Notes:

We use a certification procedure to evaluate the robustness of certified defenses; the certification procedure is agnostic to attack algorithms (i.e., it holds for any attack algorithm). As a bonus, this agnostic property lifts the burden of designing sophisticated adaptive attack algorithms for robustness evaluation. This is also why certified defenses usually do not provide "attack code" or "adversarial images" in their source code.
Strictly speaking, it is unfair to directly compare the performance of empirical defenses and certified defenses due to their different robustness notions.
1. Some empirical defenses might have (seemingly) high empirical robustness. However, those empirical defenses have zero certified robust accuracy, and their empirical robust accuracy might drop greatly given a smarter attacker.
2. On the other hand, the evaluated certified robustness is a provable lower bound on model performance against any empirical adaptive attack within the threat model.
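To make the difference between the two robustness notions concrete, here is a minimal sketch of how the corresponding metrics could be computed from per-image results. The `Record` fields and the five example records are hypothetical and only serve the illustration; they are not results from any paper.

```python
from dataclasses import dataclass


@dataclass
class Record:
    clean_correct: bool     # prediction on the clean image matches the label
    certified: bool         # the certification procedure succeeded on this image
    attacked_correct: bool  # prediction is still correct under ONE concrete attack


def clean_accuracy(records):
    return sum(r.clean_correct for r in records) / len(records)


def certified_robust_accuracy(records):
    # An image counts only if it is classified correctly AND certified.
    return sum(r.clean_correct and r.certified for r in records) / len(records)


def empirical_robust_accuracy(records):
    # Robustness against one specific attack; a stronger attack may lower it.
    return sum(r.attacked_correct for r in records) / len(records)


# Hypothetical results. A certified, correctly classified image must survive
# every attack within the threat model, so its attacked_correct is always True.
records = [
    Record(True, True, True),
    Record(True, True, True),
    Record(True, False, True),   # survived this attack, but without a guarantee
    Record(True, False, False),
    Record(False, False, False),
]

print(clean_accuracy(records))
print(certified_robust_accuracy(records))
print(empirical_robust_accuracy(records))
```

Because every certified image also survives any concrete attack, the certified number can never exceed the empirical one; the gap consists of images that happened to survive the chosen attack without a proof.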

Robust Prediction vs Attack Detection

There are also two different defense objectives (robustness notions).

Robust prediction (or prediction recovery) aims to always make correct decisions (e.g., correct classification label, correct bounding box detection), even in the presence of an attacker. It is also often called recovery-based defense.

Attack detection, on the other hand, only aims to detect an attack. If the defense detects an attack, it issues an alert and abstains from making predictions; if no attack is detected, it performs normal predictions. We can think of this type of defense as adding a special token "ALERT" to the model output space.

Notes:

Clearly, robust prediction is harder to achieve than attack detection. Moreover, the detect-and-alert setup might be problematic when human fallback is unavailable.
Nevertheless, we are still interested in studying attack detection defense. There is an interesting paper discussing the connection between these two types of defense.
How do we evaluate the robustness of an attack detection defense?
1. for clean images: perform the defense, and count a prediction as correct only when the output matches the ground truth (alerts in the clean setting are errors!!)
2. for adversarial images: perform the defense, and count a prediction as robust if the output is ALERT or matches the ground truth
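The two evaluation rules above can be sketched as follows. The `ALERT` token, the function names, and the example labels are assumptions made for illustration.

```python
ALERT = "ALERT"  # special output meaning "attack detected, abstain"


def detection_clean_accuracy(outputs, labels):
    """Clean setting: only an exact match counts; an ALERT is an error."""
    return sum(o == y for o, y in zip(outputs, labels)) / len(labels)


def detection_robust_accuracy(outputs, labels):
    """Adversarial setting: an ALERT or a correct label both count as robust."""
    return sum(o == ALERT or o == y for o, y in zip(outputs, labels)) / len(labels)


labels = ["cat", "dog", "cat", "bird"]
clean_outputs = ["cat", "dog", ALERT, "bird"]  # one false alert on clean data
adv_outputs = ["dog", ALERT, "dog", ALERT]     # two alerts, one correct, one fooled

print(detection_clean_accuracy(clean_outputs, labels))
print(detection_robust_accuracy(adv_outputs, labels))
```

Counting the false alert as an error in the clean setting is what prevents a defense from looking good by alerting on every input.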

Image Classification

Attacks

Accessorize to a Crime: Real and Stealthy Attacks on State-of-the-Art Face Recognition

CCS 2016

uses "adversarial glasses" to fool face recognition models

Is Deep Learning Safe for Robot Vision? Adversarial Examples against the iCub Humanoid

arXiv 1708; ICCV Workshop 2017

adversarial example attack against robot vision
discusses the concept of localized perturbations, which could mimic attaching a sticker in the real world

Adversarial Patch

arXiv 1712; NeurIPS workshop 2017

The first paper that explicitly introduces the concept of adversarial patch attacks
Demonstrate a universal physical world attack

LaVAN: Localized and Visible Adversarial Noise

arXiv 1801; ICML 2018

Seems to be concurrent work (?) with "Adversarial Patch"
Digital domain attack

Perceptual-Sensitive GAN for Generating Adversarial Patches

AAAI 2019

generate imperceptible patches.

Generate (non-software) Bugs to Fool Classifiers

AAAI 2020

PatchAttack: A Black-box Texture-based Attack with Reinforcement Learning

arXiv 2004; ECCV 2020

a black-box attack via reinforcement learning

Jacks of All Trades, Masters Of None: Addressing Distributional Shift and Obtrusiveness via Transparent Patch Attacks

arXiv 2005

transparent patch

Robust Physical-World Attacks on Face Recognition

arXiv 2011; Pattern Recognition

A Data Independent Approach to Generate Adversarial Patches

Springer 2021

data independent attack; attack via increasing the magnitude of feature values

Enhancing Real-World Adversarial Patches with 3D Modeling Techniques

arXiv 2102; Neurocomputing

use 3D modeling to enhance physical-world patch attack

Meaningful Adversarial Stickers for Face Recognition in Physical World

arXiv 2104

add stickers to face to fool face recognition system

Improving Transferability of Adversarial Patches on Face Recognition with Generative Models

arXiv 2106; CVPR 2021

focus on transferability

Inconspicuous Adversarial Patches for Fooling Image Recognition Systems on Mobile Devices

arXiv 2106; an old version is available at arXiv 2009; IEEE Internet of Things Journal

generate small (inconspicuous) and localized perturbations

Patch Attack Invariance: How Sensitive are Patch Attacks to 3D Pose?

arXiv 2108; ICCVW 2021

consider physical-world patch attack in the 3-D space (images are taken from different angles)

Robust Adversarial Attack Against Explainable Deep Classification Models Based on Adversarial Images With Different Patch Sizes and Perturbation Ratios

IEEE Access

use patch to attack classification models and explanation models

Adversarial Token Attacks on Vision Transformers

arXiv 2110; CVPR Workshop 2022

An analysis of perturbing part of tokens of ViT

One Thing to Fool them All: Generating Interpretable, Universal, and Physically-Realizable Adversarial Features

arXiv 2110

Generative Dynamic Patch Attack

BMVC 2021

learn patch content and location (instead of a fixed/random patch location)

Adversarial Mask: Real-World Adversarial Attack Against Face Recognition Models

arXiv 2111; ECML/PKDD 2022

physical world attack via wearing a weird mask

TnT Attacks! Universal Naturalistic Adversarial Patches Against Deep Neural Network Systems

arXiv 2111; TIFS

natural-looking patch attacks

Patch-Fool: Are Vision Transformers Always Robust Against Adversarial Perturbations?

ICLR 2022

not exactly a patch attack. use pixel patches to attack ViT

Defensive Patches for Robust Recognition in the Physical World

CVPR 2022

not exactly an attack. use patch for good purposes...

Adversarial Robustness is Not Enough: Practical Limitations for Securing Facial Authentication

IWSPA 2022

empirically analyze the robustness of face authentication against physical-world attacks

Surreptitious Adversarial Examples through Functioning QR Code

Journal of Imaging

combining QR code with adversarial patch

Adversarial Sticker: A Stealthy Attack Method in the Physical World

TPAMI

Adversarial Patch Attacks and Defences in Vision-based Task: A Survey

arXiv 2206

a survey; no experimental evaluation
Most (if not all) papers are covered in this paper.

an interesting paper that proposes a loss targeted at attention modules
studied both transformer-based image classification and object detection

Evaluating Model Robustness to Patch Perturbations

ICML 2022 Workshop

show that ViT is more robust to natural patch corruption, but more vulnerable to adversarial patch perturbations.

Shape Matters: Deformable Patch Attack

ECCV 2022

optimize the patch shape as well

Carpet-bombing patch: attacking a deep network without usual requirements

arXiv 2212

attacking the feature space. applicable to image classification, object detection, semantic segmentation
discuss the different behaviors of patch attacks and global perturbation attacks

Simultaneously Optimizing Perturbations and Positions for Black-box Adversarial Patch Attacks

TPAMI 2022

surrogate model + reinforcement learning. the patch position is from RL model.

Certified Defenses

Check out this leaderboard for certified robustness against adversarial patches!

The leaderboard provides a summary of all papers in this section! (todo: add ViP)

Certified Defenses for Adversarial Patches

ICLR 2020

The first certified defense.

Show that two previous empirical defenses (DW and LGS) are broken against an adaptive attacker
Adapt IBP (Interval Bound Propagation) for certified defense
Evaluate robustness against different shapes
Very expensive; only works for CIFAR-10 and small models

Clipped BagNet: Defending Against Sticker Attacks with Clipped Bag-of-features

IEEE S&P Workshop on Deep Learning Security 2020

Certified defense; clip BagNet features
Efficient

(De)Randomized Smoothing for Certifiable Defense against Patch Attacks

arXiv 2002; NeurIPS 2020

Certified defense; adapt ideas of randomized smoothing for $L_0$ adversary
Majority voting on predictions made from cropped pixel patches
Scale to ImageNet but expensive
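The voting-based certification idea can be sketched as below. This is a simplified illustration, not the paper's exact procedure: it assumes column-band ablation with one vote per band position, assumes that a patch of width `patch_width` can intersect at most `patch_width + band_width - 1` of those bands, and uses a strict-margin condition that is conservative with respect to tie-breaking. All names are hypothetical.

```python
from collections import Counter


def smoothed_predict_and_certify(votes, patch_width, band_width):
    """Majority vote over ablated views plus a vote-margin certificate.

    votes: one predicted class label per ablated view of the image.
    Returns (majority_label, certified).
    """
    # Upper bound on how many views a single patch can influence.
    affected = patch_width + band_width - 1

    counts = Counter(votes)
    # Sort by vote count (descending), then by label to break ties.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    top_label, top_count = ranked[0]
    runner_up = ranked[1][1] if len(ranked) > 1 else 0

    # Worst case: every affected view flips from the top class to one rival,
    # which shrinks the margin by 2 * affected.
    certified = top_count - runner_up > 2 * affected
    return top_label, certified


# Hypothetical vote vectors for a 32-column image, band width 4, patch width 5.
confident_votes = [3] * 28 + [1] * 4
uncertain_votes = [3] * 20 + [1] * 12

print(smoothed_predict_and_certify(confident_votes, patch_width=5, band_width=4))
print(smoothed_predict_and_certify(uncertain_votes, patch_width=5, band_width=4))
```

With these numbers a patch can touch at most 8 views, so the margin must exceed 16: the first vote vector (margin 24) is certified and the second (margin 8) is not, even though both produce the same majority label.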

Minority Reports Defense: Defending Against Adversarial Patches

arXiv 2004; ACNS workshop 2020

Certified defense for detecting an attack
Apply masks to the different locations of the input image and check inconsistency in masked predictions
Too expensive to scale to ImageNet (?)
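A heavily simplified sketch of the masking-and-consistency idea is shown below; the actual defense is more refined than this. The toy brightness classifier, the 6x6 image, and all function names are assumptions. The mask set is chosen so that `mask_size >= patch_size + stride - 1`, which guarantees that at least one mask fully covers any patch of the assumed size.

```python
ALERT = "ALERT"


def mask_positions(image_size, mask_size, stride):
    """Top-left corners of square masks, always including the last valid one."""
    last = image_size - mask_size
    coords = list(range(0, last + 1, stride))
    if coords[-1] != last:
        coords.append(last)
    return [(r, c) for r in coords for c in coords]


def apply_mask(image, top, left, mask_size, fill=0.0):
    """Return a copy of the image with one square region overwritten."""
    out = [row[:] for row in image]
    for r in range(top, top + mask_size):
        for c in range(left, left + mask_size):
            out[r][c] = fill
    return out


def detect_or_predict(classify, image, mask_size, stride):
    """Return the unanimous masked prediction, or ALERT on any disagreement."""
    preds = {
        classify(apply_mask(image, r, c, mask_size))
        for r, c in mask_positions(len(image), mask_size, stride)
    }
    return preds.pop() if len(preds) == 1 else ALERT


def classify(image):
    """Toy stand-in for a real model: thresholds the total brightness."""
    return "bright" if sum(map(sum, image)) > 8.0 else "dark"


clean = [[0.2] * 6 for _ in range(6)]
patched = [row[:] for row in clean]
for r in (1, 2):
    for c in (1, 2):
        patched[r][c] = 1.0  # a 2x2 patch with arbitrary valid pixel values

print(classify(patched))
print(detect_or_predict(classify, clean, mask_size=3, stride=2))
print(detect_or_predict(classify, patched, mask_size=3, stride=2))
```

The certification argument in this simplified setting: if every masked prediction on the clean image equals the true label, then for any patch within the size bound some mask removes it entirely, so the defense outputs either the true label or an alert.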

PatchGuard: A Provably Robust Defense against Adversarial Patches via Small Receptive Fields and Masking

arXiv 2005; USENIX Security 2021

Certified defense framework with two general principles: small receptive field to bound the number of corrupted features and secure aggregation for final robust prediction
BagNet for small receptive fields; robust masking for secure aggregation, which detects and masks malicious feature values
Subsumes several existing and follow-up papers

Efficient Certified Defenses Against Patch Attacks on Image Classifiers

Available on ICLR open review in 10/2020; ICLR 2021

Certified defense
BagNet to bound the number of corrupted features; Heaviside step function & majority voting for secure aggregation
Efficient, evaluate on different patch shapes

Certified Robustness against Physically-realizable Patch Attack via Randomized Cropping

Available on ICLR open review in 10/2020

Certified defense
Randomized image cropping + majority voting
only probabilistic certified robustness

PatchGuard++: Efficient Provable Attack Detection against Adversarial Patches

arXiv 2104; ICLR workshop 2021

Certified defense for detecting an attack
A hybrid of PatchGuard and Minority Reports

ScaleCert: Scalable Certified Defense against Adversarial Patches with Sparse Superficial Layers

NeurIPS 2021

certified defense for attack detection. a fun paper using ideas from both minority reports and PatchGuard++
The basic idea is to apply (pixel) masks and check prediction consistency
it further uses superficial important neurons (the neurons that contribute significantly to the shallow feature map values) to prune unimportant regions so that the number of masks is reduced.

PatchCleanser: Certifiably Robust Defense against Adversarial Patches for Any Image Classifier

arXiv 2108; USENIX Security 2022

Certified defense that is compatible with any state-of-the-art image classifier
huge improvements in clean accuracy and certified robust accuracy (its clean accuracy is close to SOTA image classifier)

Certified Patch Robustness via Smoothed Vision Transformers

arXiv 2110; CVPR 2022

Certified defense. ViT + De-randomized Smoothing
Drop tokens that correspond to pixel masks to greatly improve efficiency.

Towards Practical Certifiable Patch Defense with Vision Transformer

CVPR 2022

Certified defense. ViT + De-randomized Smoothing
A progressive training technique for smoothed ViT
Isolated (instead of global) self-attention to improve defense efficiency

Zero-Shot Certified Defense against Adversarial Patches with Vision Transformers

arXiv 2111

certified defense for attack detection.
The idea is basically Minority Reports with Vision Transformer
The evaluation of clean accuracy seems problematic... (feel free to correct me if I am wrong)
1. For a clean image, the authors consider the model prediction to be correct even when the defense issues an attack alert

ViP: Unified Certified Detection and Recovery for Patch Attack with Vision Transformers

ECCV 2022

certified defense for both robust prediction and attack detection.
The robust-prediction (recovery-based) defense is similar to smoothed ViT (but with a wider variety of "masking" strategies); the detection-based defense is similar to Minority Reports.
They proposed a generalized window mask to handle the two-patch problem.
The clean performance evaluation of the detection-based defense does not count false alerts as errors (the authors say they are working on a revision to fix this issue; I will add this defense to the leaderboard soon)

Revisiting Image Classifier Training for Improved Certified Robust Defense against Adversarial Patches

TMLR

better training technique for PatchCleanser

Architecture-agnostic Iterative Black-box Certified Defense against Adversarial Patches

A Majority Invariant Approach to Patch Robustness Certification for Deep Learning Models

http://scis.scichina.com/en/2022/170306.pdf https://ieeexplore.ieee.org/abstract/document/9897387

Check out this leaderboard for certified robustness against adversarial patches!

Empirical Defenses

On Visible Adversarial Perturbations & Digital Watermarking

CVPR workshop 2018

The first empirical defense. Use saliency map to detect and mask adversarial patches.

Local Gradients Smoothing: Defense against Localized Adversarial Attacks

arXiv 1807; WACV 2019

An empirical defense. Use pixel gradient to detect patch and smooth in the suspected regions.

Ally patches for spoliation of adversarial patches

Journal of Big Data

An empirical defense, make prediction on pixel patches, and do majority voting
This paper is somehow missed by almost all relevant papers in this field (probably due to its venue); it only has one self-citation. However, its idea is quite similar to some certified defenses that were published in 2019-2020

Robust Synthesis of Adversarial Visual Examples Using a Deep Image Prior

arXiv 1907; BMVC 2019

briefly discuss localized patch attacks

Defending Against Physically Realizable Attacks on Image Classification

arXiv 1909; ICLR 2020

Empirical defense via adversarial training
Interestingly show that adversarial training for patch attack does not hurt model clean accuracy
Only works on small images

SentiNet: Detecting Localized Universal Attacks Against Deep Learning Systems

arXiv 1812; IEEE S&P Workshop on Deep Learning Security 2020

Empirical defense that leverages the universality of the attack (inapplicable to non-universal attacks)

Detecting Patch Adversarial Attacks with Image Residuals

arXiv 2002

empirical defense

Adversarial Training against Location-Optimized Adversarial Patches

arXiv 2005; ECCV workshop 2020

empirical defense via adversarial training (in which the patch location is being optimized)
move the patch toward the direction where loss increases

Vax-a-Net: Training-time Defence Against Adversarial Patch Attacks

arXiv 2009; ACCV 2020

use conditional GAN to generate patch for adversarial training

Robustness Out of the Box: Compositional Representations Naturally Defend Against Black-Box Patch Attacks

arXiv 2012

empirical defense; directly use CompNet to defend against black-box patch attack (evaluated with PatchAttack)

Compositional Generative Networks and Robustness to Perceptible Image Changes

CISS 2021

An empirical defense against black-box patch attacks
A direct application of CompNet

Detecting Localized Adversarial Examples: A Generic Approach using Critical Region Analysis

arXiv 2102; INFOCOM 2021

empirical defense for attack detection

A Novel Lightweight Defense Method Against Adversarial Patches-Based Attacks on Automated Vehicle Make and Model Recognition Systems

Journal of Network and Systems Management

empirical defense. require the assumption of horizontal symmetry of the image. only applicable to a certain scenario.

Real-time Detection of Practical Universal Adversarial Perturbations

arXiv 2105

An empirical defense that uses the magnitude and variance of the feature map values to detect an attack
focus more on the universal attack (both localized patch and global perturbations)

Defending against Adversarial Patches with Robust Self-Attention

ICML UDL workshop

empirical defense. detect and remove outliers in ViT

Jujutsu: A Two-stage Defense against Adversarial Patch Attacks on Deep Neural Networks

old title: Turning Your Strength against You: Detecting and Mitigating Robust and Universal Adversarial Patch Attack

arXiv 2108; AsiaCCS 2023

empirical defense; use universality
similar to SentiNet with some improvements: detect the critical region and overlay it onto a different image

Defending Against Universal Adversarial Patches by Clipping Feature Norms

ICCV 2021

empirical defense via clipping feature norm.
Oddly, this paper does not cite Clipped BagNet

Efficient Training Methods for Achieving Adversarial Robustness Against Sparse Attacks

ICCV workshop 2021

Detecting Adversarial Patch Attacks through Global-local Consistency

MM workshop 2021

make predictions on small image regions. train a classifier on the predictions from different regions.
discuss "provable robustness"

ImageNet-Patch: A Dataset for Benchmarking Machine Learning Robustness against Adversarial Patches

arXiv 2203; ICML workshop 2022

A dataset for adversarial patches
Clarification from the authors: the main purpose of the ImageNet-Patch dataset is to provide a fast benchmark of models against patch attacks, but not strictly related to defenses against adversarial patches, which is why they did not cite any adversarial patch defense papers.
(I ran some experiments; the transferability to architectures like ViT and ResMLP seemed low)

Object Detection

Attacks

DPATCH: An Adversarial Patch Attack on Object Detectors

arXiv 1806; AAAI workshop 2019

The first (?) patch attack against object detector

Fooling automated surveillance cameras: adversarial patches to attack person detection

arXiv 1904; CVPR workshop 2019

using a rigid board printed with adversarial perturbations to evade the detection of a person

On Physical Adversarial Patches for Object Detection

arXiv 1906

interestingly show that a physical-world patch in the background (far away from the victim objects) can have malicious effect

use a non-rigid T-shirt to evade person detection

Making an Invisibility Cloak: Real World Adversarial Attacks on Object Detectors

arXiv 1910; ECCV 2020

wear an ugly T-shirt to evade person detection

APRICOT: A Dataset of Physical Adversarial Attacks on Object Detection

arXiv 1912; ECCV 2020

a dataset with annotated patch locations.

not really a conventional patch attack. more like a structured L0 attack
briefly discuss that a two-stage detector like Faster-RCNN is harder to attack than a one-stage detector like YOLO

Object Hider: Adversarial Patch Attack Against Object Detectors

arXiv 2010

add multiple small patches
use heatmap to find sensitive location to place the patches

Dynamic Adversarial Patch for Evading Object Detection Models

arXiv 2010

dynamic -- the patch location/position changes as the camera moves around

Exploring Adversarial Robustness of Multi-sensor Perception Systems in Self Driving

CoRL

attacking 3D and 2D perception at the same time

The Translucent Patch: A Physical and Universal Attack on Object Detectors

arXiv 2012; CVPR 2021

not really a patch attack. put a translucent patch in front of the camera

RPATTACK: Refined Patch Attack on General Object Detectors

arXiv 2103; ICME 2021

achieve high attack performance using a small number of pixels (e.g., 0.32% of pixels)
the adversarial pixels are scattered across the entire image and different objects, more like an L0 attack

Physical Adversarial Attacks on an Aerial Imagery Object Detector

arXiv 2108

You Cannot Easily Catch Me: A Low-Detectable Adversarial Patch for Object Detectors

arXiv 2109

attack against object detectors that can also evade attack-detection models.

Naturalistic Physical Adversarial Patch for Object Detectors

ICCV 2021

an improved attack from adversarial T-shirt. The patch looks more natural (e.g., a dog)

Legitimate Adversarial Patches: Evading Human Eyes and Detection Models in the Physical World

MM 2021

an improved attack from adversarial T-shirt. The patch looks more natural (e.g., an Ivysaur!)

Developing Imperceptible Adversarial Patches to Camouflage Military Assets From Computer Vision Enabled Technologies

Adversarial Texture for Fooling Person Detectors in the Physical World

CVPR 2022

consider cameras from different angles

Physical Adversarial Attack on a Robotic Arm

RA-L 2022

use adversarial patches to attack object detectors, in a robotic arm setting

demonstrate that existing physical patch attacks might not be that effective when considering other components of an autonomous car (e.g., object tracking)

Benchmarking Adversarial Patch Against Aerial Detection

arXiv 2210

TPatch: A Triggered Physical Adversarial Patch

USENIX Security 2023

physical adversarial patch triggered by acoustic signals

TransPatch: A Transformer-based Generator for Accelerating Transferable Patch Generation in Adversarial Attacks Against Object Detection Models

ECCV 2023

Certified Defenses

DetectorGuard: Provably Securing Object Detectors against Localized Patch Hiding Attacks

arXiv 2102; CCS 2021

The first certified defense for patch hiding attack
Adapt robust image classifiers for robust object detection
Provable robustness at a negligible cost of clean performance

ObjectSeeker: Certifiably Robust Object Detection against Patch Hiding Attacks via Patch-agnostic Masking

arXiv 2202

use pixel masks to remove the adversarial patch in a certifiably robust manner.
a significant improvement in certified robustness
also discuss different robustness notions

Empirical Defenses

Role of Spatial Context in Adversarial Robustness for Object Detection

arXiv 1910; CVPR workshop 2020

The first empirical defense, adding a regularization loss to constrain the use of spatial information
only experiment on YOLOv2 and small datasets like PASCAL VOC

Meta Adversarial Training against Universal Patches

arXiv 2101; ICML 2021 workshop

Adversarial YOLO: Defense Human Detection Patch Attacks via Detecting Adversarial Patches

arXiv 2103

Empirical defense via adding adversarial patches and a "patch" class during the training

We Can Always Catch You: Detecting Adversarial Patched Objects WITH or WITHOUT Signature

arXiv 2106

Two empirical defenses for patch hiding attack
Feed a small image region to the detector; grow the region with some heuristics; detect an attack when YOLO detects objects in a smaller region but misses objects in a larger expanded region.

Adversarial Pixel Masking: A Defense against Physical Attacks for Pre-trained Object Detectors

MM 2021

Empirical defense. Adversarially train a "MaskNet" to detect and mask the patch

Segment and Complete: Defending Object Detectors against Adversarial Patch Attacks with Robust Patch Detection

CVPR 2022

Defending From Physically-Realizable Adversarial Attacks Through Internal Over-Activation Analysis

arXiv 2203

use z-score in each layer to detect patches.

Defending Against Person Hiding Adversarial Patch Attack with a Universal White Frame

TIP

use a GAN-like training strategy to train a "white frame". The frame is attached to the image to invalidate patch attacks.

PatchZero: Defending against Adversarial Patch Attacks by Detecting and Zeroing the Patch

arXiv 2207

detect adversarial patches and remove them based on their different textures. consider different tasks like image classification, object detection, and video analysis

train a network to detect and mask the patch

Semantic Segmentation

Attacks

Indirect Local Attacks for Context-aware Semantic Segmentation Networks

ECCV 2020

indirect -- the malicious pixels do not overlap with the victim pixels/objects
interestingly shows that more advanced models are more vulnerable to "indirect" patch attacks, since they leverage more "context" information.

IPatch: A Remote Adversarial Patch

arXiv 2105

remote -- the patch is away from the victim pixels

attacking the feature space. applicable to image classification, object detection, semantic segmentation
discuss the different behaviors of patch attacks and global perturbation attacks

Defenses

Certified Defences Against Adversarial Patch Attacks on Semantic Segmentation

arXiv 2209

The first certified defense for image segmentation
Use different masking strategies for attack detection or robust prediction
Perform inpainting after masking to improve performance
Unfortunately, this paper does not count false alerts of detection-based defense as errors in the clean setting