Adversarial Attack Detection
14 papers with code • 0 benchmarks • 0 datasets
The task of detecting adversarial attacks on machine learning models.
Most implemented papers
Maximum Mean Discrepancy Test is Aware of Adversarial Attacks
It had previously been shown that the MMD test is unaware of adversarial attacks, i.e., that it fails to detect the discrepancy between natural and adversarial data; this paper argues the opposite once the test is properly instantiated.
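The upshot is that, with a suitable kernel, an MMD two-sample test does separate natural from adversarial batches. As a rough illustration only (a plain Gaussian-kernel MMD² estimate, not the specific test proposed in the paper):

```python
# Minimal (biased) Gaussian-kernel MMD^2 estimate between a batch of natural
# inputs and a batch of suspected adversarial inputs. A large value suggests
# the two batches come from different distributions.
import torch

def gaussian_kernel(x, y, bandwidth=1.0):
    d2 = torch.cdist(x, y) ** 2          # pairwise squared distances
    return torch.exp(-d2 / (2 * bandwidth ** 2))

def mmd2(natural, adversarial, bandwidth=1.0):
    x = natural.flatten(1)               # (n, d) flattened samples
    y = adversarial.flatten(1)           # (m, d)
    k_xx = gaussian_kernel(x, x, bandwidth).mean()
    k_yy = gaussian_kernel(y, y, bandwidth).mean()
    k_xy = gaussian_kernel(x, y, bandwidth).mean()
    return k_xx + k_yy - 2 * k_xy
```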
Is RobustBench/AutoAttack a suitable Benchmark for Adversarial Robustness?
In its most commonly reported sub-task, RobustBench evaluates and ranks the adversarial robustness of trained neural networks on CIFAR10 under AutoAttack (Croce and Hein 2020b) with l-inf perturbations limited to eps = 8/255.
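For reference, the evaluation this leaderboard reports is typically run with the authors' `autoattack` package along these lines (`model`, `x_test`, and `y_test` are placeholders for a trained classifier and CIFAR10 tensors):

```python
# Standard AutoAttack evaluation sketch: l-inf attacks with eps = 8/255.
from autoattack import AutoAttack

adversary = AutoAttack(model, norm='Linf', eps=8/255, version='standard')
x_adv = adversary.run_standard_evaluation(x_test, y_test, bs=128)
```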
Gotta Catch 'Em All: Using Honeypots to Catch Adversarial Attacks on Neural Networks
Attackers' optimization algorithms gravitate towards trapdoors, leading them to produce attacks similar to trapdoors in the feature space.
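Detection then reduces to comparing an input's internal activations against the recorded trapdoor signature. A minimal sketch of that idea (the signature, threshold, and feature extraction are assumptions, not the paper's exact procedure):

```python
# Flag inputs whose penultimate-layer features are suspiciously similar to the
# stored trapdoor signature (e.g., the mean activation over trapdoored samples).
import torch
import torch.nn.functional as F

def looks_trapped(features, trapdoor_signature, threshold=0.9):
    # features: (batch, d) activations; trapdoor_signature: (d,)
    sims = F.cosine_similarity(features, trapdoor_signature.unsqueeze(0), dim=1)
    return sims > threshold  # True => attack likely converged onto the trapdoor
```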
Reverse KL-Divergence Training of Prior Networks: Improved Uncertainty and Adversarial Robustness
Taking advantage of this new training criterion, the paper investigates using Prior Networks to detect adversarial attacks and proposes a generalized form of adversarial training.
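A minimal sketch of the reverse-KL criterion, assuming a Dirichlet Prior Network whose head outputs positive concentration parameters (variable names are illustrative, not the paper's code):

```python
# Reverse KL: KL(model || target) over Dirichlet distributions, instead of the
# forward KL(target || model) used in the original Prior Network training.
import torch
from torch.distributions import Dirichlet, kl_divergence

def reverse_kl_loss(alpha_pred, alpha_target):
    # alpha_pred:   (batch, K) predicted concentrations (already made positive)
    # alpha_target: (batch, K) sharp targets in-distribution, flat targets for
    #               out-of-distribution / adversarial training points
    return kl_divergence(Dirichlet(alpha_pred), Dirichlet(alpha_target)).mean()
```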
MetaAdvDet: Towards Robust Detection of Evolving Adversarial Attacks
To address this few-shot problem posed by evolving attacks, we propose a meta-learning-based robust detection method that detects new adversarial attacks from only a limited number of examples.
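Schematically, such a detector meta-trains an initialization that can be specialized to an unseen attack with a few gradient steps. The following is an illustrative MAML-style adaptation loop, not MetaAdvDet's released code:

```python
# Adapt a meta-trained natural-vs-adversarial classifier to a new attack from a
# small labeled support set, leaving the meta-initialization untouched.
import copy
import torch

def adapt_to_new_attack(detector, support_x, support_y, steps=5, lr=1e-2):
    fast = copy.deepcopy(detector)                 # task-specific copy
    opt = torch.optim.SGD(fast.parameters(), lr=lr)
    loss_fn = torch.nn.BCEWithLogitsLoss()
    for _ in range(steps):                         # few-shot inner loop
        opt.zero_grad()
        loss = loss_fn(fast(support_x).squeeze(1), support_y.float())
        loss.backward()
        opt.step()
    return fast                                    # detector for the new attack
```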
Towards Feature Space Adversarial Attack
We propose a new adversarial attack on Deep Neural Networks for image classification.
Two Souls in an Adversarial Image: Towards Universal Adversarial Example Detection using Multi-view Inconsistency
To this end, Argos first amplifies the discrepancies between the visual content of an image and its attack-induced misclassified label using a set of regeneration mechanisms, and then flags an image as adversarial if the reproduced views deviate from the original to a preset degree.
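A decision rule in that spirit might look like the following sketch, where the regeneration mechanisms, distance function, and threshold are all hypothetical placeholders:

```python
# Flag an image as adversarial when views regenerated from its predicted label
# deviate too much from the input (content/label inconsistency).
import torch

def detect_multiview(image, predicted_label, regenerators, distance, tau):
    # regenerators: callables reproducing the image conditioned on the label;
    # distance: callable returning a scalar tensor deviation score.
    deviations = [distance(image, g(image, predicted_label)) for g in regenerators]
    return torch.stack(deviations).mean() > tau
```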
Segment and Complete: Defending Object Detectors against Adversarial Patch Attacks with Robust Patch Detection
In addition, we design a robust shape completion algorithm, which is guaranteed to remove the entire patch from the images if the outputs of the patch segmenter are within a certain Hamming distance of the ground-truth patch masks.
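As a loose illustration of why a Hamming-distance tolerance helps, one simple (and much cruder) completion than the paper's algorithm is to dilate the predicted mask so that small segmentation errors still leave the whole patch covered:

```python
# Dilate the segmenter's binary patch mask with a square kernel, then blank out
# the covered pixels. This is a hedged stand-in for the paper's shape
# completion algorithm, not the method itself.
import torch
import torch.nn.functional as F

def complete_and_remove(image, patch_mask, dilate=5):
    # image: (1, C, H, W); patch_mask: (1, 1, H, W) binary mask
    kernel = torch.ones(1, 1, dilate, dilate)
    completed = (F.conv2d(patch_mask.float(), kernel, padding=dilate // 2) > 0)
    return image * (~completed).float()            # zero out the patch region
```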
Residue-Based Natural Language Adversarial Attack Detection
Many popular image adversarial-example detection approaches identify adversarial examples in embedding feature spaces, whereas existing state-of-the-art detection approaches in the NLP domain focus solely on input text features, without considering model embedding spaces.
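A sketch of one embedding-space "residue" feature in that spirit (the PCA basis and downstream classifier are placeholders, not necessarily the paper's exact construction):

```python
# Residue = the part of a model embedding left over after projecting onto its
# top-k principal directions; the residues feed a small detector classifier.
import torch

def residue_features(embeddings, pca_basis):
    # embeddings: (batch, d) hidden states; pca_basis: (k, d), rows orthonormal
    projection = embeddings @ pca_basis.T @ pca_basis
    return embeddings - projection
```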
Detecting Adversarial Examples in Batches -- a geometrical approach
Many deep learning methods have successfully solved complex tasks in computer vision and speech recognition applications.