We address the problem of detecting multi-armed adversarial attacks. Detection methods are generally validated under the assumption of a single, implicitly known attack strategy, which does not necessarily reflect real-life threats. This can lead to an overoptimistic assessment of a detector's performance and may bias comparisons between competing detection schemes. To overcome this limitation, we propose a novel multi-armed framework that evaluates detectors against several attack strategies, including three new objectives for generating attacks. The proposed performance metric is based on the worst-case scenario: detection is successful if and only if all the different attacks are correctly recognized. Moreover, following this setting, we formally derive a simple yet effective method to aggregate the decisions of multiple trained detectors, possibly provided by a third party. While each individual detector tends to underperform or fail on types of attack that it never saw at training time, our framework aggregates the knowledge of the available detectors to obtain a robust detection algorithm. The proposed method has several advantages: it is simple, as it does not require further training of the given detectors; it is modular, allowing existing (and future) methods to be merged into a single one; and it is general, since it can simultaneously recognize adversarial examples created according to different algorithms and training (loss) objectives.
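The worst-case metric described above can be illustrated with a minimal sketch. It assumes that, for each input sample, every attack strategy has produced one adversarial example and that a detector has already returned a boolean flag for each of them. The function names, the attack labels (`attack_a`, `attack_b`, `attack_c`), and the flag values are hypothetical and chosen only for illustration; they are not taken from the papers.

```python
from typing import Dict, List


def worst_case_detection(flags_per_attack: Dict[str, List[bool]]) -> List[bool]:
    """Return, per sample, whether ALL attacks on that sample were detected.

    flags_per_attack[name][i] is True when the detector flagged the
    adversarial example built from sample i with the attack `name`.
    """
    per_attack = list(flags_per_attack.values())
    if not per_attack:
        raise ValueError("at least one attack is required")
    n_samples = len(per_attack[0])
    if any(len(flags) != n_samples for flags in per_attack):
        raise ValueError("every attack must cover the same samples")
    # A sample counts as detected only if no attack slipped through.
    return [all(flags[i] for flags in per_attack) for i in range(n_samples)]


def worst_case_rate(flags_per_attack: Dict[str, List[bool]]) -> float:
    """Fraction of samples on which every attack was detected."""
    outcomes = worst_case_detection(flags_per_attack)
    if not outcomes:
        raise ValueError("at least one sample is required")
    return sum(outcomes) / len(outcomes)


# Hypothetical detector outputs for four samples and three attacks.
example_flags = {
    "attack_a": [True, True, False, True],
    "attack_b": [True, False, True, True],
    "attack_c": [True, True, True, True],
}

if __name__ == "__main__":
    print(worst_case_detection(example_flags))  # [True, False, False, True]
    print(worst_case_rate(example_flags))       # 0.5
```

In this toy example each attack taken alone is detected on at least 75% of the samples, yet the worst-case rate is only 50%, which shows how single-attack evaluation can overstate a detector's performance.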
[1] Federica Granese, Marco Romanelli, Pablo Piantanida: Optimal Zero-Shot Detector for Multi-Armed Attacks. AISTATS 2024 (to appear).
[2] Federica Granese, Marine Picot, Marco Romanelli, Francisco Messina, Pablo Piantanida: MEAD: A Multi-Armed Approach for Evaluation of Adversarial Examples Detectors. ECML/PKDD (3) 2022: 286-303.
SoSySec Seminar: Optimal Zero-Shot Detector for Multi-Armed Adversarial Attacks
Seminar
Starting on
Ending on
Location
IRISA Rennes
Room
Amphithéâtre Inria Rennes
Speaker
Federica Granese (IRD)
Main department