Artificial Intelligence is nowadays one of the most essential disciplines of computer science. These algorithms perform particularly well on Computer Vision tasks, especially classification. A classifier infers what an image represents. Nowadays Deep Neural Networks are largely used for these problems. These neural networks first undergo a training phase during which they are given many examples. These images are accompanied by labels: information on what the image represents. However, it was quickly found that the same logic used during the training phase could be used maliciously. This is the creation of Adversarial Examples through an Evasion Attack. Such examples are seemingly normal images. A human understands what it represents as if it was not manipulated. But the attacked classifier will make an incorrect prediction. In this manuscript, we study the creation of such examples, the reason for their existence, and the underlying vulnerability of classifiers. In particular, we study these examples in a realistic context. First, attacks are optimized (high success rate and low distortion). Second, we add the constraint that adversarial examples should be images. We thus work on spatially-quantized (PNG) or DCT-quantized images (JPEG).
Keywords: Deep Neural Networks, Adversarial Examples
- David PICARD, Professeur de l’École des Ponts ParisTech
- Gildas AVOINE, Institut National des Sciences Appliquées, Rennes
- Pascal FROSSARD, École Polytechnique Fédérale de Lausanne
- Cecilia PASQUINI, Fondazione Bruno Kessler