Abstract—Neural networks (NNs) have recently been attracting attention again as highly accurate classifiers under the name “deep learning,” which is applied in a wide variety of fields. However, these advanced machine learning algorithms are vulnerable to adversarial perturbations. Although humans cannot perceive them, these perturbations deliver a fatal blow to the estimation ability of classifiers. Thus, while humans perceive perturbed examples as being the same as the original natural examples, sophisticated classifiers identify them as completely different examples. Although several defensive measures against such adversarial examples have been suggested, they are known to fall into an undesirable phenomenon called gradient masking. Gradient masking can deprive adversaries of useful gradients, but adversarial perturbations tend to transfer across most models, so a masked model can still be deceived by adversarial examples crafted on other models; this is called a black-box attack. Therefore, it is necessary to develop training methods that withstand black-box attacks and to conduct studies that investigate the weak points of current NN training. This paper argues that no special defensive measures are necessary for an NN to fall into gradient masking; it is sufficient to slightly change the initial learning rate of Adam from the recommended value. Moreover, our experiment implies that gradient masking is a type of overfitting.
Index Terms—Adam, adversarial examples, gradient masking, machine learning, neural network.
The authors are with the Department of Computer Science, School of Computer, Tokyo Institute of Technology, Yokohama, 226-8503, Japan (e-mail: firstname.lastname@example.org, email@example.com).
Cite: Yusuke Yanagita and Masayuki Yamamura, "Gradient Masking Is a Type of Overfitting," International Journal of Machine Learning and Computing, vol. 8, no. 3, pp. 203-207, 2018.