Abstract—Recently, a number of methods for dynamic hand gesture recognition have been proposed. However, deploying such methods in practical applications still faces many challenges due to variations in viewpoint, complex backgrounds, and subject style. In this work, we thoroughly investigate the performance of advanced convolutional neural networks for the specific case of hand gestures and evaluate how robust they are to the above variations. To this end, we adopt an existing 3D convolutional neural network that was originally proposed for general human action recognition and achieved very competitive accuracy. We extend it to a two-stream architecture (RGB and optical flow) and apply transfer learning on our dataset of hand gestures. To evaluate the robustness of the method, we carefully design a multi-view dataset composed of five dynamic hand gestures in an indoor environment with complex backgrounds. Single-view and cross-view experiments on this dataset show that background and viewpoint have a strong impact on recognition robustness. In addition, the network's performance is mostly improved by multi-modality combination and the fine-tuning strategy. This analysis helps to make
Index Terms—Deep learning, convolutional neural network, dynamic hand gestures, optical flow, multi-view.
Dang-Manh Truong was with Hanoi University of Science and Technology, Vietnam (e-mail: email@example.com).
Huong-Giang Doan is with Electric Power University, Hanoi, Vietnam (e-mail: firstname.lastname@example.org).
Thanh-Hai Tran, Hai Vu, and Thi-Lan Le are with Hanoi University of Science and Technology, Vietnam (corresponding author: Thanh-Hai Tran; e-mail: email@example.com, firstname.lastname@example.org, Thi-Lan.Le@mica.edu.vn).
Cite: Dang-Manh Truong, Huong-Giang Doan, Thanh-Hai Tran, Hai Vu, and Thi-Lan Le, "Robustness Analysis of 3D Convolutional Neural Network for Human Hand Gesture Recognition," International Journal of Machine Learning and Computing vol. 9, no. 2, pp. 135-142, 2019.