In-situ identification and recognition of multi-hand gestures using optimized deep residual network

Bibliographic Details
Published in:	Journal of intelligent & fuzzy systems 2021-01, Vol.41 (6), p.6983-6997
Main Authors:	Rubin Bose, S., Sathiesh Kumar, V.
Format:	Article
Language:	eng
Subjects:	Activity recognition; Algorithms; Computer networks; Datasets; Feature extraction; Gesture recognition; Machine vision; Performance measurement; Real time; Virtual reality

Description
Summary:	Real-time perception of hand gestures in a deprived environment is a demanding machine vision task. Hand recognition becomes harder under varying illumination conditions and changing backgrounds. Robust recognition and classification are vital steps in supporting effective human-machine interaction (HMI), virtual reality, and similar applications. In this paper, real-time hand action recognition is performed using an optimized deep residual network model. It incorporates a RetinaNet model for hand detection and a depthwise separable convolutional (DSC) layer for precise hand gesture recognition. The proposed model overcomes the class imbalance problems encountered by conventional single-stage hand action recognition algorithms. The integrated DSC layer reduces the number of computational parameters and increases recognition speed. The model uses a ResNet-101 CNN architecture as the feature extractor. The model is trained and evaluated on the MITI-HD dataset and compared with the benchmark datasets (NUSHP-II, Senz-3D). The network achieved higher precision and recall at an IoU value of 0.5. The RetinaNet-DSC model with a ResNet-101 backbone network obtained higher precision (99.21% for AP0.5, 96.80% for AP0.75) on the MITI-HD dataset. Higher performance metrics are obtained for γ = 2 and α = 0.25. SGD with momentum outperformed the other optimizers (Adam, RMSprop) on the datasets considered in the study. The prediction time of the optimized deep residual network is 82 ms.
ISSN:	1064-1246 1875-8967
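
The summary reports its best results at γ = 2 and α = 0.25 without naming the loss function. These symbols match the focal loss that RetinaNet uses to counter class imbalance, so the sketch below assumes that is what is meant. It is a minimal, per-prediction illustration in plain Python, not the authors' implementation; the function names `focal_loss` and `cross_entropy` are hypothetical.

```python
import math


def cross_entropy(p, y):
    """Binary cross-entropy for one prediction.

    p: predicted probability of the positive class, 0 < p < 1
    y: ground-truth label, 1 (hand) or 0 (background)
    """
    p_t = p if y == 1 else 1.0 - p
    return -math.log(p_t)


def focal_loss(p, y, gamma=2.0, alpha=0.25):
    """Binary focal loss: -alpha_t * (1 - p_t)**gamma * log(p_t).

    gamma and alpha default to the values quoted in the summary;
    reading them as focal-loss parameters is an assumption.
    """
    p_t = p if y == 1 else 1.0 - p              # probability given to the true class
    alpha_t = alpha if y == 1 else 1.0 - alpha  # class-balancing weight
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


# A confidently correct background box contributes almost nothing,
# while a poorly classified hand keeps a large share of its loss.
easy_background = focal_loss(0.01, 0)
hard_hand = focal_loss(0.10, 1)
```

With these settings the modulating factor (1 - p_t)^γ shrinks the loss of well-classified examples, which is how a single-stage detector keeps the many easy background boxes from dominating training.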