A Multi-Scale Grasp Detector Based on Fully Matching Model

Bibliographic Details
Published in:	Computer modeling in engineering & sciences 2022, Vol.133 (2), p.281-301
Main Authors:	Yuan, Xinheng; Yu, Hao; Zhang, Houlin; Zheng, Li; Dong, Erbao; Wu, Heng’an
Format:	Article
Language:	English
Subjects:	Algorithms; Cameras; Classification; Color imagery; Datasets; Feature extraction; Feature maps; Frames per second; Grasping (robotics); Grippers; Model matching; Pixels; Rectangles; Sensors

Description
Summary:	Robotic grasping is an essential problem at both the household and industrial levels, and unstructured objects have always been difficult for grippers. Parallel-plate grippers, paired with algorithms that focus on partial information about objects, are a widely used approach. However, most works predict single-size grasp rectangles for fixed cameras and gripper sizes. In this paper, a multi-scale grasp detector is proposed to predict grasp rectangles of different sizes on RGB-D or RGB images in real time for hand-eye cameras and various parallel-plate grippers. The detector extracts feature maps at multiple scales and conducts predictions on each scale independently. To guarantee both efficiency and independence between scales, a fully matching model and a background classifier are applied in the network. Based on an analysis of the Cornell Grasp Dataset, the fully matching model can match all labeled grasp rectangles. Furthermore, background classification, along with angle classification and box regression, functions as both hard negative mining and a background predictor. The detector is trained and tested on the augmented dataset, which includes images of 320 × 320 pixels and grasp rectangles ranging from 20 to more than 320 pixels. It achieves up to 98.87% accuracy on the image-wise split and 97.83% on the object-wise split, at a speed of more than 22 frames per second. In addition, the detector, although trained on a single-object dataset, can predict grasps on multiple objects.
ISSN:	1526-1492 1526-1506
DOI:	10.32604/cmes.2022.021383