Loading…

Object Detection and Pose Estimation Based on Convolutional Neural Networks Trained with Synthetic Data

Instance-based object detection and fine pose estimation is an active research problem in computer vision. While the traditional interest-point-based approaches for pose estimation are precise, their applicability in robotic tasks relies on controlled environments and rigid objects with detailed tex...

Full description

Saved in:

Bibliographic Details
Main Authors:	Josifovski, Josip, Kerzel, Matthias, Pregizer, Christoph, Posniak, Lukas, Wermter, Stefan
Format:	Conference Proceeding
Language:	English
Subjects:	Data models Pose estimation Solid modeling Task analysis Three-dimensional displays Training Training data
Citations:	Items that cite this one
Online Access:	Request full text
Tags:	Add Tag No Tags, Be the first to tag this record!

Description
Summary:	Instance-based object detection and fine pose estimation is an active research problem in computer vision. While the traditional interest-point-based approaches for pose estimation are precise, their applicability in robotic tasks relies on controlled environments and rigid objects with detailed textures. CNN-based approaches, on the other hand, have shown impressive results in uncontrolled environments for more general object recognition tasks like category-based coarse pose estimation, but the need of large datasets of fully-annotated training images makes them unfavourable for tasks like instance-based pose estimation. We present a novel approach that combines the robustness of CNNs with a fine-resolution instance-based 3D pose estimation, where the model is trained with fully-annotated synthetic training data, generated automatically from the 3D models of the objects. We propose an experimental setup in which we can carefully examine how the model trained with synthetic data performs on real images of the objects. Results show that the proposed model can be trained only with synthetic renderings of the objects' 3D models and still be successfully applied on images of the real objects, with precision suitable for robotic tasks like object grasping. Based on the results, we present more general insights about training neural models with synthetic images for application on real-world images.
ISSN:	2153-0866
DOI:	10.1109/IROS.2018.8594379