Authors: Paul Schulz (OVGU Magdeburg), Thorsten Hempel (OVGU Magdeburg), Ayoub Al-Hamadi (OVGU Magdeburg)
✅ End-to-end automated dataset generation pipeline for monocular 3D detection and 6D pose estimation
✅ Combines mesh creation via neural rendering with SoTA synthetic dataset generation to create datasets for arbitrarily complex objects
✅ Capable of training performant 6D pose estimation models
✅ Requires minimal resources and minimal manual intervention
- Capturing 2D images of the target object using a rotating plate and a static camera
- Using Structure from Motion (SfM) for camera pose estimation
- Applying foreground extraction for object segmentation
- Training a Radiance Field to create meshes
- Refining meshes for high fidelity through vertex and face optimization
- Mapping a diffuse texture onto the refined mesh to produce a textured mesh
- Creating 3D scenes with virtual cameras, lighting, and background variations
- Performing automated annotation (bounding boxes, segmentation, etc.)
- Generating diverse datasets for robust training of pose estimation models
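
To illustrate the automated annotation step listed above, the following is a minimal sketch of how a 2D bounding box could be derived from a known object pose in a synthetic scene. It is not the pipeline's actual code: it assumes a simple pinhole camera without lens distortion, an object-centred box given by half extents, and a pose given as a 3x3 rotation matrix plus a translation in camera coordinates. The function name `project_box_to_2d` and all parameter names and example values are hypothetical.

```python
from itertools import product


def project_box_to_2d(half_extents, rotation, translation, fx, fy, cx, cy):
    """Project the 8 corners of an object-centred 3D box into the image.

    Assumes a pinhole camera (focal lengths fx, fy and principal point
    cx, cy in pixels) and a pose that maps object coordinates to camera
    coordinates. Returns the projected corners and the enclosing 2D box.
    """
    hx, hy, hz = half_extents
    corners_2d = []
    for sx, sy, sz in product((-1.0, 1.0), repeat=3):
        p = (sx * hx, sy * hy, sz * hz)
        # Object frame -> camera frame: X_cam = R * X_obj + t
        xc, yc, zc = (
            sum(rotation[i][j] * p[j] for j in range(3)) + translation[i]
            for i in range(3)
        )
        if zc <= 0:
            raise ValueError("box corner lies behind the camera")
        # Perspective division followed by the intrinsics
        corners_2d.append((fx * xc / zc + cx, fy * yc / zc + cy))
    us = [u for u, _ in corners_2d]
    vs = [v for _, v in corners_2d]
    return corners_2d, (min(us), min(vs), max(us), max(vs))


# Hypothetical example: a 1 m cube placed 2 m in front of the camera.
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
corners, bbox = project_box_to_2d(
    half_extents=(0.5, 0.5, 0.5),
    rotation=identity,
    translation=(0.0, 0.0, 2.0),
    fx=500.0, fy=500.0, cx=320.0, cy=240.0,
)
```

Because the object pose and camera parameters are known exactly in a rendered scene, labels of this kind can be computed without manual annotation. In practice the 2D box would also be clipped to the image bounds and checked for occlusion, which this sketch omits.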