2407.15484.md

6DGS: 6D Pose Estimation from a Single Image and a 3D Gaussian Splatting Model

We propose 6DGS to estimate the camera pose of a target RGB image given a 3D Gaussian Splatting (3DGS) model representing the scene. 6DGS avoids the iterative process typical of analysis-by-synthesis methods (e.g., iNeRF), which also require an initial camera pose in order to converge. Instead, our method estimates a 6DoF pose by inverting the 3DGS rendering process. Starting from the object surface, we define a radiant Ellicell that uniformly generates rays departing from each of the ellipsoids that parameterize the 3DGS model. Each Ellicell ray is associated with the rendering parameters of its ellipsoid, which in turn are used to obtain the best bindings between the target image pixels and the cast rays. These pixel-ray bindings are then ranked to select the best-scoring bundle of rays, whose intersection provides the camera center and, in turn, the camera rotation. The proposed solution obviates the need for an "a priori" pose for initialization, and it solves 6DoF pose estimation in closed form, without the need for iterations. Moreover, compared to the existing Novel View Synthesis (NVS) baselines for pose estimation, 6DGS improves the overall average rotational accuracy by 12% and translation accuracy by 22% on real scenes, despite not requiring any initialization pose. At the same time, our method operates in near real time, reaching 15 fps on consumer hardware.
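
The step in which the intersection of the selected rays yields the camera center can be illustrated with a small sketch. It is not the authors' implementation: it assumes a weighted least-squares formulation, in which the estimated center is the point minimizing the weighted sum of squared distances to all rays, and the function name `intersect_rays` and the `weights` parameter (standing in for the binding scores) are hypothetical. The solution is closed form, obtained by solving a single 3x3 linear system.

```python
import numpy as np


def intersect_rays(origins, directions, weights=None):
    """Least-squares intersection point of a bundle of 3D rays.

    Hypothetical helper for illustration: `weights` stands in for
    per-ray binding scores and defaults to uniform weighting.
    """
    origins = np.asarray(origins, dtype=float)
    directions = np.asarray(directions, dtype=float)
    # Normalize so that I - d d^T is an orthogonal projector.
    directions = directions / np.linalg.norm(directions, axis=1, keepdims=True)
    if weights is None:
        weights = np.ones(len(origins))
    weights = np.asarray(weights, dtype=float)

    # P_i = I - d_i d_i^T projects onto the plane orthogonal to ray i,
    # so ||P_i (x - o_i)|| is the distance from x to that ray.
    projectors = np.eye(3)[None] - directions[:, :, None] * directions[:, None, :]

    # Normal equations: (sum_i w_i P_i) x = sum_i w_i P_i o_i
    A = np.einsum("i,ijk->jk", weights, projectors)
    b = np.einsum("i,ijk,ik->j", weights, projectors, origins)
    return np.linalg.solve(A, b)


# Synthetic check: rays leaving four surface points toward a known center.
camera = np.array([1.0, -2.0, 3.0])
origins = np.array([[0.0, 0.0, 0.0],
                    [1.0, 0.0, 0.0],
                    [0.0, 1.0, 0.0],
                    [0.0, 0.0, 1.0]])
directions = camera - origins
estimate = intersect_rays(origins, directions)
```

With noise-free rays that all pass through the same point, `estimate` matches `camera` up to floating-point error. With noisy rays, the weights let higher-scoring rays pull the estimate more strongly, which is one plausible way to use a ranking of pixel-ray bindings.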
