
SfM-Free 3D Gaussian Splatting via Hierarchical Training

Standard 3D Gaussian Splatting (3DGS) relies on known or pre-computed camera poses and a sparse point cloud, obtained from structure-from-motion (SfM) preprocessing, to initialize and grow 3D Gaussians. We propose a novel SfM-Free 3DGS (SFGS) method for video input, eliminating the need for known camera poses and SfM preprocessing. Our approach introduces a hierarchical training strategy that trains and merges multiple 3D Gaussian representations -- each optimized for specific scene regions -- into a single, unified 3DGS model representing the entire scene. To compensate for large camera motions, we leverage video frame interpolation models. Additionally, we incorporate multi-source supervision to reduce overfitting and enhance representation. Experimental results reveal that our approach significantly surpasses state-of-the-art SfM-free novel view synthesis methods. On the Tanks and Temples dataset, we improve PSNR by an average of 2.25dB, with a maximum gain of 3.72dB in the best scene. On the CO3D-V2 dataset, we achieve an average PSNR boost of 1.74dB, with a top gain of 3.90dB.
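To make the hierarchical train-and-merge idea concrete, here is a minimal Python sketch of the control flow the abstract describes: train one local 3DGS model per short video segment, then repeatedly merge neighboring models until a single unified model covers the scene. All names here (`GaussianModel`, `train_local_gaussians`, `merge`) are hypothetical placeholders, not the authors' API; the real pipeline optimizes Gaussian parameters and camera poses photometrically, uses frame interpolation for large motions, and applies multi-source supervision, none of which is modeled below.

```python
# Toy sketch of hierarchical training and merging, assuming the structure
# implied by the abstract. Placeholder logic only: real merging would align
# and jointly fine-tune the Gaussians, not just fuse frame ranges.

from dataclasses import dataclass

@dataclass
class GaussianModel:
    """Stand-in for a 3DGS model: tracks only which frames it covers."""
    frame_range: tuple  # (first_frame, last_frame) of the covered segment

def train_local_gaussians(segment):
    # Placeholder: the real method would optimize Gaussians and relative
    # camera poses for one short, easy-to-register video segment.
    return GaussianModel(frame_range=(segment[0], segment[-1]))

def merge(a, b):
    # Placeholder: the paper merges neighboring region-specific models into
    # one model; here we only fuse the covered ranges to show control flow.
    return GaussianModel(frame_range=(a.frame_range[0], b.frame_range[1]))

def hierarchical_train(frames, segment_len=30):
    # 1) Train one local model per short video segment.
    segments = [frames[i:i + segment_len]
                for i in range(0, len(frames), segment_len)]
    models = [train_local_gaussians(s) for s in segments]
    # 2) Merge adjacent models level by level until one model remains.
    while len(models) > 1:
        models = [merge(models[i], models[i + 1])
                  if i + 1 < len(models) else models[i]
                  for i in range(0, len(models), 2)]
    return models[0]

if __name__ == "__main__":
    scene = hierarchical_train(list(range(120)))
    print(scene.frame_range)  # (0, 119): one unified model over all frames
```

The pairwise, level-by-level merge mirrors the "hierarchical" aspect: each level halves the number of models, so a video of N segments converges to a single scene-wide model in O(log N) merge rounds.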
