# 4D Gaussian Splatting: Modeling Dynamic Scenes with Native 4D Primitives

Dynamic 3D scene representation and novel view synthesis from captured videos are crucial for enabling the immersive experiences required by AR/VR and metaverse applications. However, this task is challenging due to the complexity of unconstrained real-world scenes and their temporal dynamics. In this paper, we frame dynamic scenes as a spatio-temporal 4D volume learning problem, offering a native, explicit reformulation with minimal assumptions about motion, which serves as a versatile dynamic scene learning framework. Specifically, we represent a target dynamic scene with a collection of 4D Gaussian primitives carrying explicit geometry and appearance features, dubbed 4D Gaussian splatting (4DGS). This approach captures relevant information in space and time by fitting the underlying spatio-temporal volume. By modeling spacetime as a whole with 4D Gaussians parameterized by anisotropic ellipsoids that can rotate arbitrarily in space and time, our model naturally learns view-dependent and time-evolving appearance with 4D spherindrical harmonics. Notably, our 4DGS model is the first solution that supports real-time rendering of high-resolution, photorealistic novel views of complex dynamic scenes. To enhance efficiency, we derive several compact variants that effectively reduce the memory footprint and mitigate the risk of overfitting. Extensive experiments validate the superiority of 4DGS in terms of visual quality and efficiency across a range of dynamic-scene tasks (e.g., novel view synthesis, 4D generation, scene understanding) and scenarios (e.g., single objects, indoor scenes, driving environments, synthetic and real data).
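To make the core idea concrete, the sketch below shows how a native 4D Gaussian primitive can be "sliced" at a query time to obtain the 3D Gaussian that is splatted for that frame. The conditioning formulas are standard multivariate-Gaussian identities (conditional mean, Schur-complement covariance, and a marginal temporal density that modulates opacity); the function name `slice_4d_gaussian` and the diagonal example covariance are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def slice_4d_gaussian(mu, Sigma, t):
    """Condition a 4D (x, y, z, t) Gaussian on a query time t, yielding
    the 3D Gaussian to splat at that frame plus a temporal opacity weight.

    mu    : (4,)   mean; the last component is time
    Sigma : (4, 4) covariance; assumed symmetric positive definite
    t     : scalar query time
    """
    mu_s, mu_t = mu[:3], mu[3]   # spatial / temporal means
    S_ss = Sigma[:3, :3]         # spatial covariance block
    S_st = Sigma[:3, 3]          # space-time cross-covariance
    S_tt = Sigma[3, 3]           # temporal variance

    # Standard Gaussian conditioning: the 3D slice has a time-shifted mean
    # (so the primitive appears to move) and a reduced covariance given by
    # the Schur complement of the temporal block.
    mean_3d = mu_s + S_st * (t - mu_t) / S_tt
    cov_3d = S_ss - np.outer(S_st, S_st) / S_tt

    # The marginal temporal density scales the primitive's opacity, so each
    # Gaussian fades in and out around its temporal center.
    w_t = np.exp(-0.5 * (t - mu_t) ** 2 / S_tt)
    return mean_3d, cov_3d, w_t

# Example: a primitive centered at t = 0.5 with a (hypothetical) diagonal
# covariance, queried slightly later in time.
mu = np.array([0.0, 0.0, 0.0, 0.5])
Sigma = np.diag([0.1, 0.1, 0.1, 0.05])
mean_3d, cov_3d, w_t = slice_4d_gaussian(mu, Sigma, t=0.6)
```

Because the cross-covariance block `S_st` couples space and time, a rotated 4D ellipsoid yields a slice whose mean translates linearly with t, which is how a single static primitive can represent locally linear motion without any explicit motion model.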
