2405.16822.md

Vidu4D: Single Generated Video to High-Fidelity 4D Reconstruction with Dynamic Gaussian Surfels

Video generative models are receiving particular attention for their ability to generate realistic and imaginative frames. These models have also been observed to exhibit strong 3D consistency, which significantly enhances their potential to act as world simulators. In this work, we present Vidu4D, a novel reconstruction model that accurately reconstructs 4D (i.e., sequential 3D) representations from a single generated video, addressing challenges associated with non-rigidity and frame distortion. This capability is pivotal for creating high-fidelity virtual content that maintains both spatial and temporal coherence. At the core of Vidu4D is our proposed Dynamic Gaussian Surfels (DGS) technique. DGS optimizes time-varying warping functions to transform Gaussian surfels (surface elements) from a static state to a dynamically warped state. This transformation enables a precise depiction of motion and deformation over time. To preserve the structural integrity of surface-aligned Gaussian surfels, we design a warped-state geometric regularization that uses continuous warping fields to estimate normals. Additionally, we learn refinements of the rotation and scaling parameters of the Gaussian surfels, which greatly alleviates texture flickering during warping and improves the capture of fine-grained appearance details. Vidu4D also contains a novel initialization state that provides a proper start for the warping fields in DGS. When Vidu4D is equipped with an existing video generative model, the overall framework demonstrates high-fidelity text-to-4D generation in both appearance and geometry.
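To make the idea of warping a surfel from a static state to a time-dependent state concrete, here is a minimal sketch. It is not the DGS method: the abstract does not specify the form of the warping functions, which are optimized rather than hand-written. The function `warp_surfel`, its parameters `angular_speed` and `velocity`, and the choice of a rigid rotation plus translation are all assumptions made for illustration. The sketch only shows that a surfel's center and its normal are transformed together, so the surface orientation follows the motion.

```python
import math


def rotation_z(angle):
    """Return a 3x3 rotation matrix about the z-axis as nested lists."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def mat_vec(m, v):
    """Multiply a 3x3 matrix by a 3-vector."""
    return [sum(m[i][j] * v[j] for j in range(3)) for i in range(3)]


def warp_surfel(center, normal, t,
                angular_speed=math.pi / 2, velocity=(0.0, 0.0, 1.0)):
    """Hypothetical rigid warp, standing in for a learned warping function.

    Rotates the static-state surfel about z by angular_speed * t,
    then translates its center by velocity * t.
    """
    r = rotation_z(angular_speed * t)
    warped_center = [c + v * t for c, v in zip(mat_vec(r, center), velocity)]
    # Under a rigid warp the normal is rotated but not translated.
    warped_normal = mat_vec(r, normal)
    return warped_center, warped_normal


# Static-state surfel at (1, 0, 0) facing along +x, evaluated at time t = 1.
center, normal = warp_surfel([1.0, 0.0, 0.0], [1.0, 0.0, 0.0], t=1.0)
print(center, normal)
```

At `t = 0` the warp is the identity, so the surfel stays in its static state. A non-rigid warping field would vary the transform across space as well as time, which is why keeping the normals consistent with the warp calls for a dedicated regularization.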

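Video generative models have attracted considerable attention for their ability to generate realistic and imaginative frames. These models also show strong 3D consistency, which greatly increases their potential as world simulators. In this study, we propose Vidu4D, a novel reconstruction model that accurately reconstructs 4D (i.e., sequential 3D) representations from a single generated video, addressing the challenges of non-rigidity and frame distortion. This capability is essential for creating high-fidelity virtual content with spatial and temporal consistency. At the core of Vidu4D is our proposed Dynamic Gaussian Surfels (DGS) technique. DGS optimizes time-varying warping functions to transform Gaussian surfels from a static state to a dynamically warped state, enabling a precise depiction of motion and deformation. To preserve the structural integrity of surface-aligned Gaussian surfels, we design a geometric regularization based on continuous warping fields for estimating normals. In addition, we learn refinements of the rotation and scaling parameters of the Gaussian surfels, which greatly reduces texture flickering during warping and improves the capture of fine appearance details. Vidu4D also includes a novel initialization state that gives the warping fields in DGS a reasonable starting point. Combined with an existing video generative model, the overall framework demonstrates high-fidelity text-to-4D generation in both appearance and geometry. This result opens a new path toward generating consistent and accurate 4D content.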