2412.02684.md

AniGS: Animatable Gaussian Avatar from a Single Image with Inconsistent Gaussian Reconstruction

Generating animatable human avatars from a single image is essential for various digital human modeling applications. Existing 3D reconstruction methods often struggle to capture fine details in animatable models, while generative approaches for controllable animation, though avoiding explicit 3D modeling, suffer from viewpoint inconsistencies in extreme poses and computational inefficiencies. In this paper, we address these challenges by leveraging the power of generative models to produce detailed multi-view canonical pose images, which help resolve ambiguities in animatable human reconstruction. We then propose a robust method for 3D reconstruction of inconsistent images, enabling real-time rendering during inference. Specifically, we adapt a transformer-based video generation model to generate multi-view canonical pose images and normal maps, pretraining on a large-scale video dataset to improve generalization. To handle view inconsistencies, we recast the reconstruction problem as a 4D task and introduce an efficient 3D modeling approach using 4D Gaussian Splatting. Experiments demonstrate that our method achieves photorealistic, real-time animation of 3D human avatars from in-the-wild images, showcasing its effectiveness and generalization capability.

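Generating an animatable human avatar from a single image is essential to many digital human modeling applications. However, existing 3D reconstruction methods often struggle to capture fine details in animatable models, while generative approaches to controllable animation, although they avoid explicit 3D modeling, are prone to viewpoint inconsistency under extreme poses and are computationally inefficient. To address these problems, this paper leverages the power of generative models to produce detailed multi-view canonical pose images, which reduces ambiguity in animatable human reconstruction. We then propose a robust 3D reconstruction method for inconsistent images that enables real-time rendering at inference. Specifically, we adapt a Transformer-based video generation model to generate multi-view canonical pose images and normal maps, and pretrain it on a large-scale video dataset to improve generalization. To handle view inconsistency, we recast the reconstruction problem as a 4D task and introduce an efficient 3D modeling approach based on **4D Gaussian Splatting**. Experiments show that the method generates photorealistic 3D human avatar animation from in-the-wild images with real-time rendering, demonstrating its effectiveness and generalization capability and offering an innovative and efficient solution for single-image-driven 3D human modeling.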
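
The abstract describes recasting reconstruction from inconsistent views as a 4D task, but gives no formulation. The toy sketch below is one way to picture the idea and is not the paper's method: it assumes each generated view plays the role of a time step, and that a single point is modeled as a shared canonical position plus a small per-view offset that absorbs the inconsistency. The function name `fit_canonical_with_offsets`, the penalty weight `lam`, and the quadratic objective are all hypothetical choices made for illustration.

```python
from statistics import fmean


def fit_canonical_with_offsets(observations, lam=1.0):
    """Toy model (an assumption, not the AniGS formulation).

    Minimizes, over a canonical point mu and per-view offsets d_t,
        sum_t ||mu + d_t - x_t||^2 + lam * sum_t ||d_t||^2
    where x_t is the position of the same point as seen in view t.
    The closed-form minimizer is mu = mean(x_t), d_t = (x_t - mu) / (1 + lam).
    """
    if lam <= 0:
        raise ValueError("lam must be positive")
    dims = range(len(observations[0]))
    # Shared canonical position: the part that would be kept and animated.
    canonical = tuple(fmean(obs[d] for obs in observations) for d in dims)
    # Per-view offsets: the part indexed by the extra (fourth) dimension.
    shrink = 1.0 / (1.0 + lam)
    offsets = [
        tuple(shrink * (obs[d] - canonical[d]) for d in dims)
        for obs in observations
    ]
    return canonical, offsets


# Two views disagree about where the same point lies along x.
canonical, offsets = fit_canonical_with_offsets([(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)], lam=1.0)
print(canonical)  # shared position
print(offsets)    # per-view corrections
```

In this toy setting a larger `lam` pushes the offsets toward zero, so more of the disagreement is left as residual error, while a smaller `lam` lets the view-indexed offsets explain almost all of it.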