2502.11642.md

GaussianMotion: End-to-End Learning of Animatable Gaussian Avatars with Pose Guidance from Text

In this paper, we introduce GaussianMotion, a novel human rendering model that generates fully animatable scenes aligned with textual descriptions using Gaussian Splatting. Although existing methods achieve reasonable text-to-3D generation of human bodies using various 3D representations, they often face limitations in fidelity and efficiency, or primarily focus on static models with limited pose control. In contrast, our method generates fully animatable 3D avatars by combining deformable 3D Gaussian Splatting with text-to-3D score distillation, achieving high fidelity and efficient rendering for arbitrary poses. By densely generating diverse random poses during optimization, our deformable 3D human model learns to capture a wide range of natural motions distilled from a pose-conditioned diffusion model in an end-to-end manner. Furthermore, we propose Adaptive Score Distillation that effectively balances realistic detail and smoothness to achieve optimal 3D results. Experimental results demonstrate that our approach outperforms existing baselines by producing high-quality textures in both static and animated results, and by generating diverse 3D human models from various textual inputs.

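The abstract describes optimizing a deformable 3D model by sampling random poses and distilling guidance from a pose-conditioned diffusion model. The toy sketch below illustrates only the general shape of such a score distillation loop; it is not the paper's implementation and does not reproduce Adaptive Score Distillation. Everything in it is an assumption made for illustration: the two-parameter "avatar" `theta`, the hypothetical `render` and `pose_offset` functions, the closed-form `toy_noise_predictor` (which stands in for a diffusion model whose target is a single known point per pose), and the weighting `1 - alpha_bar`.

```python
import numpy as np


def pose_offset(pose):
    """Hypothetical deformation: a pose-dependent shift of the canonical values."""
    return np.array([np.sin(pose), np.cos(pose)])


def render(theta, pose):
    """Hypothetical renderer: canonical parameters plus the pose deformation."""
    return theta + pose_offset(pose)


def toy_noise_predictor(x_t, alpha_bar, target):
    """Stand-in for a pose-conditioned diffusion model.

    Assumes the data distribution is a point mass at `target`, for which the
    exact noise prediction has this closed form.
    """
    return (x_t - np.sqrt(alpha_bar) * target) / np.sqrt(1.0 - alpha_bar)


def sds_step(theta, canonical_target, rng, lr=0.05):
    """One score-distillation update on the canonical parameters."""
    pose = rng.uniform(-np.pi, np.pi)        # random pose for this iteration
    alpha_bar = rng.uniform(0.05, 0.95)      # random noise level
    eps = rng.standard_normal(theta.shape)   # injected Gaussian noise

    x = render(theta, pose)
    x_t = np.sqrt(alpha_bar) * x + np.sqrt(1.0 - alpha_bar) * eps

    # What the toy pose-conditioned prior expects to see in this pose.
    target = canonical_target + pose_offset(pose)
    eps_hat = toy_noise_predictor(x_t, alpha_bar, target)

    weight = 1.0 - alpha_bar                 # assumed weighting w(t)
    # d(render)/d(theta) is the identity here, so no extra Jacobian is needed.
    grad = weight * (eps_hat - eps)
    return theta - lr * grad


def optimize(steps=2000, seed=0):
    """Run the toy loop and return the learned and target canonical parameters."""
    rng = np.random.default_rng(seed)
    canonical_target = np.array([1.0, -2.0])
    theta = np.zeros(2)
    for _ in range(steps):
        theta = sds_step(theta, canonical_target, rng)
    return theta, canonical_target
```

Calling `optimize()` returns the learned parameters together with the toy target. Because each update sees a different random pose but the guidance always agrees on the same canonical values, the parameters are pulled toward a single pose-independent solution, which is the intuition behind training an animatable model across many sampled poses.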