Gaussian splatting has recently received growing attention for static scene rendering. Because explicit representations have low computational overhead and are inherently flexible, plane-based explicit methods are a popular way to predict deformations in Gaussian-based dynamic scene rendering models. However, plane-based methods rely on an inappropriate low-rank assumption and excessively decompose the space-time 4D encoding, which causes excessive feature overlap and unsatisfactory rendering quality. To tackle these problems, we propose Grid4D, a dynamic scene rendering model based on Gaussian splatting that uses a novel explicit encoding of the 4D input built on hash encoding. Unlike plane-based explicit representations, we decompose the 4D encoding into one spatial and three temporal 3D hash encodings without the low-rank assumption. Additionally, we design a novel attention module that generates attention scores in a directional range to aggregate the spatial and temporal features. Conditioned on the spatially encoded features, the directional attention enables Grid4D to fit the diverse deformations of distinct scene components more accurately. Moreover, to mitigate the inherent lack of smoothness in explicit representation methods, we introduce a smoothness regularization term that keeps the model from producing chaotic deformation predictions. Our experiments demonstrate that Grid4D significantly outperforms state-of-the-art models in visual quality and rendering speed.
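As a rough illustration of the decomposition described above, the following NumPy sketch splits a 4D input (x, y, z, t) into one spatial and three temporal 3D coordinates, encodes each with its own hash grid, and fuses the results with a signed attention score. Several details are assumptions not stated in the abstract: the temporal triples (x, y, t), (x, z, t), (y, z, t), the single-resolution grid, the hashing primes, the `tanh` range for the attention scores, and a single linear map standing in for a learned module. The names `HashGrid3D`, `decompose_4d`, `encode`, and `directional_attention` are hypothetical and are not the authors' implementation.

```python
import numpy as np

# Hashing primes in the style of common spatial hash encodings (an assumption).
PRIMES = np.array([1, 2654435761, 805459861], dtype=np.uint64)


class HashGrid3D:
    """Single-resolution 3D hash grid with trilinear interpolation (a sketch)."""

    def __init__(self, resolution=16, table_size=2**14, feat_dim=2, seed=0):
        rng = np.random.default_rng(seed)
        self.resolution = resolution
        self.table_size = table_size
        self.table = rng.uniform(-1e-4, 1e-4, size=(table_size, feat_dim))

    def _hash(self, corners):
        # corners: (N, 3) non-negative integer vertex indices.
        c = corners.astype(np.uint64) * PRIMES  # wraps modulo 2**64
        h = c[..., 0] ^ c[..., 1] ^ c[..., 2]
        return (h % np.uint64(self.table_size)).astype(np.int64)

    def __call__(self, coords):
        # coords: (N, 3), expected in [0, 1].
        scaled = np.clip(coords, 0.0, 1.0) * (self.resolution - 1)
        base = np.floor(scaled).astype(np.int64)
        frac = scaled - base
        out = np.zeros((coords.shape[0], self.table.shape[1]))
        for offset in np.ndindex(2, 2, 2):
            offset = np.array(offset)
            # Trilinear weight of this cell corner for every query point.
            weight = np.prod(np.where(offset == 1, frac, 1.0 - frac), axis=1)
            out += weight[:, None] * self.table[self._hash(base + offset)]
        return out


def decompose_4d(xyzt):
    """Split (N, 4) inputs into one spatial and three temporal 3D inputs."""
    xyz = xyzt[:, [0, 1, 2]]
    xyt = xyzt[:, [0, 1, 3]]  # assumed temporal triple
    xzt = xyzt[:, [0, 2, 3]]  # assumed temporal triple
    yzt = xyzt[:, [1, 2, 3]]  # assumed temporal triple
    return xyz, (xyt, xzt, yzt)


def encode(xyzt, spatial_grid, temporal_grids):
    """Encode each 3D input with its own hash grid."""
    xyz, temporal_coords = decompose_4d(xyzt)
    spatial_feat = spatial_grid(xyz)
    temporal_feat = np.concatenate(
        [grid(c) for grid, c in zip(temporal_grids, temporal_coords)], axis=1
    )
    return spatial_feat, temporal_feat


def directional_attention(spatial_feat, temporal_feat, weight):
    """Scale temporal features by signed scores computed from spatial features.

    `weight` is a hypothetical linear map standing in for a learned module;
    tanh keeps the scores in (-1, 1), which is an assumed reading of
    "directional range".
    """
    scores = np.tanh(spatial_feat @ weight)
    return scores * temporal_feat


rng = np.random.default_rng(0)
xyzt = rng.uniform(size=(5, 4))  # five sample points (x, y, z, t) in [0, 1]
spatial_grid = HashGrid3D(seed=1)
temporal_grids = [HashGrid3D(seed=s) for s in (2, 3, 4)]
spatial_feat, temporal_feat = encode(xyzt, spatial_grid, temporal_grids)
weight = rng.normal(size=(spatial_feat.shape[1], temporal_feat.shape[1]))
fused = directional_attention(spatial_feat, temporal_feat, weight)
print(spatial_feat.shape, temporal_feat.shape, fused.shape)
```

In this sketch each of the four grids indexes a full 3D volume, so no factorization into 2D planes is involved. A practical model would presumably use multi-resolution grids and trainable parameters, and would feed the fused features to a decoder that predicts the Gaussian deformations.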