GaussianCube: Structuring Gaussian Splatting using Optimal Transport for 3D Generative Modeling

1University of Science and Technology of China, 2Tsinghua University, 3Microsoft Research Asia


3D Gaussian Splatting (GS) has achieved considerable improvement over Neural Radiance Fields in terms of 3D fitting fidelity and rendering speed. However, this unstructured representation with scattered Gaussians poses a significant challenge for generative modeling. To address the problem, we introduce GaussianCube, a structured GS representation that is both powerful and efficient for generative modeling. We achieve this by first proposing a modified densification-constrained GS fitting algorithm, which yields high-quality fitting results using a fixed number of free Gaussians, and then rearranging the Gaussians into a predefined voxel grid via Optimal Transport. The structured grid representation allows us to use a standard 3D U-Net as our backbone in diffusion generative modeling without elaborate designs. Extensive experiments conducted on ShapeNet and OmniObject3D show that our model achieves state-of-the-art generation results both qualitatively and quantitatively, underscoring the potential of GaussianCube as a powerful and versatile 3D representation.


Our framework comprises two main stages: representation construction and 3D diffusion. In the representation construction stage, given multi-view renderings of a 3D asset, we perform densification-constrained fitting to obtain a constant number of 3D Gaussians. Subsequently, the Gaussians are voxelized into a GaussianCube via Optimal Transport. In the 3D diffusion stage, our 3D diffusion model is trained to generate a GaussianCube from Gaussian noise.

Densification-constrained Fitting

First, we perform densification-constrained fitting to yield a fixed number of Gaussians. Specifically, if the current iteration has N_c Gaussians, of which N_d are candidates for densification, we cap the total at a predefined maximum of N_max Gaussians. When N_d > N_max - N_c, we densify only the N_max - N_c candidates with the largest positional gradients. Otherwise, all N_d candidates are densified, as in the original Gaussian Splatting.
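The selection rule can be sketched in a few lines of NumPy. This is an illustrative sketch, not the released implementation: the function name `select_for_densification` and its arguments are hypothetical, and it assumes that each densified Gaussian increases the total count by exactly one.

```python
import numpy as np


def select_for_densification(grad_norms, candidate_mask, n_max):
    """Pick which candidate Gaussians to densify under a budget of n_max.

    grad_norms:     (N_c,) positional gradient magnitude of every Gaussian.
    candidate_mask: (N_c,) boolean mask marking the N_d densification candidates.
    n_max:          predefined maximum number of Gaussians (N_max).

    Assumption: densifying one Gaussian adds exactly one Gaussian to the set.
    """
    n_current = grad_norms.shape[0]            # N_c
    candidates = np.flatnonzero(candidate_mask)  # indices of the N_d candidates
    budget = max(n_max - n_current, 0)         # N_max - N_c

    # Enough room for everyone: densify all candidates, as in the original GS.
    if candidates.size <= budget:
        return candidates

    # Otherwise keep only the `budget` candidates with the largest gradients.
    order = np.argsort(grad_norms[candidates])[::-1]
    return np.sort(candidates[order[:budget]])
```

For example, with five Gaussians, four candidates, and `n_max=7`, only the two candidates with the largest gradients are returned. Once `N_c` reaches `n_max`, the function returns an empty selection.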

Gaussian voxelization via Optimal Transport

We then employ Optimal Transport to organize the resulting Gaussians into a predefined voxel grid. Intuitively, we aim to "move" each Gaussian into a voxel while preserving the geometric relations among Gaussians as much as possible. We therefore formulate this as an Optimal Transport problem between the Gaussians' spatial positions and the centers of the voxel grid.
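When the number of Gaussians equals the number of voxels, one way to make this concrete is to treat it as a one-to-one assignment problem. The sketch below does that with SciPy's `linear_sum_assignment`. Several choices here are assumptions for illustration rather than details taken from the text above: the squared Euclidean cost, the grid spanning [-1, 1]^3, and the helper names `voxel_centers` and `voxelize_gaussians`. The dense cost matrix also limits this sketch to small grids.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment


def voxel_centers(resolution):
    """Centers of a resolution^3 grid over [-1, 1]^3 (assumed extent)."""
    axis = (np.arange(resolution) + 0.5) / resolution * 2.0 - 1.0
    grid = np.meshgrid(axis, axis, axis, indexing="ij")
    return np.stack(grid, axis=-1).reshape(-1, 3)


def voxelize_gaussians(positions, features, resolution):
    """Place each Gaussian in a unique voxel, minimizing total transport cost.

    positions: (N, 3) Gaussian centers, with N == resolution ** 3.
    features:  (N, C) per-Gaussian attributes to store in the grid.
    Returns the (R, R, R, C) feature grid and, for each Gaussian,
    the flat index of the voxel it was assigned to.
    """
    n = positions.shape[0]
    if n != resolution ** 3:
        raise ValueError("number of Gaussians must equal resolution ** 3")

    centers = voxel_centers(resolution)
    # Pairwise squared Euclidean distances (assumed cost), shape (N, N).
    diff = positions[:, None, :] - centers[None, :, :]
    cost = (diff ** 2).sum(axis=-1)

    # One-to-one matching between Gaussians (rows) and voxels (columns).
    rows, cols = linear_sum_assignment(cost)
    assignment = np.empty(n, dtype=np.int64)
    assignment[rows] = cols

    # Scatter each Gaussian's features into its assigned voxel.
    flat = np.empty_like(features)
    flat[assignment] = features
    cube = flat.reshape(resolution, resolution, resolution, -1)
    return cube, assignment
```

Because the matching is one-to-one, every voxel holds exactly one Gaussian, so the result is a dense grid. Such a grid is the kind of input a standard 3D U-Net can consume.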

Class-conditioned Generation Results


Unconditional Generation Results

ShapeNet Car

ShapeNet Chair


      title={GaussianCube: Structuring Gaussian Splatting using Optimal Transport for 3D Generative Modeling},
      author={Zhang, Bowen and Cheng, Yiji and Yang, Jiaolong and Wang, Chunyu and Zhao, Feng and Tang, Yansong and Chen, Dong and Guo, Baining},
      journal={arXiv preprint arXiv:2403.19655},