Gen-3Diffusion: Realistic Image-to-3D Generation via 2D & 3D Diffusion Synergy

Project Page | Paper

Yuxuan Xue¹, Xianghui Xie¹,², Riccardo Marin¹, Gerard Pons-Moll¹,²

¹Real Virtual Human Group @ University of Tübingen & Tübingen AI Center
²Max Planck Institute for Informatics, Saarland Informatics Campus

News 🚩

[2024/12/9] Inference code released.
[2024/12/9] The Gen-3Diffusion paper is available on arXiv.

Key Insight 🙌

2D foundation models are powerful, but their output lacks 3D consistency!
3D generative models can reconstruct a 3D representation, but they generalize poorly!
How can 2D foundation models be combined with 3D generative models?
- Both are diffusion-based generative models => they can be synchronized at each diffusion step
- The 2D foundation model helps 3D generation => it provides a strong prior about 3D shape
- The 3D representation guides 2D diffusion sampling => the rendered output of the 3D reconstruction is used for reverse sampling, where 3D consistency is guaranteed
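
The synchronization described above can be illustrated with a toy sampling loop. This is only a sketch under strong assumptions, not the repository's implementation: `denoise_2d`, `reconstruct_3d`, and `render` are hypothetical placeholders for the multi-view diffusion model, the 3D reconstructor, and the renderer, and the update rule is a plain deterministic DDIM-style step chosen for brevity.

```python
import numpy as np

def denoise_2d(x_t):
    # Hypothetical placeholder for the 2D multi-view diffusion model:
    # returns a per-view estimate of the clean images.
    return np.tanh(x_t)

def reconstruct_3d(x0_views):
    # Hypothetical placeholder for the 3D reconstructor:
    # fuses all view estimates into one shared 3D state.
    return x0_views.mean(axis=0)

def render(state_3d, num_views):
    # Hypothetical placeholder renderer. In this toy every view is a copy
    # of the shared state; a real renderer would produce one image per camera.
    return np.repeat(state_3d[None], num_views, axis=0)

def joint_sample(num_views=4, size=8, steps=10, seed=0):
    rng = np.random.default_rng(seed)
    x_t = rng.standard_normal((num_views, size, size))  # start from noise
    abar = np.linspace(0.01, 0.99, steps)  # toy cumulative-alpha schedule
    for i in range(steps):
        x0_2d = denoise_2d(x_t)                # 2D prior proposes clean views
        state_3d = reconstruct_3d(x0_2d)       # lift the proposal to 3D
        x0_3d = render(state_3d, num_views)    # 3D-consistent clean views
        a_t = abar[i]
        a_prev = abar[i + 1] if i + 1 < steps else 1.0
        # The reverse step uses the rendered views as the clean estimate.
        eps = (x_t - np.sqrt(a_t) * x0_3d) / np.sqrt(1.0 - a_t)
        x_t = np.sqrt(a_prev) * x0_3d + np.sqrt(1.0 - a_prev) * eps
    return x_t

views = joint_sample()
print(views.shape)
```

Because every reverse step is driven by views rendered from a single shared state, the final samples in this toy agree across views, which mirrors the 3D-consistency argument in the list above.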

Difference to Human-3Diffusion

We extend the joint 2D-3D diffusion idea to the reconstruction of everyday objects.
We adopt a relative camera system in Gen-3Diffusion because the front view of an object is ambiguous. Humans have a clear front view, so we used an absolute camera system in Human-3Diffusion.
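
To make the distinction concrete, here is a minimal sketch of what a relative camera system means, assuming 4x4 camera-to-world pose matrices. The pose convention and the helper names `rotation_y` and `to_relative` are assumptions for illustration and may differ from the repository's code.

```python
import numpy as np

def rotation_y(deg, radius=1.5):
    # Hypothetical helper: camera-to-world pose of a camera orbiting the
    # origin around the y-axis at the given azimuth (degrees).
    a = np.deg2rad(deg)
    pose = np.eye(4)
    pose[:3, :3] = [[np.cos(a), 0.0, np.sin(a)],
                    [0.0, 1.0, 0.0],
                    [-np.sin(a), 0.0, np.cos(a)]]
    pose[:3, 3] = pose[:3, :3] @ np.array([0.0, 0.0, radius])
    return pose

def to_relative(poses_c2w, ref_index=0):
    # Express every pose in the frame of the reference (input) camera.
    ref_inv = np.linalg.inv(poses_c2w[ref_index])
    return ref_inv @ poses_c2w  # broadcasts over the leading view axis

poses = np.stack([rotation_y(d) for d in (30.0, 120.0, 210.0, 300.0)])
relative = to_relative(poses)

# Rotating the whole rig (i.e. picking a different "front") changes the
# absolute poses but leaves the relative poses untouched.
spun = np.stack([rotation_y(77.0, radius=0.0) @ p for p in poses])
print(np.allclose(to_relative(spun), relative))
```

With relative poses the input view always maps to the identity, so the model never has to decide which side of an object counts as its front.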

Install

Gen-3Diffusion uses the same Conda environment as Human-3Diffusion. Please skip this step if you have already installed it.

# Conda environment
conda create -n gen3diffusion python=3.10
conda activate gen3diffusion
pip install torch==2.1.0 torchvision==0.16.0 torchaudio==2.1.0 --index-url https://download.pytorch.org/whl/cu121
pip install xformers==0.0.22.post4 --index-url https://download.pytorch.org/whl/cu121

# Gaussian Opacity Fields
git clone https://github.com/YuxuanSnow/gaussian-opacity-fields.git
cd gaussian-opacity-fields && pip install submodules/diff-gaussian-rasterization
pip install submodules/simple-knn/ && cd ..
export CPATH=/usr/local/cuda-12.1/targets/x86_64-linux/include:$CPATH

# Dependencies
pip install -r requirements.txt

# TSDF Fusion (Mesh extraction) Dependencies
pip install --user numpy opencv-python scikit-image numba
pip install --user pycuda
pip install scipy==1.11
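
After installation, a quick check that the pinned versions above are the ones actually present can save debugging time. The following is a small sketch using only the standard library; the helper `check_versions` is hypothetical and not part of the repository.

```python
from importlib import metadata

# Versions pinned in the install commands above.
EXPECTED = {
    "torch": "2.1.0",
    "torchvision": "0.16.0",
    "torchaudio": "2.1.0",
    "xformers": "0.0.22.post4",
}

def check_versions(expected, get_version=metadata.version):
    """Return {package: (installed_version_or_None, matches_expected)}."""
    report = {}
    for name, want in expected.items():
        try:
            have = get_version(name)
        except metadata.PackageNotFoundError:
            have = None
        # Wheels from a CUDA-specific index may carry a local suffix such
        # as "+cu121"; compare only the part before the "+".
        ok = have is not None and have.split("+")[0] == want
        report[name] = (have, ok)
    return report

if __name__ == "__main__":
    for name, (have, ok) in check_versions(EXPECTED).items():
        status = "ok" if ok else "MISMATCH"
        print(f"{name}: installed={have} expected={EXPECTED[name]} [{status}]")
```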

Pretrained Weights

Our pretrained weights can be downloaded from Hugging Face.

mkdir checkpoints_obj && cd checkpoints_obj
wget https://huggingface.co/yuxuanx/gen3diffusion/resolve/main/model.safetensors
wget https://huggingface.co/yuxuanx/gen3diffusion/resolve/main/model_1.safetensors
wget https://huggingface.co/yuxuanx/gen3diffusion/resolve/main/pifuhd.pt
cd ..

The avatar reconstruction module is the same as in Human-3Diffusion. Please skip this step if you have already installed Human-3Diffusion.

mkdir checkpoints_avatar && cd checkpoints_avatar
wget https://huggingface.co/yuxuanx/human3diffusion/resolve/main/model.safetensors
wget https://huggingface.co/yuxuanx/human3diffusion/resolve/main/model_1.safetensors
wget https://huggingface.co/yuxuanx/human3diffusion/resolve/main/pifuhd.pt
cd ..
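
Interrupted downloads are a common source of confusing load errors, so it can help to confirm that each checkpoint folder holds the three files fetched above before running inference. This is a sketch only; `missing_checkpoints` is a hypothetical helper, and it checks for presence and non-zero size, not file integrity.

```python
from pathlib import Path

# File names taken from the wget commands above.
EXPECTED_FILES = ("model.safetensors", "model_1.safetensors", "pifuhd.pt")

def missing_checkpoints(folder, expected=EXPECTED_FILES):
    """List expected files that are absent or empty in `folder`."""
    folder = Path(folder)
    missing = []
    for name in expected:
        path = folder / name
        if not path.is_file() or path.stat().st_size == 0:
            missing.append(name)
    return missing

if __name__ == "__main__":
    for folder in ("checkpoints_obj", "checkpoints_avatar"):
        missing = missing_checkpoints(folder)
        print(folder, "complete" if not missing else f"missing: {missing}")
```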

Inference

# given one image of an object, generate a 3D-GS object
# the subject should be centered in a square image; please crop accordingly
# recentering plays a huge role in object reconstruction; please adjust the recentering if the reconstruction doesn't work well
python infer.py --test_imgs test_imgs_obj --output output_obj --checkpoints checkpoints_obj

# given generated 3D-GS, perform TSDF mesh extraction
python infer_mesh.py --test_imgs test_imgs_obj --output output_obj --checkpoints checkpoints_obj --mesh_quality high

# given one image of a human, generate a 3D-GS avatar
# the subject should be centered in a square image; please crop accordingly
python infer.py --test_imgs test_imgs_avatar --output output_avatar --checkpoints checkpoints_avatar

# given generated 3D-GS, perform TSDF mesh extraction
python infer_mesh.py --test_imgs test_imgs_avatar --output output_avatar --checkpoints checkpoints_avatar --mesh_quality high
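
Since the comments above stress that the subject must be centered in a square image, here is one way such a preprocessing step could look. It is a sketch under assumptions: the input is an RGBA array whose alpha channel marks the foreground, and `recenter_square` with its `border_ratio` parameter is a hypothetical helper, not the recentering used inside `infer.py`. Resizing to the model's input resolution is left out.

```python
import numpy as np

def recenter_square(rgba, border_ratio=0.2):
    """Crop to the foreground and paste it centered on a square canvas.

    rgba: H x W x 4 array; alpha > 0 marks the subject.
    border_ratio: fraction of the canvas side left as empty margin in total.
    """
    ys, xs = np.nonzero(rgba[..., 3] > 0)
    if ys.size == 0:
        raise ValueError("no foreground pixels found in the alpha channel")
    # Tight bounding box around the subject.
    y0, y1 = ys.min(), ys.max() + 1
    x0, x1 = xs.min(), xs.max() + 1
    crop = rgba[y0:y1, x0:x1]
    h, w = crop.shape[:2]
    # Square canvas large enough for the longer side plus the margin.
    side = int(round(max(h, w) / (1.0 - border_ratio)))
    canvas = np.zeros((side, side, 4), dtype=rgba.dtype)
    top, left = (side - h) // 2, (side - w) // 2
    canvas[top:top + h, left:left + w] = crop
    return canvas

# Usage: an off-center 4 x 2 subject in a 10 x 20 image.
image = np.zeros((10, 20, 4), dtype=np.uint8)
image[2:6, 3:5] = 255
print(recenter_square(image, border_ratio=0.5).shape)
```

A larger `border_ratio` makes the subject smaller within the frame, which is one knob to try when a reconstruction looks cropped or distorted.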

Training

# Object training
# render multiview RGB images from the object mesh (requires Blender 4.1.1)
blender-4.1.1-linux-x64/blender -b -P render_bpy_objaverse.py -- --subject 000-001/112c059282cf4511a01fd27211edcae8

# optional: pretrain Multiview Diffusion (if your data is very different from the pretraining distribution)
accelerate launch train_MultiviewDiffusion_diffusion.py

# optional: pretrain Multiview Reconstruction with Diffusion (if your data is very different from the pretraining distribution)
accelerate launch train_MultiviewReconstructor_diffusion.py

# Core training script of joint 2D and 3D diffusion training.
accelerate launch train_MVDMVR_joint.py

# Avatar training
# please refer to https://github.com/YuxuanSnow/Human3Diffusion

Citation ✍️

@article{xue2024gen3diffusion,
  title     = {{Gen-3Diffusion: Realistic Image-to-3D Generation via 2D \& 3D Diffusion Synergy}},
  author    = {Xue, Yuxuan and Xie, Xianghui and Marin, Riccardo and Pons-Moll, Gerard},
  journal   = {arXiv},
  year      = {2024},
}

@inproceedings{xue2024human3diffusion,
  title     = {{Human 3Diffusion: Realistic Avatar Creation via Explicit 3D Consistent Diffusion Models}},
  author    = {Xue, Yuxuan and Xie, Xianghui and Marin, Riccardo and Pons-Moll, Gerard},
  booktitle = {NeurIPS 2024},
  year      = {2024},
}

