AlignMiF: Geometry-Aligned Multimodal Implicit Field for LiDAR-Camera Joint Synthesis

[CVPR 2024 - Highlight]

Paper PDF

Tao Tang · Guangrun Wang · Yixing Lao · Peng Chen · Jie Liu · Liang Lin · Kaicheng Yu · Xiaodan Liang

This work thoroughly investigates and validates the misalignment issue in multimodal NeRF and proposes AlignMiF, a geometrically aligned multimodal implicit field.

  1. We perform comprehensive analyses of multimodal learning in NeRF, identifying the modality misalignment issue.
  2. We propose AlignMiF, with GAA and SGI modules, to address the misalignment issue by aligning the consistent coarse geometry of different modalities while preserving their unique details (a conceptual sketch follows this list).
  3. We demonstrate the effectiveness of our method quantitatively and qualitatively through extensive experiments conducted on multiple datasets and scenes.
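
The second point above is the core architectural idea: the modalities share the coarse levels of the geometry representation while keeping modality-specific fine levels and output heads. The toy PyTorch snippet below is only a conceptual sketch of that sharing scheme under assumed level counts, feature sizes, and module names; it is not the official AlignMiF implementation.

import torch
import torch.nn as nn

class ToyAlignedField(nn.Module):
    """Illustrative only: shared coarse levels plus modality-specific fine levels."""
    def __init__(self, num_levels=16, shared_levels=8, feat_dim=2):
        super().__init__()
        # Tiny per-level linear encoders stand in for a multi-resolution hash grid.
        self.shared = nn.ModuleList([nn.Linear(3, feat_dim) for _ in range(shared_levels)])
        n_fine = num_levels - shared_levels
        self.fine = nn.ModuleDict({
            "lidar": nn.ModuleList([nn.Linear(3, feat_dim) for _ in range(n_fine)]),
            "rgb": nn.ModuleList([nn.Linear(3, feat_dim) for _ in range(n_fine)]),
        })
        self.lidar_head = nn.Linear(num_levels * feat_dim, 2)  # e.g. intensity + ray drop
        self.rgb_head = nn.Linear(num_levels * feat_dim, 3)    # RGB

    def encode(self, x, modality):
        # Coarse features are aligned across modalities; fine features stay separate.
        feats = [m(x) for m in self.shared] + [m(x) for m in self.fine[modality]]
        return torch.cat(feats, dim=-1)

    def forward(self, x):
        return self.lidar_head(self.encode(x, "lidar")), self.rgb_head(self.encode(x, "rgb"))

xyz = torch.rand(1024, 3)
lidar_out, rgb_out = ToyAlignedField()(xyz)  # shapes: (1024, 2) and (1024, 3)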

Note

The current code is not fully polished yet. Experimental attempts and code belonging to future projects have been removed from the codebase, but some areas may still not be properly organized or cleaned up, which could result in errors. Please keep this in mind when using the code.

Dataset Preprocess

KITTI-360 dataset

Prepare KITTI-360 dataset:

# Follow: https://www.cvlibs.net/datasets/kitti-360/documentation.php
$ mkdir -p data/kitti360
$ ln -s ${HOME}/data/KITTI-360 data/kitti360/KITTI-360

Run KITTI-360 dataset preprocessing:

# Currently, the scene and frame ids are hard-coded in the script.
# Generate train range images
python preprocess/generate_train_rangeview.py --dataset kitti360
# Generate jsons
python preprocess/kitti360_to_nerf.py
# Calculate center pose and bound (optional; you can directly use our config)
python preprocess/cal_centerpose_bound.py
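
generate_train_rangeview.py converts each LiDAR scan into a panoramic range image used as training supervision. The snippet below is only a generic spherical-projection illustration of that idea, not the repo's script; the image size and vertical field of view are placeholder values for a 64-beam sensor.

import numpy as np

def to_range_image(points, H=66, W=1030, fov_up_deg=2.0, fov_down_deg=-24.8):
    # points: (N, 3) LiDAR points in the sensor frame; H, W and FOV are placeholders.
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    depth = np.linalg.norm(points, axis=1)
    yaw = np.arctan2(y, x)                                  # azimuth in [-pi, pi]
    pitch = np.arcsin(z / np.maximum(depth, 1e-8))          # elevation
    fov_up, fov_down = np.radians(fov_up_deg), np.radians(fov_down_deg)
    u = np.clip(np.floor(0.5 * (1.0 - yaw / np.pi) * W), 0, W - 1).astype(np.int64)
    v = np.clip(np.floor((fov_up - pitch) / (fov_up - fov_down) * H), 0, H - 1).astype(np.int64)
    img = np.zeros((H, W), dtype=np.float32)
    img[v, u] = depth                                       # simplified: no nearest-depth handling
    return img

img = to_range_image(np.random.uniform(-20, 20, size=(1000, 3)))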

Waymo dataset

Prepare Waymo dataset:

$ mkdir -p data/waymo/waymo_v120
$ ln -s ${HOME}/data/waymo data/waymo

Run Waymo dataset preprocessing:

# Currently, the scene and frame ids are hard-coded in the script.
# Extract images and lidar with pose from tfrecords
python preprocess/waymo_extract_from_tfrecord.py
# Generate train range images
python preprocess/generate_train_rangeview.py --dataset waymo
# Generate jsons
python preprocess/waymo_to_nerf.py
# Calculate center pose and bound (optional; you can directly use our config)
python preprocess/cal_centerpose_bound.py --dataset waymo
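
cal_centerpose_bound.py derives a scene center and bound used to normalize poses into the NeRF volume, which is why the step is optional when you reuse the values already shipped in our configs. A rough, hypothetical sketch of that kind of computation (not the actual script; the margin factor and translation-only centering are assumptions) looks like this:

import numpy as np

def center_and_bound(poses, margin=1.1):
    # poses: (N, 4, 4) sensor-to-world matrices; translations sit in the last column.
    centers = poses[:, :3, 3]
    center = centers.mean(axis=0)                            # scene center
    radius = np.linalg.norm(centers - center, axis=1).max()  # farthest pose from the center
    return center, margin * radius                           # bound with a small margin

poses = np.tile(np.eye(4), (10, 1, 1))
poses[:, :3, 3] = np.random.randn(10, 3)
print(center_and_bound(poses))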

Dependencies

conda create -n ngp python=3.8
conda activate ngp

# Torch
pip install -r requirements_torch.txt

# Others
pip install -r requirements.txt

# tiny-cuda-nn (option 1)
# First, make sure your CUDA version matches the one PyTorch was compiled with
pip install git+https://github.com/NVlabs/tiny-cuda-nn/#subdirectory=bindings/torch

# tiny-cuda-nn (option 2)
# Useful when the GitHub connection is slow
export CUDA_HOME=/usr/local/cuda
git clone https://github.com/NVlabs/tiny-cuda-nn.git
cd tiny-cuda-nn
git submodule update --init --recursive
cd bindings/torch
python setup.py install
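
# Optional sanity check for the bindings installed above (either option), run from the
# project root; this only verifies that the tinycudann module imports on a CUDA machine.
python -c "import tinycudann as tcnn; print('tiny-cuda-nn bindings OK')"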

# camtools (assumes camtools is checked out in ../camtools)
pushd ../camtools && pip install -e . && popd

# ChamferDistance
git submodule update --init --recursive

# Build extensions (optional; only needed if not using tcnn)
pip install lidarnerf/raymarching
pip install lidarnerf/gridencoder
pip install lidarnerf/shencoder
pip install lidarnerf/freqencoder


# Install lidarnerf
pip install -e .

# Verify installation
python -c "import lidarnerf; print(lidarnerf.__version__)"

Run

# single modality
python main_alignmif.py -L --workspace kitti360-1908/lidar --enable_lidar --config configs/kitti360_1908.txt
python main_alignmif.py -L --workspace kitti360-1908/rgb --enable_rgb --config configs/kitti360_1908.txt
# multimodality implicit fusion
python main_alignmif.py -L --workspace kitti360-1908/mif --enable_rgb --enable_lidar --config configs/kitti360_1908.txt --network mif
# alignmif
## The SGI module employs a pre-trained lidar-nerf model as initialization
## 1. First, train a lidar-nerf model
python main_alignmif.py -L --workspace kitti360-1908/lidar --enable_lidar --config configs/kitti360_1908.txt
## 2. Then, train alignmif
python main_alignmif.py -L --workspace kitti360-1908/alignmif --enable_lidar --enable_rgb --config configs/kitti360_1908.txt --ckpt kitti360-1908/lidar/checkpoints/alignmif_ep0500.pth --activate_levels 8 --network alignmif
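
The --ckpt path above is the final checkpoint written by the first-stage LiDAR run. If you want to double-check the file before passing it to alignmif, an optional inspection along these lines works; the exact dictionary layout depends on your training run:

import torch

# Path copied from the alignmif command above; adjust to your own workspace.
ckpt = torch.load("kitti360-1908/lidar/checkpoints/alignmif_ep0500.pth", map_location="cpu")
print(type(ckpt), list(ckpt.keys()) if isinstance(ckpt, dict) else "non-dict checkpoint")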

Pre-trained Models

You can download our pre-trained models here (TODO).

Citation

If you find our code or paper helpful, please consider citing:

@article{tang2024alignmif,
  title={AlignMiF: Geometry-Aligned Multimodal Implicit Field for LiDAR-Camera Joint Synthesis},
  author={Tang, Tao and Wang, Guangrun and Lao, Yixing and Chen, Peng and Liu, Jie and Lin, Liang and Yu, Kaicheng and Liang, Xiaodan},
  journal={arXiv preprint arXiv:2402.17483},
  year={2024}
}

Acknowledgments

This code is built on top of the super-useful lidar-nerf and torch-ngp implementations.

@article{tao2023lidar,
  title={LiDAR-NeRF: Novel LiDAR View Synthesis via Neural Radiance Fields},
  author={Tang, Tao and Gao, Longfei and Wang, Guangrun and Lao, Yixing and Chen, Peng and Zhao, Hengshuang and Hao, Dayang and Liang, Xiaodan and Salzmann, Mathieu and Yu, Kaicheng},
  journal={arXiv preprint arXiv:2304.10406},
  year={2023}
}
@misc{torch-ngp,
    Author = {Jiaxiang Tang},
    Year = {2022},
    Note = {https://github.com/ashawkey/torch-ngp},
    Title = {Torch-ngp: a PyTorch implementation of instant-ngp}
}