Precision-Enhanced Human-Object Contact Detection via Depth-Aware Perspective Interaction and Object Texture Restoration
Yuxiao Wang, Wenpeng Neng, Zhenao Wei, Yu Lei, WeiYing Xue, Nan Zhuang, Yanwu Xu, Xinyu Jiang, Qi Liu
Please first install the following environment:
- python 3.8
- pytorch 1.11.0 (cu113)
- torchvision 0.12.0 (cu113)
Then install the remaining dependencies:
pip3 install -r requirements.txt
- Data: download the HOT dataset from the project website and unzip it to /path/to/dataset. Then link it into the project from the project root:
mkdir -p ./data
ln -s /path/to/dataset ./data/HOT
The directory structure is as follows:
Project/
├── data/
│   ├── HOT/
│   │   ├── HOT-Annotated/
│   │   │   ├── images/
│   │   │   ├── annotations/
│   │   │   └── ...
│   │   └── HOT-Generated/
│   │       ├── images/
│   │       ├── annotations/
│   │       └── ...
│   ├── hot_train.odgt
│   ├── hot_test.odgt
│   └── ...
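The hot_train.odgt and hot_test.odgt annotation lists typically store one JSON record per line. A minimal sketch for inspecting them (the field names inside each record are not documented here, so print a sample to see what is available):

```python
import json

# Each line of an .odgt file is assumed to be a standalone JSON record
# describing one image and its annotation paths.
def load_odgt(path):
    with open(path, "r") as f:
        return [json.loads(line) for line in f if line.strip()]

if __name__ == "__main__":
    records = load_odgt("./data/hot_train.odgt")
    print(f"{len(records)} training records")
    print(records[0])  # inspect the fields of one sample
```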
We use the LaMa model in combination with the human mask to reconstruct the occluded object information.
Please install the environment according to the official LaMa repository, move infer.py from this project's scripts folder into the LaMa project folder, and run it to save the inpainted images.
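If you generate the inpainted images yourself, LaMa's prediction script expects every image to be accompanied by a mask file sharing its name plus a _mask suffix (check the LaMa README for the exact naming; infer.py may already handle this for you). A rough sketch of how the human masks could be dilated and laid out in that format, with all paths being hypothetical placeholders:

```python
import os
import cv2
import numpy as np

IMAGE_DIR = "./data/HOT/HOT-Annotated/images"  # hypothetical local layout
MASK_DIR = "./human_masks"                     # binary person masks, one per image (assumed)
OUT_DIR = "./lama_inputs"                      # folder later passed to LaMa's prediction script

os.makedirs(OUT_DIR, exist_ok=True)
kernel = np.ones((15, 15), np.uint8)  # slight dilation so the mask fully covers the person

for name in os.listdir(IMAGE_DIR):
    stem, _ = os.path.splitext(name)
    image = cv2.imread(os.path.join(IMAGE_DIR, name))
    mask = cv2.imread(os.path.join(MASK_DIR, stem + ".png"), cv2.IMREAD_GRAYSCALE)
    if image is None or mask is None:
        continue
    mask = cv2.dilate((mask > 127).astype(np.uint8) * 255, kernel)
    # Write the image and its mask side by side with the _mask suffix
    # so that LaMa can pair them up.
    cv2.imwrite(os.path.join(OUT_DIR, stem + ".png"), image)
    cv2.imwrite(os.path.join(OUT_DIR, stem + "_mask.png"), mask)
```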
To help researchers get started quickly, we also provide the inpainted images via an external link. Simply download and unzip them into the ./data/HOT/HOT-Annotated/inpainting (respectively ./data/HOT/HOT-Generated/inpainting) directory. Please click Inpainting Images to download them.
ZoeDepth is used to generate the depth maps.
Please install the environment according to the official ZoeDepth instructions and save the generated depth maps to the ./data/HOT/HOT-Annotated/depth (respectively ./data/HOT/HOT-Generated/depth) directory.
Please note that, to keep the original image and the inpainted image in the same perspective, the two images need to be spliced together and fed to the ZoeDepth model as a single input.
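A minimal sketch of this splicing step, assuming the torch.hub entry point from the ZoeDepth README (isl-org/ZoeDepth, model ZoeD_N) and simple side-by-side concatenation; file names are hypothetical and the preprocessing used in this repository may differ:

```python
import numpy as np
import torch
from PIL import Image

# Load ZoeDepth via torch.hub (entry point taken from the ZoeDepth README).
zoe = torch.hub.load("isl-org/ZoeDepth", "ZoeD_N", pretrained=True)
zoe = zoe.to("cuda" if torch.cuda.is_available() else "cpu").eval()

original = Image.open("example.jpg").convert("RGB")             # hypothetical file names
inpainted = Image.open("example_inpainted.jpg").convert("RGB")

# Concatenate the original and inpainted images side by side so that
# ZoeDepth processes both under a single, shared perspective.
w, h = original.size
pair = Image.new("RGB", (2 * w, h))
pair.paste(original, (0, 0))
pair.paste(inpainted, (w, 0))

depth = zoe.infer_pil(pair)                 # metric depth map of shape (h, 2*w)
depth_orig, depth_inpaint = depth[:, :w], depth[:, w:]

np.save("example_depth.npy", depth_orig)
np.save("example_inpainted_depth.npy", depth_inpaint)
```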
To help researchers get started quickly, we also provide the depth maps via an external link. Simply download and unzip them into the ./data/HOT/HOT-Annotated/depth (respectively ./data/HOT/HOT-Generated/depth) directory. Please click Depth Map to download them.
python train.py --gpus 0,1,2,3 --cfg config/hot-resnet50dilated-c1.yaml
To choose which GPUs to use, you can either do --gpus 0-7 or --gpus 0,2,4,6.
You can adjust the training process by changing the parameters in config/hot-resnet50dilated-c1.yaml.
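You can edit the YAML by hand, or load and rewrite it programmatically; a small sketch with PyYAML, where the key names are hypothetical placeholders (open the config file to see the real ones):

```python
import yaml

with open("config/hot-resnet50dilated-c1.yaml") as f:
    cfg = yaml.safe_load(f)

# Hypothetical keys for illustration only; inspect the YAML for the real names.
cfg["TRAIN"]["num_epoch"] = 30
cfg["TRAIN"]["batch_size_per_gpu"] = 4

with open("config/hot-resnet50dilated-c1-custom.yaml", "w") as f:
    yaml.safe_dump(cfg, f)
```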
Use the following command to evaluate all epochs on the validation set:
sh ./val.sh
To evaluate a specific epoch on the test set, use the following command:
sh ./test.sh
Note: remember to change the --epoch parameter.
After evaluating the model, use the following command to view the results. Note that it displays the results of all epochs: if you first evaluate the validation set and then evaluate the test set for a specific epoch, the displayed numbers are validation results for every epoch except that specified epoch, which shows test results.
sh ./show_loss.sh
If you want to visualize the experimental results of the test set, use the following command:
sh ./vis_results.sh
Note: remember to change the --epoch parameter.
@article{wang2024precision,
title={Precision-Enhanced Human-Object Contact Detection via Depth-Aware Perspective Interaction and Object Texture Restoration},
author={Wang, Yuxiao and Neng, Wenpeng and Wei, Zhenao and Lei, Yu and Xue, Weiying and Zhuang, Nan and Xu, Yanwu and Jiang, Xinyu and Liu, Qi},
journal={arXiv preprint arXiv:2412.09920},
year={2024}
}
For the HOT model and dataset proposed by Chen et al., please click HOT for details.