[Feature] Add 4d association method (#16)
* add 4dag method
* rename fourd to bottom up, add metric and alignment, change name manner
* rename some files and variables
* fix a bug
* delete Camera class in association
* change configs
* modify triangulation pipeline
* convert data format from pkl to json
* align triangulate method, align kps convention
* delete fourdag triangulate, add fourdag19 convention pipeline
* change some names
* resolve conflict
* resolve some comments
* run pre-commit
* align result
* refactor code
* refactor triangulator and optimization
* resolve comments
* pre-commit
* change limb info
* fix bug
* add stringdoc
* rename joint to kps
* add yapf
* pre-commit
* add docstring
* refactor
* add limb info json file
* fix bug
* delete debug call
* add readme
* rephase term class
* rephase associate
* add cloud file and path
* process fourdag seq5
* resolve comments
Showing 42 changed files with 4,101 additions and 12 deletions.
@@ -0,0 +1,106 @@
# 4D Association Graph for Realtime Multi-person Motion Capture Using Multiple Video Cameras

Note: as a Python variable name cannot start with a number, we refer to this method as `FourDAG` in the following text and code.

- [Introduction](#introduction)
- [Prepare limb information and datasets](#prepare-limb-information-and-datasets)
- [Results](#results)
  - [Campus](#campus)
  - [Shelf](#shelf)
  - [FourDAG](#fourdag-1)

## Introduction

We provide the config files for FourDAG: [4D Association Graph for Realtime Multi-person Motion Capture Using Multiple Video Cameras](https://arxiv.org/abs/2002.12625).

[Official Implementation](https://github.com/zhangyux15/4d_association)

```BibTeX
@inproceedings{Zhang20204DAG,
  title={4D Association Graph for Realtime Multi-Person Motion Capture Using Multiple Video Cameras},
  author={Yuxiang Zhang and Liang An and Tao Yu and Xiu Li and Kun Li and Yebin Liu},
  booktitle={IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  year={2020},
  pages={1321-1330}
}
```
## Prepare limb information and datasets

- **Prepare limb information**:

  ```
  sh scripts/download_weight.sh
  ```

  You can find the perception models in the `weight` directory.

- **Prepare the datasets**:

  You can download the Shelf, Campus or FourDAG datasets and convert the original data to our unified meta-data format. Since running a converter takes a long time, we have done it for you: please download the compressed zip file of converted meta-data from [here](../../docs/en/dataset_preparation.md) and place the meta-data under `ROOT/xrmocap_data/DATASET`.

  The final file structure would be like:

  ```text
  xrmocap
  ├── xrmocap
  ├── docs
  ├── tools
  ├── configs
  ├── weight
  │   └── limb_info.json
  └── xrmocap_data
      ├── CampusSeq1
      ├── Shelf
      │   ├── Camera0
      │   ├── ...
      │   ├── Camera4
      │   └── xrmocap_meta_testset
      └── FourDAG
          ├── seq2
          ├── seq4
          ├── seq5
          ├── xrmocap_meta_seq2
          ├── xrmocap_meta_seq4
          └── xrmocap_meta_seq5
  ```

  You can download just one of the Shelf, Campus and FourDAG datasets.
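After downloading, a short script can sanity-check the layout. The helper below is an illustrative sketch, not part of xrmocap; it assumes all three datasets plus the weight file are present, so trim the list to whatever you actually downloaded:

```python
from pathlib import Path

# Hypothetical helper (not part of xrmocap): verify the layout shown above.
def check_layout(root):
    """Return the expected paths that are missing under `root`."""
    expected = [
        'weight/limb_info.json',
        'xrmocap_data/Shelf/xrmocap_meta_testset',
        'xrmocap_data/FourDAG/xrmocap_meta_seq5',
    ]
    root = Path(root)
    return [p for p in expected if not (root / p).exists()]

missing = check_layout('.')
if missing:
    print('missing:', missing)
```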

## Results

We evaluate FourDAG on 3 benchmarks and report the Percentage of Correct Parts (PCP) on the Shelf/Campus/FourDAG datasets.

You can find the recommended configs in `configs/fourdag/*/eval_keypoints3d.py`.

### Campus

The 2D keypoints and PAF data we use are generated by openpose, and you can download them from [here](/docs/en/dataset_preparation.md#download-converted-meta-data).

| Config | Actor 0 | Actor 1 | Actor 2 | Average | Download |
|:------:|:-------:|:-------:|:-------:|:-------:|:--------:|
| [eval_keypoints3d.py](./campus_config/eval_keypoints3d.py) | 64.26 | 90.64 | 86.27 | 80.39 | [log](https://openxrlab-share.oss-cn-hongkong.aliyuncs.com/xrmocap/logs/FourDAG/campus.zip) |

### Shelf

The 2D keypoints and PAF data we use are generated by fasterrcnn, and you can download them from [here](/docs/en/dataset_preparation.md#download-converted-meta-data).

| Config | Actor 0 | Actor 1 | Actor 2 | Average | Download |
|:------:|:-------:|:-------:|:-------:|:-------:|:--------:|
| [eval_keypoints3d.py](./shelf_config/eval_keypoints3d.py) | 99.61 | 96.76 | 98.20 | 98.19 | [log](https://openxrlab-share.oss-cn-hongkong.aliyuncs.com/xrmocap/logs/FourDAG/shelf.zip) |

### FourDAG

The 2D keypoints and PAF data we use are generated by mmpose, and you can download them from [here](/docs/en/dataset_preparation.md#download-converted-meta-data).

- **seq2**

| Config | Actor 0 | Actor 1 | Average | PCK@200mm | Download |
|:------:|:-------:|:-------:|:-------:|:---------:|:--------:|
| [eval_keypoints3d.py](./fourdag_config/eval_keypoints3d_seq2.py) | 92.18 | 87.35 | 89.77 | 83.10 | [log](https://openxrlab-share.oss-cn-hongkong.aliyuncs.com/xrmocap/logs/FourDAG/fourdag.zip) |

- **seq4**

| Config | Actor 0 | Actor 1 | Actor 2 | Average | PCK@200mm | Download |
|:------:|:-------:|:-------:|:-------:|:-------:|:---------:|:--------:|
| [eval_keypoints3d.py](./fourdag_config/eval_keypoints3d_seq4.py) | 91.85 | 86.48 | 92.92 | 90.42 | 81.29 | [log](https://openxrlab-share.oss-cn-hongkong.aliyuncs.com/xrmocap/logs/FourDAG/fourdag.zip) |
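As a quick sanity check on the tables above, each reported Average column matches the unweighted mean of that row's per-actor PCP values (the numbers below are copied from the tables):

```python
# Per-actor PCP values and reported averages, copied from the tables above.
tables = {
    'Campus': ([64.26, 90.64, 86.27], 80.39),
    'Shelf': ([99.61, 96.76, 98.20], 98.19),
    'FourDAG seq2': ([92.18, 87.35], 89.77),
    'FourDAG seq4': ([91.85, 86.48, 92.92], 90.42),
}
for name, (actors, reported) in tables.items():
    mean = sum(actors) / len(actors)
    # Each reported average agrees with the unweighted mean to 0.01 PCP.
    assert abs(mean - reported) < 0.01, (name, mean, reported)
```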
@@ -0,0 +1,90 @@
type = 'BottomUpAssociationEvaluation'

__data_root__ = './xrmocap_data/CampusSeq1'
__meta_path__ = __data_root__ + '/xrmocap_meta_testset'

logger = None
output_dir = './output/fourdag/CampusSeq1_fourdag_19_FourDAGOptimization/'
pred_kps3d_convention = 'fourdag_19'
eval_kps3d_convention = 'campus'
selected_limbs_name = [
    'left_lower_leg', 'right_lower_leg', 'left_upperarm', 'right_upperarm',
    'left_forearm', 'right_forearm', 'left_thigh', 'right_thigh'
]
additional_limbs_names = [['jaw', 'headtop']]

associator = dict(
    type='FourDAGAssociator',
    kps_convention=pred_kps3d_convention,
    min_asgn_cnt=10,
    use_tracking_edges=True,
    keypoints3d_optimizer=dict(
        type='FourDAGOptimizer',
        triangulator=dict(type='JacobiTriangulator'),
        active_rate=0.5,
        min_track_cnt=20,
        bone_capacity=30,
        w_bone3d=1.0,
        w_square_shape=1e-3,
        shape_max_iter=5,
        w_kps3d=1.0,
        w_regular_pose=1e-4,
        pose_max_iter=20,
        w_kps2d=1e-5,
        w_temporal_trans=1e-1 / pow(512 / 2048, 2),
        w_temporal_pose=1e-1 / pow(512 / 2048, 2),
        min_triangulate_cnt=15,
        init_active=0.9,
        triangulate_thresh=0.05,
        logger=logger,
    ),
    graph_construct=dict(
        type='GraphConstruct',
        kps_convention=pred_kps3d_convention,
        max_epi_dist=0.15,
        max_temp_dist=0.2,
        normalize_edges=True,
        logger=logger,
    ),
    graph_associate=dict(
        type='GraphAssociate',
        kps_convention=pred_kps3d_convention,
        w_epi=2,
        w_temp=2,
        w_view=2,
        w_paf=4,
        w_hier=0.5,
        c_view_cnt=1.5,
        min_check_cnt=1,
        logger=logger,
    ),
    logger=logger,
)

dataset = dict(
    type='BottomUpMviewMpersonDataset',
    data_root=__data_root__,
    img_pipeline=[
        dict(type='LoadImagePIL'),
        dict(type='ToTensor'),
    ],
    meta_path=__meta_path__,
    test_mode=True,
    shuffled=False,
    kps2d_convention=pred_kps3d_convention,
    gt_kps3d_convention='campus',
    cam_world2cam=True,
)

dataset_visualization = dict(
    type='MviewMpersonDataVisualization',
    data_root=__data_root__,
    output_dir=output_dir,
    meta_path=__meta_path__,
    pred_kps3d_paths=None,
    vis_percep2d=False,
    kps2d_convention=pred_kps3d_convention,
    vis_gt_kps3d=False,
    vis_bottom_up=True,
    gt_kps3d_convention=None,
)
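One detail of the optimizer block above: the two temporal weights are written as an expression rather than a literal. Since `512 / 2048 = 0.25`, both `w_temporal_trans` and `w_temporal_pose` evaluate to `0.1 / 0.0625 = 1.6` (the ratio presumably encodes an image-scale normalization, but that is an assumption):

```python
# Same expression as w_temporal_trans / w_temporal_pose in the config above.
w_temporal = 1e-1 / pow(512 / 2048, 2)
# 512 / 2048 = 0.25, so this is 0.1 / 0.0625 = 1.6.
assert abs(w_temporal - 1.6) < 1e-12
```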
@@ -0,0 +1,91 @@
type = 'BottomUpAssociationEvaluation'

__data_root__ = './xrmocap_data/FourDAG'
__meta_path__ = __data_root__ + '/xrmocap_meta_seq2'

logger = None
output_dir = './output/fourdag/fourdag_fourdag_19_FourDAGOptimization/'
pred_kps3d_convention = 'fourdag_19'
eval_kps3d_convention = 'campus'
selected_limbs_name = [
    'left_lower_leg', 'right_lower_leg', 'left_upperarm', 'right_upperarm',
    'left_forearm', 'right_forearm', 'left_thigh', 'right_thigh'
]
# additional_limbs_names = [['jaw', 'headtop']]

associator = dict(
    type='FourDAGAssociator',
    kps_convention=pred_kps3d_convention,
    min_asgn_cnt=5,
    use_tracking_edges=True,
    keypoints3d_optimizer=dict(
        type='FourDAGOptimizer',
        triangulator=dict(type='JacobiTriangulator'),
        active_rate=0.5,
        min_track_cnt=20,
        bone_capacity=30,
        w_bone3d=1.0,
        w_square_shape=1e-3,
        shape_max_iter=5,
        w_kps3d=1.0,
        w_regular_pose=1e-4,
        pose_max_iter=20,
        w_kps2d=1e-5,
        w_temporal_trans=1e-1 / pow(512 / 2048, 2),
        w_temporal_pose=1e-1 / pow(512 / 2048, 2),
        min_triangulate_cnt=15,
        init_active=0.9,
        triangulate_thresh=0.05,
        logger=logger,
    ),
    graph_construct=dict(
        type='GraphConstruct',
        kps_convention=pred_kps3d_convention,
        max_epi_dist=0.15,
        max_temp_dist=0.3,
        normalize_edges=True,
        logger=logger,
    ),
    graph_associate=dict(
        type='GraphAssociate',
        kps_convention=pred_kps3d_convention,
        w_epi=1,
        w_temp=2,
        w_view=1,
        w_paf=2,
        w_hier=1,
        c_view_cnt=1,
        min_check_cnt=10,
        logger=logger,
    ),
    logger=logger,
)

dataset = dict(
    type='BottomUpMviewMpersonDataset',
    data_root=__data_root__,
    img_pipeline=[
        dict(type='LoadImagePIL'),
        dict(type='ToTensor'),
    ],
    meta_path=__meta_path__,
    test_mode=True,
    shuffled=False,
    kps2d_convention=pred_kps3d_convention,
    gt_kps3d_convention='campus',
    cam_world2cam=True,
)

dataset_visualization = dict(
    type='MviewMpersonDataVisualization',
    data_root=__data_root__,
    output_dir=output_dir,
    meta_path=__meta_path__,
    pred_kps3d_paths=None,
    vis_percep2d=False,
    kps2d_convention=pred_kps3d_convention,
    vis_gt_kps3d=False,
    vis_bottom_up=True,
    gt_kps3d_convention=None,
    resolution=(368, 368),
)
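Configs like the one above describe each component as a plain dict whose `type` key names a class. As an illustration only (a minimal sketch, not xrmocap's actual builder), such dicts are typically consumed by a registry-style factory that pops `type`, looks up the class, and passes the remaining keys as constructor arguments:

```python
# Minimal registry-style factory, for illustration; a real builder
# (e.g. in mmcv/xrmocap) has more features such as defaults and logging.
REGISTRY = {}

def register(cls):
    """Register a class under its own name."""
    REGISTRY[cls.__name__] = cls
    return cls

def build_from_cfg(cfg):
    """Instantiate the class named by cfg['type'] with the remaining keys."""
    cfg = dict(cfg)  # copy so the original config dict stays intact
    cls = REGISTRY[cfg.pop('type')]
    return cls(**cfg)

@register
class JacobiTriangulator:
    def __init__(self):
        pass

@register
class FourDAGOptimizer:
    def __init__(self, triangulator, **kwargs):
        # Nested component dicts are built recursively.
        self.triangulator = build_from_cfg(triangulator)
        self.params = kwargs

# Build a component from a config fragment shaped like the file above.
opt = build_from_cfg(dict(
    type='FourDAGOptimizer',
    triangulator=dict(type='JacobiTriangulator'),
    active_rate=0.5,
))
```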
@@ -0,0 +1,91 @@
type = 'BottomUpAssociationEvaluation'

__data_root__ = './xrmocap_data/FourDAG'
__meta_path__ = __data_root__ + '/xrmocap_meta_seq4'

logger = None
output_dir = './output/fourdag/fourdag_fourdag_19_FourDAGOptimization/'
pred_kps3d_convention = 'fourdag_19'
eval_kps3d_convention = 'campus'
selected_limbs_name = [
    'left_lower_leg', 'right_lower_leg', 'left_upperarm', 'right_upperarm',
    'left_forearm', 'right_forearm', 'left_thigh', 'right_thigh'
]
# additional_limbs_names = [['jaw', 'headtop']]

associator = dict(
    type='FourDAGAssociator',
    kps_convention=pred_kps3d_convention,
    min_asgn_cnt=5,
    use_tracking_edges=True,
    keypoints3d_optimizer=dict(
        type='FourDAGOptimizer',
        triangulator=dict(type='JacobiTriangulator'),
        active_rate=0.5,
        min_track_cnt=20,
        bone_capacity=30,
        w_bone3d=1.0,
        w_square_shape=1e-3,
        shape_max_iter=5,
        w_kps3d=1.0,
        w_regular_pose=1e-4,
        pose_max_iter=20,
        w_kps2d=1e-5,
        w_temporal_trans=1e-1 / pow(512 / 2048, 2),
        w_temporal_pose=1e-1 / pow(512 / 2048, 2),
        min_triangulate_cnt=15,
        init_active=0.9,
        triangulate_thresh=0.05,
        logger=logger,
    ),
    graph_construct=dict(
        type='GraphConstruct',
        kps_convention=pred_kps3d_convention,
        max_epi_dist=0.15,
        max_temp_dist=0.3,
        normalize_edges=True,
        logger=logger,
    ),
    graph_associate=dict(
        type='GraphAssociate',
        kps_convention=pred_kps3d_convention,
        w_epi=1,
        w_temp=2,
        w_view=1,
        w_paf=2,
        w_hier=1,
        c_view_cnt=1,
        min_check_cnt=10,
        logger=logger,
    ),
    logger=logger,
)

dataset = dict(
    type='BottomUpMviewMpersonDataset',
    data_root=__data_root__,
    img_pipeline=[
        dict(type='LoadImagePIL'),
        dict(type='ToTensor'),
    ],
    meta_path=__meta_path__,
    test_mode=True,
    shuffled=False,
    kps2d_convention=pred_kps3d_convention,
    gt_kps3d_convention='campus',
    cam_world2cam=True,
)

dataset_visualization = dict(
    type='MviewMpersonDataVisualization',
    data_root=__data_root__,
    output_dir=output_dir,
    meta_path=__meta_path__,
    pred_kps3d_paths=None,
    vis_percep2d=False,
    kps2d_convention=pred_kps3d_convention,
    vis_gt_kps3d=False,
    vis_bottom_up=True,
    gt_kps3d_convention=None,
    resolution=(368, 368),
)
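The seq2 and seq4 configs are identical apart from `__meta_path__`, and a seq5 config presumably follows the same pattern. When maintaining several such near-duplicate configs, one option is to derive them from a shared base; a minimal sketch (names hypothetical, not an xrmocap API), mirroring the string concatenation the configs use for `__meta_path__`:

```python
import copy

# Shared fields (a small excerpt of the full configs above).
_base = dict(
    type='BottomUpAssociationEvaluation',
    data_root='./xrmocap_data/FourDAG',
)

def config_for(seq):
    """Derive a per-sequence config; only the meta path changes."""
    cfg = copy.deepcopy(_base)
    # Same concatenation as __meta_path__ in the configs above.
    cfg['meta_path'] = cfg['data_root'] + '/xrmocap_meta_' + seq
    return cfg

configs = {s: config_for(s) for s in ('seq2', 'seq4', 'seq5')}
```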