[Feature] Add 4d association method (#16)
* add 4dag method
* rename fourd to bottom up, add metric and alignment, change name manner
* rename some files and variables
* fix a bug
* delete Camera class in association
* change configs
* modify triangulation pipeline
* convert data format from pkl to json
* align triangulate method, align kps convention
* delete fourdag triangulate, add fourdag19 convention pipeline
* change some names
* resolve conflict
* resolve some comments
* run pre-commit
* align result
* refactor code
* refactor triangulator and optimization
* resolve comments
* pre-commit
* change limb info
* fix bug
* add stringdoc
* rename joint to kps
* add yapf
* pre-commit
* add docstring
* refactor
* add limb info json file
* fix bug
* delete debug call
* add readme
* rephase term class
* rephase associate
* add cloud file and path
* process fourdag seq5
* resolve comments
Showing 42 changed files with 4,101 additions and 12 deletions.
@@ -0,0 +1,106 @@
# 4D Association Graph for Realtime Multi-person Motion Capture Using Multiple Video Cameras

Note: as a Python variable name cannot start with a number, we refer to this method as `FourDAG` in the following text and code.

- [Introduction](#introduction)
- [Prepare limb information and datasets](#prepare-limb-information-and-datasets)
- [Results](#results)
  - [Campus](#campus)
  - [Shelf](#shelf)
  - [FourDAG](#fourdag-1)

## Introduction

We provide the config files for FourDAG: [4D Association Graph for Realtime Multi-person Motion Capture Using Multiple Video Cameras](https://arxiv.org/abs/2002.12625).

[Official Implementation](https://github.com/zhangyux15/4d_association)

```BibTeX
@inproceedings{Zhang20204DAG,
  title={4D Association Graph for Realtime Multi-Person Motion Capture Using Multiple Video Cameras},
  author={Yuxiang Zhang and Liang An and Tao Yu and Xiu Li and Kun Li and Yebin Liu},
  booktitle={IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  year={2020},
  pages={1321-1330}
}
```
## Prepare limb information and datasets

- **Prepare limb information**:

  ```
  sh scripts/download_weight.sh
  ```

  You can find the perception models in the `weight` directory.

- **Prepare the datasets**:

  You can download the Shelf, Campus or FourDAG datasets and convert the original data to our unified meta-data format. Since running a converter takes a long time, we have done it for you: please download the compressed zip file of converted meta-data from [here](../../docs/en/dataset_preparation.md) and place the meta-data under `ROOT/xrmocap_data/DATASET`.

  The final file structure would be like:

  ```text
  xrmocap
  ├── xrmocap
  ├── docs
  ├── tools
  ├── configs
  ├── weight
  │   └── limb_info.json
  └── xrmocap_data
      ├── CampusSeq1
      ├── Shelf
      │   ├── Camera0
      │   ├── ...
      │   ├── Camera4
      │   └── xrmocap_meta_testset
      └── FourDAG
          ├── seq2
          ├── seq4
          ├── seq5
          ├── xrmocap_meta_seq2
          ├── xrmocap_meta_seq4
          └── xrmocap_meta_seq5
  ```

  You can download just one of the Shelf, Campus and FourDAG datasets.
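After downloading, a short script can sanity-check the layout. The helper below is an illustrative sketch, not part of xrmocap; it assumes all three datasets plus the weight file are present, so trim the list to whatever you actually downloaded:

```python
from pathlib import Path

# Hypothetical helper (not part of xrmocap): verify the layout shown above.
def check_layout(root):
    """Return the expected paths that are missing under `root`."""
    expected = [
        'weight/limb_info.json',
        'xrmocap_data/Shelf/xrmocap_meta_testset',
        'xrmocap_data/FourDAG/xrmocap_meta_seq5',
    ]
    root = Path(root)
    return [p for p in expected if not (root / p).exists()]

missing = check_layout('.')
if missing:
    print('missing:', missing)
```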

## Results

We evaluate FourDAG on 3 benchmarks and report the Percentage of Correct Parts (PCP) on the Shelf/Campus/FourDAG datasets.

You can find the recommended configs in `configs/fourdag/*/eval_keypoints3d.py`.

### Campus

The 2D keypoints and PAF data we use are generated by openpose, and you can download them from [here](/docs/en/dataset_preparation.md#download-converted-meta-data).

| Config | Actor 0 | Actor 1 | Actor 2 | Average | Download |
|:------:|:-------:|:-------:|:-------:|:-------:|:--------:|
| [eval_keypoints3d.py](./campus_config/eval_keypoints3d.py) | 64.26 | 90.64 | 86.27 | 80.39 | [log](https://openxrlab-share.oss-cn-hongkong.aliyuncs.com/xrmocap/logs/FourDAG/campus.zip) |

### Shelf

The 2D keypoints and PAF data we use are generated by fasterrcnn, and you can download them from [here](/docs/en/dataset_preparation.md#download-converted-meta-data).

| Config | Actor 0 | Actor 1 | Actor 2 | Average | Download |
|:------:|:-------:|:-------:|:-------:|:-------:|:--------:|
| [eval_keypoints3d.py](./shelf_config/eval_keypoints3d.py) | 99.61 | 96.76 | 98.20 | 98.19 | [log](https://openxrlab-share.oss-cn-hongkong.aliyuncs.com/xrmocap/logs/FourDAG/shelf.zip) |

### FourDAG

The 2D keypoints and PAF data we use are generated by mmpose, and you can download them from [here](/docs/en/dataset_preparation.md#download-converted-meta-data).

- **seq2**

| Config | Actor 0 | Actor 1 | Average | PCK@200mm | Download |
|:------:|:-------:|:-------:|:-------:|:---------:|:--------:|
| [eval_keypoints3d.py](./fourdag_config/eval_keypoints3d_seq2.py) | 92.18 | 87.35 | 89.77 | 83.10 | [log](https://openxrlab-share.oss-cn-hongkong.aliyuncs.com/xrmocap/logs/FourDAG/fourdag.zip) |

- **seq4**

| Config | Actor 0 | Actor 1 | Actor 2 | Average | PCK@200mm | Download |
|:------:|:-------:|:-------:|:-------:|:-------:|:---------:|:--------:|
| [eval_keypoints3d.py](./fourdag_config/eval_keypoints3d_seq4.py) | 91.85 | 86.48 | 92.92 | 90.42 | 81.29 | [log](https://openxrlab-share.oss-cn-hongkong.aliyuncs.com/xrmocap/logs/FourDAG/fourdag.zip) |
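As a quick sanity check on the tables above, each reported Average column matches the unweighted mean of that row's per-actor PCP values (the numbers below are copied from the tables):

```python
# Per-actor PCP values and reported averages, copied from the tables above.
tables = {
    'Campus': ([64.26, 90.64, 86.27], 80.39),
    'Shelf': ([99.61, 96.76, 98.20], 98.19),
    'FourDAG seq2': ([92.18, 87.35], 89.77),
    'FourDAG seq4': ([91.85, 86.48, 92.92], 90.42),
}
for name, (actors, reported) in tables.items():
    mean = sum(actors) / len(actors)
    # Each reported average agrees with the unweighted mean to 0.01 PCP.
    assert abs(mean - reported) < 0.01, (name, mean, reported)
```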
@@ -0,0 +1,90 @@
type = 'BottomUpAssociationEvaluation'

__data_root__ = './xrmocap_data/CampusSeq1'
__meta_path__ = __data_root__ + '/xrmocap_meta_testset'

logger = None
output_dir = './output/fourdag/CampusSeq1_fourdag_19_FourDAGOptimization/'
pred_kps3d_convention = 'fourdag_19'
eval_kps3d_convention = 'campus'
selected_limbs_name = [
    'left_lower_leg', 'right_lower_leg', 'left_upperarm', 'right_upperarm',
    'left_forearm', 'right_forearm', 'left_thigh', 'right_thigh'
]
additional_limbs_names = [['jaw', 'headtop']]

associator = dict(
    type='FourDAGAssociator',
    kps_convention=pred_kps3d_convention,
    min_asgn_cnt=10,
    use_tracking_edges=True,
    keypoints3d_optimizer=dict(
        type='FourDAGOptimizer',
        triangulator=dict(type='JacobiTriangulator'),
        active_rate=0.5,
        min_track_cnt=20,
        bone_capacity=30,
        w_bone3d=1.0,
        w_square_shape=1e-3,
        shape_max_iter=5,
        w_kps3d=1.0,
        w_regular_pose=1e-4,
        pose_max_iter=20,
        w_kps2d=1e-5,
        w_temporal_trans=1e-1 / pow(512 / 2048, 2),
        w_temporal_pose=1e-1 / pow(512 / 2048, 2),
        min_triangulate_cnt=15,
        init_active=0.9,
        triangulate_thresh=0.05,
        logger=logger,
    ),
    graph_construct=dict(
        type='GraphConstruct',
        kps_convention=pred_kps3d_convention,
        max_epi_dist=0.15,
        max_temp_dist=0.2,
        normalize_edges=True,
        logger=logger,
    ),
    graph_associate=dict(
        type='GraphAssociate',
        kps_convention=pred_kps3d_convention,
        w_epi=2,
        w_temp=2,
        w_view=2,
        w_paf=4,
        w_hier=0.5,
        c_view_cnt=1.5,
        min_check_cnt=1,
        logger=logger,
    ),
    logger=logger,
)

dataset = dict(
    type='BottomUpMviewMpersonDataset',
    data_root=__data_root__,
    img_pipeline=[
        dict(type='LoadImagePIL'),
        dict(type='ToTensor'),
    ],
    meta_path=__meta_path__,
    test_mode=True,
    shuffled=False,
    kps2d_convention=pred_kps3d_convention,
    gt_kps3d_convention='campus',
    cam_world2cam=True,
)

dataset_visualization = dict(
    type='MviewMpersonDataVisualization',
    data_root=__data_root__,
    output_dir=output_dir,
    meta_path=__meta_path__,
    pred_kps3d_paths=None,
    vis_percep2d=False,
    kps2d_convention=pred_kps3d_convention,
    vis_gt_kps3d=False,
    vis_bottom_up=True,
    gt_kps3d_convention=None,
)
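One detail of the optimizer block above: the two temporal weights are written as an expression rather than a literal. Since `512 / 2048 = 0.25`, both `w_temporal_trans` and `w_temporal_pose` evaluate to `0.1 / 0.0625 = 1.6` (the ratio presumably encodes an image-scale normalization, but that is an assumption):

```python
# Same expression as w_temporal_trans / w_temporal_pose in the config above.
w_temporal = 1e-1 / pow(512 / 2048, 2)
# 512 / 2048 = 0.25, so this is 0.1 / 0.0625 = 1.6.
assert abs(w_temporal - 1.6) < 1e-12
```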
@@ -0,0 +1,91 @@
type = 'BottomUpAssociationEvaluation'

__data_root__ = './xrmocap_data/FourDAG'
__meta_path__ = __data_root__ + '/xrmocap_meta_seq2'

logger = None
output_dir = './output/fourdag/fourdag_fourdag_19_FourDAGOptimization/'
pred_kps3d_convention = 'fourdag_19'
eval_kps3d_convention = 'campus'
selected_limbs_name = [
    'left_lower_leg', 'right_lower_leg', 'left_upperarm', 'right_upperarm',
    'left_forearm', 'right_forearm', 'left_thigh', 'right_thigh'
]
# additional_limbs_names = [['jaw', 'headtop']]

associator = dict(
    type='FourDAGAssociator',
    kps_convention=pred_kps3d_convention,
    min_asgn_cnt=5,
    use_tracking_edges=True,
    keypoints3d_optimizer=dict(
        type='FourDAGOptimizer',
        triangulator=dict(type='JacobiTriangulator'),
        active_rate=0.5,
        min_track_cnt=20,
        bone_capacity=30,
        w_bone3d=1.0,
        w_square_shape=1e-3,
        shape_max_iter=5,
        w_kps3d=1.0,
        w_regular_pose=1e-4,
        pose_max_iter=20,
        w_kps2d=1e-5,
        w_temporal_trans=1e-1 / pow(512 / 2048, 2),
        w_temporal_pose=1e-1 / pow(512 / 2048, 2),
        min_triangulate_cnt=15,
        init_active=0.9,
        triangulate_thresh=0.05,
        logger=logger,
    ),
    graph_construct=dict(
        type='GraphConstruct',
        kps_convention=pred_kps3d_convention,
        max_epi_dist=0.15,
        max_temp_dist=0.3,
        normalize_edges=True,
        logger=logger,
    ),
    graph_associate=dict(
        type='GraphAssociate',
        kps_convention=pred_kps3d_convention,
        w_epi=1,
        w_temp=2,
        w_view=1,
        w_paf=2,
        w_hier=1,
        c_view_cnt=1,
        min_check_cnt=10,
        logger=logger,
    ),
    logger=logger,
)

dataset = dict(
    type='BottomUpMviewMpersonDataset',
    data_root=__data_root__,
    img_pipeline=[
        dict(type='LoadImagePIL'),
        dict(type='ToTensor'),
    ],
    meta_path=__meta_path__,
    test_mode=True,
    shuffled=False,
    kps2d_convention=pred_kps3d_convention,
    gt_kps3d_convention='campus',
    cam_world2cam=True,
)

dataset_visualization = dict(
    type='MviewMpersonDataVisualization',
    data_root=__data_root__,
    output_dir=output_dir,
    meta_path=__meta_path__,
    pred_kps3d_paths=None,
    vis_percep2d=False,
    kps2d_convention=pred_kps3d_convention,
    vis_gt_kps3d=False,
    vis_bottom_up=True,
    gt_kps3d_convention=None,
    resolution=(368, 368),
)
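Configs like the one above describe each component as a plain dict whose `type` key names a class. As an illustration only (a minimal sketch, not xrmocap's actual builder), such dicts are typically consumed by a registry-style factory that pops `type`, looks up the class, and passes the remaining keys as constructor arguments:

```python
# Minimal registry-style factory, for illustration; a real builder
# (e.g. in mmcv/xrmocap) has more features such as defaults and logging.
REGISTRY = {}

def register(cls):
    """Register a class under its own name."""
    REGISTRY[cls.__name__] = cls
    return cls

def build_from_cfg(cfg):
    """Instantiate the class named by cfg['type'] with the remaining keys."""
    cfg = dict(cfg)  # copy so the original config dict stays intact
    cls = REGISTRY[cfg.pop('type')]
    return cls(**cfg)

@register
class JacobiTriangulator:
    def __init__(self):
        pass

@register
class FourDAGOptimizer:
    def __init__(self, triangulator, **kwargs):
        # Nested component dicts are built recursively.
        self.triangulator = build_from_cfg(triangulator)
        self.params = kwargs

# Build a component from a config fragment shaped like the file above.
opt = build_from_cfg(dict(
    type='FourDAGOptimizer',
    triangulator=dict(type='JacobiTriangulator'),
    active_rate=0.5,
))
```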
@@ -0,0 +1,91 @@
type = 'BottomUpAssociationEvaluation'

__data_root__ = './xrmocap_data/FourDAG'
__meta_path__ = __data_root__ + '/xrmocap_meta_seq4'

logger = None
output_dir = './output/fourdag/fourdag_fourdag_19_FourDAGOptimization/'
pred_kps3d_convention = 'fourdag_19'
eval_kps3d_convention = 'campus'
selected_limbs_name = [
    'left_lower_leg', 'right_lower_leg', 'left_upperarm', 'right_upperarm',
    'left_forearm', 'right_forearm', 'left_thigh', 'right_thigh'
]
# additional_limbs_names = [['jaw', 'headtop']]

associator = dict(
    type='FourDAGAssociator',
    kps_convention=pred_kps3d_convention,
    min_asgn_cnt=5,
    use_tracking_edges=True,
    keypoints3d_optimizer=dict(
        type='FourDAGOptimizer',
        triangulator=dict(type='JacobiTriangulator'),
        active_rate=0.5,
        min_track_cnt=20,
        bone_capacity=30,
        w_bone3d=1.0,
        w_square_shape=1e-3,
        shape_max_iter=5,
        w_kps3d=1.0,
        w_regular_pose=1e-4,
        pose_max_iter=20,
        w_kps2d=1e-5,
        w_temporal_trans=1e-1 / pow(512 / 2048, 2),
        w_temporal_pose=1e-1 / pow(512 / 2048, 2),
        min_triangulate_cnt=15,
        init_active=0.9,
        triangulate_thresh=0.05,
        logger=logger,
    ),
    graph_construct=dict(
        type='GraphConstruct',
        kps_convention=pred_kps3d_convention,
        max_epi_dist=0.15,
        max_temp_dist=0.3,
        normalize_edges=True,
        logger=logger,
    ),
    graph_associate=dict(
        type='GraphAssociate',
        kps_convention=pred_kps3d_convention,
        w_epi=1,
        w_temp=2,
        w_view=1,
        w_paf=2,
        w_hier=1,
        c_view_cnt=1,
        min_check_cnt=10,
        logger=logger,
    ),
    logger=logger,
)

dataset = dict(
    type='BottomUpMviewMpersonDataset',
    data_root=__data_root__,
    img_pipeline=[
        dict(type='LoadImagePIL'),
        dict(type='ToTensor'),
    ],
    meta_path=__meta_path__,
    test_mode=True,
    shuffled=False,
    kps2d_convention=pred_kps3d_convention,
    gt_kps3d_convention='campus',
    cam_world2cam=True,
)

dataset_visualization = dict(
    type='MviewMpersonDataVisualization',
    data_root=__data_root__,
    output_dir=output_dir,
    meta_path=__meta_path__,
    pred_kps3d_paths=None,
    vis_percep2d=False,
    kps2d_convention=pred_kps3d_convention,
    vis_gt_kps3d=False,
    vis_bottom_up=True,
    gt_kps3d_convention=None,
    resolution=(368, 368),
)
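The seq2 and seq4 configs are identical apart from `__meta_path__`, and a seq5 config presumably follows the same pattern. When maintaining several such near-duplicate configs, one option is to derive them from a shared base; a minimal sketch (names hypothetical, not an xrmocap API), mirroring the string concatenation the configs use for `__meta_path__`:

```python
import copy

# Shared fields (a small excerpt of the full configs above).
_base = dict(
    type='BottomUpAssociationEvaluation',
    data_root='./xrmocap_data/FourDAG',
)

def config_for(seq):
    """Derive a per-sequence config; only the meta path changes."""
    cfg = copy.deepcopy(_base)
    # Same concatenation as __meta_path__ in the configs above.
    cfg['meta_path'] = cfg['data_root'] + '/xrmocap_meta_' + seq
    return cfg

configs = {s: config_for(s) for s in ('seq2', 'seq4', 'seq5')}
```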