[Bug] multiple bugs related to `FilterAnnotations`, `KeypointConverter` and `PoseLocalVisualizer` #3182

zgjja · 2025-01-24T06:21:11Z

Prerequisite

I have searched Issues and Discussions but cannot get the expected help.
The bug has not been fixed in the latest version (https://github.com/open-mmlab/mmpose).

Environment

OrderedDict([('sys.platform', 'linux'), ('Python', '3.10.12 (main, Nov 20 2023, 15:14:05) [GCC 11.4.0]'), ('CUDA available', True), ('MUSA available', False), ('numpy_random_seed', 2147483648), ('GPU 0', 'NVIDIA GeForce RTX 4080'), ('CUDA_HOME', '/usr/local/cuda'), ('NVCC', 'Cuda compilation tools, release 12.3, V12.3.107'), ('GCC', 'x86_64-linux-gnu-gcc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0'), ('PyTorch', '2.2.0a0+81ea7a4'), ('PyTorch compiling details', 'PyTorch built with:\n - GCC 11.2\n - C++ Version: 201703\n - Intel(R) oneAPI Math Kernel Library Version 2021.1-Product Build 20201104 for Intel(R) 64 architecture applications\n - Intel(R) MKL-DNN v3.1.1 (Git Hash N/A)\n - OpenMP 201511 (a.k.a. OpenMP 4.5)\n - LAPACK is enabled (usually provided by MKL)\n - NNPACK is enabled\n - CPU capability usage: AVX2\n - CUDA Runtime 12.3\n - NVCC architecture flags: -gencode;arch=compute_52,code=sm_52;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_72,code=sm_72;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86;-gencode;arch=compute_87,code=sm_87;-gencode;arch=compute_90,code=sm_90;-gencode;arch=compute_90,code=compute_90\n - CuDNN 8.9.7 (built against CUDA 12.2)\n - Magma 2.6.2\n - Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=12.3, CUDNN_VERSION=8.9.7, CXX_COMPILER=/opt/rh/gcc-toolset-11/root/usr/bin/c++, CXX_FLAGS=-fno-gnu-unique -D_GLIBCXX_USE_CXX11_ABI=1 -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -DNDEBUG -DUSE_KINETO -DLIBKINETO_NOROCTRACER -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -O2 -fPIC -Wall -Wextra -Werror=return-type -Werror=non-virtual-dtor -Werror=range-loop-construct -Werror=bool-operation -Wnarrowing -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-unused-parameter -Wno-unused-function 
-Wno-unused-result -Wno-strict-overflow -Wno-strict-aliasing -Wno-stringop-overflow -Wsuggest-override -Wno-psabi -Wno-error=pedantic -Wno-error=old-style-cast -Wno-missing-braces -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_DISABLE_GPU_ASSERTS=ON, TORCH_VERSION=2.2.0, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=ON, USE_NCCL=1, USE_NNPACK=ON, USE_OPENMP=ON, USE_ROCM=OFF, \n'), ('TorchVision', '0.17.0a0'), ('OpenCV', '4.10.0'), ('MMEngine', '0.10.5'), ('MMPose', '1.3.2+5408bc7')])

Reproduces the problem - code sample

see Additional information

Reproduces the problem - command or script

see Additional information

Reproduces the problem - error message

see Additional information

Additional information

Context

Last year, in #2963, I fixed this issue in FilterAnnotations, but the fix was reverted by @xiexinch in #3037, so the bug is back.

Cause

Bug 1: Data type mismatch

FilterAnnotations filters the gt instances in an image sample, and it may change the following keys:

        keys = ('bbox', 'bbox_score', 'category_id', 'keypoints',
                'keypoints_visible', 'area')

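Under the hood, it uses NumPy's & operator to combine all conditions into a single boolean index array, which is then used to index each of these keys in the instance sample. The value of every one of these keys must therefore be an np.ndarray; a value of any other type (for example a plain list) cannot be indexed with the boolean array.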
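To illustrate the mismatch, here is a minimal sketch using plain NumPy. It is not the actual FilterAnnotations code: the sample values, the thresholds and the choice of `category_id` as the list-typed key are all made up for illustration.

```python
import numpy as np

# Hypothetical sample with 3 instances; values are made up for illustration.
bbox = np.array([[0, 0, 10, 10], [0, 0, 2, 2], [5, 5, 50, 50]],
                dtype=np.float32)
area = np.array([100.0, 4.0, 2025.0])
category_id = [1, 1, 1]  # a plain list instead of an np.ndarray

# Combine several conditions with `&` into one boolean index array.
widths = bbox[:, 2] - bbox[:, 0]
valid = (widths >= 5) & (area >= 50)

# An ndarray supports boolean-mask indexing.
filtered_bbox = bbox[valid]

# A list does not: its indices must be integers or slices.
try:
    category_id[valid]
    list_indexing_ok = True
except TypeError:
    list_indexing_ok = False
```

In this sketch the ndarray is filtered down to two instances, while the list raises a TypeError as soon as the mask is applied.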

Bug 2: `keypoints_visible` may have 2 meanings

When PoseLocalVisualizer is used together with KeypointConverter, the visualizer can draw the converted keypoints. However, KeypointConverter adds a new column to keypoints_visible to hold keypoints_visible_weights, and this check in PoseLocalVisualizer:

                                or pos2[1] >= img_h or visible[sk[0]] < kpt_thr
                                or visible[sk[1]] < kpt_thr

will crash because visible[sk[0]] is no longer a scalar once KeypointConverter has stacked the weights into keypoints_visible:

        results['keypoints_visible'] = np.stack(
            [keypoints_visible, keypoints_visible_weights], axis=2)
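The failure can be reproduced with NumPy alone. This is a minimal sketch, assuming the visualizer works on one instance at a time (so `visible` is the per-instance slice); the array sizes, `kpt_thr` value and skeleton link `sk` are made up.

```python
import numpy as np

num_instances, num_keypoints = 1, 3
keypoints_visible = np.ones((num_instances, num_keypoints), dtype=np.float32)
keypoints_visible_weights = np.ones_like(keypoints_visible)

# Same stacking as in the snippet above: (N, K) -> (N, K, 2).
stacked = np.stack([keypoints_visible, keypoints_visible_weights], axis=2)

kpt_thr = 0.3
sk = (0, 1)  # one hypothetical skeleton link between keypoints 0 and 1

# The per-instance slice is now (K, 2) instead of (K,).
visible = stacked[0]

try:
    # Each comparison yields a 2-element boolean array, and `or` needs a
    # single truth value, so NumPy raises ValueError.
    skip = visible[sk[0]] < kpt_thr or visible[sk[1]] < kpt_thr
    crashed = False
except ValueError:
    crashed = True
```

With an unstacked `(N, K)` array the same comparison produces scalars and the `or` chain evaluates normally.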

Bug 3: `keypoints_3d` logic error

In a recent commit, some datasets, such as coco-wholebody, added a new key to data_info, keypoints_3d, whose default value is None. This also causes a bug in KeypointConverter:

        key = 'keypoints_3d' if 'keypoints_3d' in results else 'keypoints'

Even in the 2D case, results still contains the 'keypoints_3d' key, with the value [None] (for num_instance=1 here), so the 3D branch is selected by mistake. This needs a fix.
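A small sketch of the situation, with a made-up `results` dict standing in for a 2D sample that has one instance:

```python
# Hypothetical 2D sample: one instance, two keypoints, and a placeholder
# 'keypoints_3d' entry that holds no real data.
results = {
    'keypoints': [[[10.0, 20.0], [30.0, 40.0]]],
    'keypoints_3d': [None],
}

# The membership test only checks that the key exists, not that it is usable.
key = 'keypoints_3d' if 'keypoints_3d' in results else 'keypoints'
```

Here `key` ends up as `'keypoints_3d'` even though the sample only carries 2D keypoints.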

Possible fix

Bug1

Fix all datasets' invalid keys

I think this is sub-optimal

Make implicit conversion in `FilterAnnotations`

Easy to fix, but I don't know whether it would cause side effects in downstream code.
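One possible shape of the implicit conversion, as a rough sketch only. `filter_instances` is a hypothetical helper, not the real FilterAnnotations method, and the sample values are made up; the key tuple is copied from the snippet quoted earlier.

```python
import numpy as np

KEYS = ('bbox', 'bbox_score', 'category_id', 'keypoints',
        'keypoints_visible', 'area')


def filter_instances(results, valid):
    """Hypothetical helper: coerce each value to ndarray, then apply the mask."""
    filtered = dict(results)
    for key in KEYS:
        if key in results:
            # np.asarray leaves existing ndarrays untouched and converts
            # lists or tuples, so the boolean mask always applies.
            filtered[key] = np.asarray(results[key])[valid]
    return filtered


sample = {
    'bbox': np.zeros((3, 4), dtype=np.float32),
    'category_id': [1, 2, 3],     # list, would fail without conversion
    'area': [100.0, 4.0, 2025.0]  # list, would fail without conversion
}
valid = np.array([True, False, True])
out = filter_instances(sample, valid)
```

Note that the filtered values come back as np.ndarray even where the input was a list, which is exactly the kind of type change that could affect downstream code.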

Bug2

Changing the visualizer itself is easy, but I don't think it's a good idea...

keypoints_visible is ambiguous because it can hold both keypoints_visible and keypoints_visible_weights. We should use a new key, keypoints_visible_weights, for the weights; it looks like the author wanted to do this in PackPoseInputs as well. However, I don't know whether any other code relies on this weird logic and would break after the change. Besides, if the new key is added, FilterAnnotations also needs to be updated accordingly.

Changing this logic will make mmpose easier to maintain, because there is a lot of code like:

        keypoints_visible = results['keypoints_visible']
        if keypoints_visible.ndim == 3 and keypoints_visible.shape[2] == 2:
            keypoints_visible, keypoints_visible_weights = \
                keypoints_visible[..., 0], keypoints_visible[..., 1]
            results['keypoints_visible'] = keypoints_visible
            results['keypoints_visible_weights'] = keypoints_visible_weights

whereas I think keypoints_visible.ndim should always be 2.
Since I don't know which algorithms use keypoints_visible_weights, I need your help or advice.
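For discussion, the separate-key proposal might look roughly like the sketch below. `store_visibility` is a hypothetical helper, not existing mmpose code, and the array values are made up; the point is only that nothing is stacked along a third axis.

```python
import numpy as np


def store_visibility(results, keypoints_visible, keypoints_visible_weights):
    """Hypothetical: keep visibility and weights under separate keys."""
    results['keypoints_visible'] = keypoints_visible
    results['keypoints_visible_weights'] = keypoints_visible_weights
    return results


results = store_visibility(
    {},
    keypoints_visible=np.ones((1, 3), dtype=np.float32),
    keypoints_visible_weights=np.array([[1.0, 0.0, 1.0]], dtype=np.float32),
)
```

With this layout keypoints_visible stays 2D everywhere, so consumers such as the visualizer would not need the `ndim == 3` special case.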

Bug3

Change the if logic so that a keypoints_3d value that is None (or contains only None) falls back to keypoints.
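A possible form of the changed condition, as a sketch. `pick_keypoint_key` is a hypothetical helper name, and treating an empty or None-filled `keypoints_3d` as "absent" is an assumption about the intended behaviour.

```python
import numpy as np


def pick_keypoint_key(results):
    """Hypothetical: use 'keypoints_3d' only when it holds real data."""
    kpts_3d = results.get('keypoints_3d')
    has_3d = (
        kpts_3d is not None
        and len(kpts_3d) > 0
        and all(k is not None for k in kpts_3d)
    )
    return 'keypoints_3d' if has_3d else 'keypoints'


# 2D sample with a placeholder entry, as described in Bug 3.
key_2d = pick_keypoint_key({
    'keypoints': np.zeros((1, 2, 2)),
    'keypoints_3d': [None],
})

# Genuine 3D sample.
key_3d = pick_keypoint_key({
    'keypoints': np.zeros((1, 2, 2)),
    'keypoints_3d': np.zeros((1, 2, 3)),
})
```

In this sketch the 2D sample resolves to `'keypoints'` and the 3D sample to `'keypoints_3d'`.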


zgjja changed the title ~~[Bug] multiple bugs related to FilterAnnotations and PoseLocalVisualizer~~ [Bug] multiple bugs related to FilterAnnotations, KeypointConverter and PoseLocalVisualizer Jan 24, 2025
