[Bug] multiple bugs related to FilterAnnotations, KeypointConverter and PoseLocalVisualizer #3182

Open
2 tasks done
zgjja opened this issue Jan 24, 2025 · 0 comments

Comments

@zgjja
Contributor

zgjja commented Jan 24, 2025

Prerequisite

Environment

OrderedDict([('sys.platform', 'linux'), ('Python', '3.10.12 (main, Nov 20 2023, 15:14:05) [GCC 11.4.0]'), ('CUDA available', True), ('MUSA available', False), ('numpy_random_seed', 2147483648), ('GPU 0', 'NVIDIA GeForce RTX 4080'), ('CUDA_HOME', '/usr/local/cuda'), ('NVCC', 'Cuda compilation tools, release 12.3, V12.3.107'), ('GCC', 'x86_64-linux-gnu-gcc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0'), ('PyTorch', '2.2.0a0+81ea7a4'), ('PyTorch compiling details', 'PyTorch built with:\n - GCC 11.2\n - C++ Version: 201703\n - Intel(R) oneAPI Math Kernel Library Version 2021.1-Product Build 20201104 for Intel(R) 64 architecture applications\n - Intel(R) MKL-DNN v3.1.1 (Git Hash N/A)\n - OpenMP 201511 (a.k.a. OpenMP 4.5)\n - LAPACK is enabled (usually provided by MKL)\n - NNPACK is enabled\n - CPU capability usage: AVX2\n - CUDA Runtime 12.3\n - NVCC architecture flags: -gencode;arch=compute_52,code=sm_52;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_72,code=sm_72;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86;-gencode;arch=compute_87,code=sm_87;-gencode;arch=compute_90,code=sm_90;-gencode;arch=compute_90,code=compute_90\n - CuDNN 8.9.7 (built against CUDA 12.2)\n - Magma 2.6.2\n - Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=12.3, CUDNN_VERSION=8.9.7, CXX_COMPILER=/opt/rh/gcc-toolset-11/root/usr/bin/c++, CXX_FLAGS=-fno-gnu-unique -D_GLIBCXX_USE_CXX11_ABI=1 -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -DNDEBUG -DUSE_KINETO -DLIBKINETO_NOROCTRACER -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -O2 -fPIC -Wall -Wextra -Werror=return-type -Werror=non-virtual-dtor -Werror=range-loop-construct -Werror=bool-operation -Wnarrowing -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-unused-parameter -Wno-unused-function 
-Wno-unused-result -Wno-strict-overflow -Wno-strict-aliasing -Wno-stringop-overflow -Wsuggest-override -Wno-psabi -Wno-error=pedantic -Wno-error=old-style-cast -Wno-missing-braces -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_DISABLE_GPU_ASSERTS=ON, TORCH_VERSION=2.2.0, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=ON, USE_NCCL=1, USE_NNPACK=ON, USE_OPENMP=ON, USE_ROCM=OFF, \n'), ('TorchVision', '0.17.0a0'), ('OpenCV', '4.10.0'), ('MMEngine', '0.10.5'), ('MMPose', '1.3.2+5408bc7')])

Reproduces the problem - code sample

see Additional information

Reproduces the problem - command or script

see Additional information

Reproduces the problem - error message

see Additional information

Additional information

Context

Last year, in #2963, I fixed this issue with FilterAnnotations, but the fix was reverted by @xiexinch in #3037, so the bug is back.

Cause

Bug 1: Data type mismatch

FilterAnnotations filters the ground-truth instances in an image sample, and it may modify the following keys:

        keys = ('bbox', 'bbox_score', 'category_id', 'keypoints',
                'keypoints_visible', 'area')

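Under the hood, it uses NumPy's & operator to combine all conditions into a single boolean index array, which is then used to index the values of these keys in the instance sample. Every value under these keys must therefore be an np.ndarray; a value of another type, such as a plain Python list, cannot be indexed with a boolean array.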
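
A minimal sketch of why the value type matters, using made-up data rather than mmpose itself. The names `areas`, `min_area` and `max_area` are illustrative assumptions, not FilterAnnotations internals:

```python
import numpy as np

# Made-up per-instance values for a sample with three instances.
areas = np.array([50.0, 400.0, 900.0])
min_area, max_area = 100.0, 1000.0  # hypothetical filter thresholds

# Combine the conditions with & into one boolean index array.
keep = (areas >= min_area) & (areas <= max_area)

# An np.ndarray value can be indexed with the mask.
category_id = np.array([1, 1, 1])
filtered_ok = category_id[keep]

# A plain list cannot: a boolean array is not a valid list index.
category_id_list = [1, 1, 1]
try:
    category_id_list[keep]
    list_error = None
except TypeError as exc:
    list_error = exc
```

Here `filtered_ok` keeps the two instances that pass both conditions, while indexing the list raises a `TypeError`.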

Bug 2: keypoints_visible may have 2 meanings

When PoseLocalVisualizer is used together with KeypointConverter, the visualizer is expected to draw the converted keypoints. However, KeypointConverter stacks an extra channel onto keypoints_visible to carry keypoints_visible_weights, and this code in PoseLocalVisualizer:

                                or pos2[1] >= img_h or visible[sk[0]] < kpt_thr
                                or visible[sk[1]] < kpt_thr

crashes because visible[sk[0]] is no longer a scalar once KeypointConverter has modified keypoints_visible:

        results['keypoints_visible'] = np.stack(
            [keypoints_visible, keypoints_visible_weights], axis=2)
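
The failure can be reproduced in isolation. This is a sketch under assumptions: `visible` is taken to be the per-instance slice of keypoints_visible, and `kpt_thr`, `sk` and the array sizes are made-up values:

```python
import numpy as np

num_instances, num_keypoints = 1, 17  # illustrative sizes
keypoints_visible = np.ones((num_instances, num_keypoints), dtype=np.float32)
keypoints_visible_weights = np.ones_like(keypoints_visible)

# Same stacking as in the snippet above: the result has shape (N, K, 2).
stacked = np.stack([keypoints_visible, keypoints_visible_weights], axis=2)

kpt_thr = 0.3  # hypothetical threshold
sk = (0, 1)    # hypothetical skeleton link between keypoints 0 and 1
visible = stacked[0]  # per-instance slice, shape (K, 2) instead of (K,)

try:
    # visible[sk[0]] is a length-2 array, so `or` cannot reduce the
    # comparison to a single True/False value.
    skip = visible[sk[0]] < kpt_thr or visible[sk[1]] < kpt_thr
    error = None
except ValueError as exc:
    error = exc
```

With the original 2-dimensional keypoints_visible, `visible[sk[0]]` would be a scalar and the same expression would evaluate without error.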

Bug 3: keypoints_3d logic error

In the new commit, some datasets, such as coco-wholebody, add a new key to data_info, keypoints_3d, whose default value is None. This also triggers a bug in KeypointConverter:

        key = 'keypoints_3d' if 'keypoints_3d' in results else 'keypoints'

In the 2D case, results still contains the 'keypoints_3d' key, with the value [None] (when num_instance=1), so this check selects 'keypoints_3d' instead of 'keypoints'. This needs a fix.

Possible fix

Bug 1

Fix all datasets' invalid keys

I think this is sub-optimal

Make implicit conversion in FilterAnnotations

Easy to fix, but I don't know whether it would cause side effects in downstream code.
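
A rough sketch of what the implicit conversion could look like. `filter_instances` is a hypothetical standalone helper, not the actual FilterAnnotations code, and the sample values are made up:

```python
import numpy as np

KEYS = ('bbox', 'bbox_score', 'category_id', 'keypoints',
        'keypoints_visible', 'area')


def filter_instances(results: dict, keep: np.ndarray) -> dict:
    """Hypothetical helper: apply a boolean mask to the per-instance keys."""
    filtered = dict(results)
    for key in KEYS:
        if key in results and results[key] is not None:
            # np.asarray returns existing arrays as-is and converts
            # lists/tuples, so the mask can index either kind of value.
            filtered[key] = np.asarray(results[key])[keep]
    return filtered


sample = {
    'bbox': np.array([[0, 0, 10, 10], [5, 5, 50, 50]], dtype=np.float32),
    'category_id': [1, 1],  # a list, as a dataset might provide
    'area': np.array([100.0, 2025.0]),
}
out = filter_instances(sample, keep=np.array([False, True]))
```

The possible side effect is visible here: `out['category_id']` is now an np.ndarray rather than a list, which is exactly the kind of change downstream code may or may not tolerate.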

Bug 2

Changing the visualizer itself is easy, but I don't think it's a good idea.

keypoints_visible is ambiguous because it can hold both the visibility flags and keypoints_visible_weights. We should use a separate key, keypoints_visible_weights, for the weights; it looks like the author intended to do this in PackPoseInputs as well. However, I don't know whether any other code relies on this weird logic and would break after the change. Besides, if this new key is added, FilterAnnotations needs to be updated to handle it too.

Changing this logic would make mmpose easier to maintain, because a lot of code currently has to work around it like this:

        keypoints_visible = results['keypoints_visible']
        if keypoints_visible.ndim == 3 and keypoints_visible.shape[2] == 2:
            keypoints_visible, keypoints_visible_weights = \
                keypoints_visible[..., 0], keypoints_visible[..., 1]
            results['keypoints_visible'] = keypoints_visible
            results['keypoints_visible_weights'] = keypoints_visible_weights

whereas I think keypoints_visible.ndim should always be 2.
Since I don't know which algorithms use keypoints_visible_weights, I need your help or advice.

Bug 3

Change the if condition.
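
One possible shape for the changed condition, as a sketch only. `select_keypoints_key` is a hypothetical helper and `results_2d` is made-up data mimicking the 2D case described above:

```python
def select_keypoints_key(results: dict) -> str:
    """Hypothetical helper: pick the key that actually holds keypoints."""
    value = results.get('keypoints_3d')
    # Treat a missing key, None, or a list of Nones as "no 3D keypoints".
    # Elements are tested one by one so that array values are never
    # reduced to a single truth value.
    has_3d = value is not None and any(v is not None for v in value)
    return 'keypoints_3d' if has_3d else 'keypoints'


results_2d = {'keypoints': [[0.0, 0.0]], 'keypoints_3d': [None]}

# The check quoted in the issue picks 'keypoints_3d' here.
current = 'keypoints_3d' if 'keypoints_3d' in results_2d else 'keypoints'
fixed = select_keypoints_key(results_2d)
```

For `results_2d`, the existing membership test selects 'keypoints_3d', while the helper falls back to 'keypoints' because every entry under 'keypoints_3d' is None.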

@zgjja zgjja changed the title [Bug] multiple bugs related to FilterAnnotations and PoseLocalVisualizer [Bug] multiple bugs related to FilterAnnotations, KeypointConverter and PoseLocalVisualizer Jan 24, 2025