This repo implements “Oriented RepPoints + Swin Transformer/ReResNet”.
Based on the Oriented RepPoints detector with a Swin Transformer backbone, this entry achieved 3rd place on Task 1 and 2nd place on Task 2 of the Learning to Understand Aerial Images (LUAI) 2021 challenge held at ICCV 2021. Details are given in the paper "LUAI Challenge 2021 on Learning to Understand Aerial Images", ICCVW 2021.
- Backbone: add Swin Transformer and ReResNet
- DataAug: add Mosaic4or9, Mixup, HSV, RandomPerspective, and RandomScaleCrop (see the config sketch below)
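As a rough illustration of how the Swin backbone plugs in, here is a minimal mmdetection-style config sketch. The field names (`embed_dim`, `depths`, the pipeline entries) are assumptions for illustration, not copied from this repo's actual configs:

```python
# Hypothetical mmdetection-style config sketch (field names are assumptions,
# not this repo's verbatim configs).
model = dict(
    type='OrientedRepPoints',
    backbone=dict(
        type='SwinTransformer',
        embed_dim=96,                # Swin-T channel width
        depths=[2, 2, 6, 2],         # Swin-T block counts per stage
        num_heads=[3, 6, 12, 24],
        out_indices=(0, 1, 2, 3),    # feed all four stages to the neck
    ),
    neck=dict(
        type='FPN',
        in_channels=[96, 192, 384, 768],
        out_channels=256,
        num_outs=5,
    ),
)

# Augmentations would be toggled in the train pipeline, e.g.:
# train_pipeline = [dict(type='Mosaic'), dict(type='MixUp'), ...]
```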
Please refer to the installation instructions for installation and dataset preparation.
This repo is based on the Oriented RepPoints codebase; please refer to it for basic usage.
The results on the DOTA test-dev set are shown in the table below (Baidu extraction passwords: aabb/swin/ABCD). For more detailed results, please see the paper.
Model | Backbone | MS Train/Test | DataAug | DOTAv1 mAP | DOTAv2 mAP | Download |
---|---|---|---|---|---|---|
OrientedReppoints | R-50 | - | - | 75.68 | - | baidu(aabb) |
OrientedReppoints | R-101 | - | √ | 76.21 | - | baidu(aabb) |
OrientedReppoints | R-101 | √ | √ | 78.12 | - | baidu(aabb) |
OrientedReppoints | SwinT-tiny | - | √ | - | 59.93 | baidu(aabb) |
ImageNet-1K and ImageNet-22K Pretrained Models (see the weight-conversion sketch after the table)
name | pretrain | resolution | acc@1 | acc@5 | #params | FLOPs | FPS | 22K model | 1K model | Needs conversion |
---|---|---|---|---|---|---|---|---|---|---|
Swin-T | ImageNet-1K | 224x224 | 81.2 | 95.5 | 28M | 4.5G | 755 | - | github/baidu(swin)/config | ✔ |
Swin-S | ImageNet-1K | 224x224 | 83.2 | 96.2 | 50M | 8.7G | 437 | - | github/baidu(swin)/config | ✔ |
Swin-B | ImageNet-1K | 224x224 | 83.5 | 96.5 | 88M | 15.4G | 278 | - | github/baidu(swin)/config | ✔ |
Swin-B | ImageNet-1K | 384x384 | 84.5 | 97.0 | 88M | 47.1G | 85 | - | github/baidu(swin)/test-config | ✔ |
Swin-B | ImageNet-22K | 224x224 | 85.2 | 97.5 | 88M | 15.4G | 278 | github/baidu(swin) | github/baidu(swin)/test-config | ✔ |
Swin-B | ImageNet-22K | 384x384 | 86.4 | 98.0 | 88M | 47.1G | 85 | github/baidu(swin) | github/baidu(swin)/test-config | ✔ |
Swin-L | ImageNet-22K | 224x224 | 86.3 | 97.9 | 197M | 34.5G | 141 | github/baidu(swin) | github/baidu(swin)/test-config | ✔ |
Swin-L | ImageNet-22K | 384x384 | 87.3 | 98.2 | 197M | 103.9G | 42 | github/baidu(swin) | github/baidu(swin)/test-config | ✔ |
ReResNet50 | ImageNet-1K | 224x224 | 71.20 | 90.28 | - | - | - | - | google/baidu(ABCD)/log | - |
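When a checkpoint needs converting before this repo can read it, the usual first step is unwrapping its state dict and re-prefixing the keys. A minimal sketch, assuming a locally downloaded Swin checkpoint; the filename is hypothetical, and the 'model' wrapper key follows the official Swin classification releases, not this repo:

```python
import torch

# Hypothetical local path; download the checkpoint from the links above first.
ckpt = torch.load('swin_tiny_patch4_window7_224.pth', map_location='cpu')

# Official Swin classification checkpoints wrap the weights under 'model';
# fall back to the raw dict if the key is absent.
state_dict = ckpt.get('model', ckpt)

# Prefix keys so an mmdetection-style detector can match them to its backbone.
converted = {f'backbone.{k}': v for k, v in state_dict.items()}
torch.save({'state_dict': converted}, 'swin_tiny_converted.pth')
```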
The mAOE results on the DOTAv1 val set are shown in the table below (Baidu extraction password: aabb).
Model | Backbone | mAOE | Download |
---|---|---|---|
OrientedReppoints | R-50 | 5.93° | baidu(aabb) |
Note:
- Without ground truth for the test subset, the orientation evaluation (mAOE) is computed on the val subset (the original train subset is used for training).
- The orientation (angle) of an aerial object is defined as below; for details of mAOE, please see the paper. The mAOE evaluation code is mAOE_evaluation.py; a sketch of the metric's core idea follows.
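For intuition only, here is a minimal sketch of a mean absolute orientation error between already-matched predictions and ground truths. The 180° periodicity and the matching step are assumptions; the authoritative implementation is mAOE_evaluation.py:

```python
import numpy as np

def orientation_error(pred_deg, gt_deg, period=180.0):
    """Smallest absolute angular difference, accounting for periodicity.

    period=180 assumes boxes are symmetric under a half-turn; the repo's
    actual angle convention may differ.
    """
    diff = np.abs(np.asarray(pred_deg) - np.asarray(gt_deg)) % period
    return np.minimum(diff, period - diff)

def mean_aoe(pred_deg, gt_deg):
    """Mean absolute orientation error (degrees) over matched pairs."""
    return float(np.mean(orientation_error(pred_deg, gt_deg)))

# Example: raw errors of 5, 10, and 178 degrees wrap to 5, 10, and 2.
print(mean_aoe([5.0, 100.0, 179.0], [0.0, 90.0, 1.0]))  # ~5.67
```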
The visual results of the learned points and the oriented bounding boxes are shown below; the visualization code is provided in this repo (see the drawing sketch after this list).
- Learning points
- Oriented bounding box
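As an illustration of the oriented-box drawing step (not the repo's actual visualization script), a rotated rectangle given as (cx, cy, w, h, angle) can be rendered with OpenCV:

```python
import cv2
import numpy as np

def draw_obb(img, cx, cy, w, h, angle_deg, color=(0, 255, 0)):
    """Draw one oriented bounding box; angle convention follows cv2.boxPoints."""
    # boxPoints turns a rotated rect ((cx, cy), (w, h), angle) into 4 corners.
    pts = cv2.boxPoints(((cx, cy), (w, h), angle_deg)).astype(np.int32)
    cv2.polylines(img, [pts], isClosed=True, color=color, thickness=2)
    return img

# Usage: draw a 100x40 box rotated 30 degrees on a blank canvas.
canvas = np.zeros((256, 256, 3), dtype=np.uint8)
draw_obb(canvas, 128, 128, 100, 40, 30)
```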
Experience sharing for the DOTAv2 remote-sensing oriented object detection competition (Swin Transformer + anchor-free/anchor-based solutions)
@article{li2021oriented,
  title={Oriented RepPoints for Aerial Object Detection},
  author={Li, Wentong and Chen, Yijie and Hu, Kaixuan and Zhu, Jianke},
  journal={arXiv preprint arXiv:2105.11111},
  year={2021}
}
This repo uses utility functions from other wonderful open-source projects. Special thanks to the authors of: