first commit
ZEROICEWANG committed Jun 19, 2024
0 parents commit 5339319
Showing 39 changed files with 5,649 additions and 0 deletions.
1 change: 1 addition & 0 deletions .gitignore
@@ -0,0 +1 @@
./models
36 changes: 36 additions & 0 deletions README.en.md
@@ -0,0 +1,36 @@
# MEPNet

#### Description
{**When you're done, you can delete the content in this README and update the file with details for others getting started with your repository**}

#### Software Architecture
Software architecture description

#### Installation

1. xxxx
2. xxxx
3. xxxx

#### Instructions

1. xxxx
2. xxxx
3. xxxx

#### Contribution

1. Fork the repository
2. Create Feat_xxx branch
3. Commit your code
4. Create Pull Request


#### Gitee Feature

1. You can use Readme\_XXX.md to support different languages, such as Readme\_en.md, Readme\_zh.md
2. Gitee blog [blog.gitee.com](https://blog.gitee.com)
3. Explore open source projects at [https://gitee.com/explore](https://gitee.com/explore)
4. The most valuable open source projects: [GVP](https://gitee.com/gvp)
5. The Gitee manual: [https://gitee.com/help](https://gitee.com/help)
6. The most popular Gitee members: [https://gitee.com/gitee-stars/](https://gitee.com/gitee-stars/)
64 changes: 64 additions & 0 deletions README.md
@@ -0,0 +1,64 @@
# MEPNet: Mask Prior-distribution Interaction and Edge Probability Estimation for Salient Object Detection

by XX

## Introduction
Salient Object Detection (SOD) has been researched extensively and has achieved impressive performance. However, current methods still produce defective results in complex scenes, because a deceptive background prevents them from distinguishing the salient subject and producing discriminative edges. To address this issue, we propose MEPNet, which performs two tasks: interacting the mask prior-distribution with multi-scale features to suppress background disturbance, and estimating the edge probability of salient objects to enhance the detail of the decoding results. MEPNet is mainly composed of four kinds of modules: the Lite Receptive Field Block (RFB-Lite) module, the Prior Query (PQ) module, the Full-scale sub-Decoder (FD) module, and the Edge Auxiliary (EA) module. RFB-Lite adopts multi-scale convolution with grouped branches to efficiently reduce channel redundancy and enhance the diversity of semantics. PQ introduces the mask prior-distribution into the fused feature through multi-head cross-attention. To resolve the conflict between the number of attention heads and the computational complexity of cross-attention, Multi-head L-Cross Attention (MLAC) is proposed to self-weight the feature while calculating the attention score matrix and global attention. FD realizes bidirectional decoding with a feature pyramid network (FPN) and a reversed feature pyramid network (FPN-R); the three FDs adopted in MEPNet provide multi-basis decoding that fully exploits the multi-scale features. Alternating PQ and FD ensures the suppression of background disturbance. EA, as the last part of MEPNet, refines the decoding results with an edge filter that estimates the edge probability and an edge-refine step that corrects the up-sampled decoding results. Experimental results on the DUTS-TE, HKU-IS, PASCAL-S, ECSSD, and DUT-OMRON datasets demonstrate that MEPNet is more robust in various complex scenes than state-of-the-art (SOTA) methods.
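
The core of the PQ module is a cross-attention between the mask prior-distribution and the fused feature. The snippet below is only a minimal, illustrative sketch of that idea using PyTorch's built-in `nn.MultiheadAttention`; it is not the MLAC implementation in this repository, and the class name, token shapes, and residual fusion are assumptions made for illustration.

```python
import torch
import torch.nn as nn

class PriorCrossAttention(nn.Module):
    """Minimal sketch of prior/feature cross-attention (not the actual MLAC code)."""
    def __init__(self, dim=196, num_heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, feat, prior):
        # feat:  (B, N, C) flattened multi-scale feature tokens (queries)
        # prior: (B, M, C) tokens embedded from the mask prior-distribution (keys/values)
        fused, _ = self.attn(query=feat, key=prior, value=prior)
        return self.norm(feat + fused)  # residual fusion of prior information

# toy usage with hypothetical shapes
x = torch.randn(2, 16 * 16, 196)   # feature tokens
p = torch.randn(2, 16 * 16, 196)   # prior tokens
print(PriorCrossAttention()(x, p).shape)  # torch.Size([2, 256, 196])
```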


## Prerequisites
- [Python 3.6](https://www.python.org/)
- [Pytorch 1.10](http://pytorch.org/)
- [OpenCV 4.5.5.64](https://opencv.org/)
- [Numpy 1.19.5](https://numpy.org/)
- [pillow 8.4.0](https://pypi.org/project/Pillow/)
- [timm 0.4.12](https://pypi.org/project/timm/0.4.12/)
- [tqdm 4.64.0](https://pypi.org/project/tqdm/4.64.0/)


## Clone repository

```shell
git clone https://github.com/ZEROICEWANG/MEPNet.git
cd MEPNet/
```

## Download dataset

Download the following datasets and unzip them into the `../SOD_Data` folder:

- [PASCAL-S](http://cbi.gatech.edu/salobj/)
- [ECSSD](http://www.cse.cuhk.edu.hk/leojia/projects/hsaliency/dataset.html)
- [HKU-IS](https://i.cs.hku.hk/~gbli/deep_saliency.html)
- [DUT-OMRON](http://saliencydetection.net/dut-omron/)
- [DUTS](http://saliencydetection.net/duts/)


## Download model

- If you want to test the performance of MEPNet, please download the trained model ([Baidu](https://pan.baidu.com/s/1S_uwKUEUIoRMw-p9Ek28zA?pwd=6jyz) / [Google](https://drive.google.com/file/d/1-2gtqk9M3Ex9Ou_YtOSFsQmvPWkcOySg/view?usp=sharing)) into the `models/RES_Model` folder, and download the pretrained model ([Baidu](https://pan.baidu.com/s/1Lh-MrKSLU1rG6DL45PiqtQ?pwd=pfvi) / [Google](https://drive.google.com/file/d/1Y8d2cSZh71oKd4qQ56TTK_sYvhTIHe8G/view?usp=sharing)) into the `models/pretrained/prior_query2_channel_last_L_attention_rm_SA/pretrain` folder.


## Training

```shell
python3 train.py  # or run: bash cmd_train.sh
```


## Testing

```shell
python3 predict.py  # or run: bash cmd.sh
```
- After testing, the saliency maps for `PASCAL-S`, `ECSSD`, `HKU-IS`, `DUT-OMRON`, and `DUTS-TE` will be saved in the `predict_result/` folder.
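
If you want a quick sanity check on the saved maps, the sketch below computes the standard SOD mean absolute error (MAE) between predicted saliency maps and ground-truth masks. It is illustrative only: the folder layout and file names are assumptions, and this repository may compute its metrics differently.

```python
import os
import cv2
import numpy as np

def mae(pred_dir, gt_dir):
    """Mean absolute error between saliency maps and ground-truth masks (values in [0, 1])."""
    scores = []
    for name in sorted(os.listdir(pred_dir)):
        pred = cv2.imread(os.path.join(pred_dir, name), cv2.IMREAD_GRAYSCALE) / 255.0
        gt = cv2.imread(os.path.join(gt_dir, name), cv2.IMREAD_GRAYSCALE) / 255.0
        if pred.shape != gt.shape:  # resize the prediction to the mask size if needed
            pred = cv2.resize(pred, (gt.shape[1], gt.shape[0]))
        scores.append(np.abs(pred - gt).mean())
    return float(np.mean(scores))

# hypothetical paths; adjust to wherever predict.py writes its results
print(mae('predict_result/DUTS-TE', '../SOD_Data/DUTS-TE/DUTS-TE-Mask'))
```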

## Saliency maps & Pre-Trained model & Trained model
- saliency maps: [Baidu](https://pan.baidu.com/s/1EEqGAK5KU-Frpsvx9GKFDw?pwd=by1m) [Google](https://drive.google.com/file/d/16WoEBpne1mpsa_NEQ1ffFG9v7ZnppqxR/view?usp=sharing)

- pretrained model: [Baidu](https://pan.baidu.com/s/1Lh-MrKSLU1rG6DL45PiqtQ?pwd=pfvi) [Google](https://drive.google.com/file/d/1Y8d2cSZh71oKd4qQ56TTK_sYvhTIHe8G/view?usp=sharing)

- trained model: [Baidu](https://pan.baidu.com/s/1S_uwKUEUIoRMw-p9Ek28zA?pwd=6jyz) [Google](https://drive.google.com/file/d/1-2gtqk9M3Ex9Ou_YtOSFsQmvPWkcOySg/view?usp=sharing)



5 changes: 5 additions & 0 deletions cmd.sh
@@ -0,0 +1,5 @@

# sleep.py is presumably a start-delay helper; then run inference with the most recently saved checkpoint
python sleep.py --base 0 --subbase 7.8 --iter 0
dir=($(ls -A ./models/RES_Model/))

# ${dir[-1]} expands to the last entry under ./models/RES_Model/
python predict.py --name ${dir[-1]} --model CPD_RES_PA --config-file 'config/standard.yaml'
6 changes: 6 additions & 0 deletions cmd_train.sh
@@ -0,0 +1,6 @@
# echo 1 > /proc/sys/vm/drop_caches
# echo 2 > /proc/sys/vm/drop_caches
# echo 3 > /proc/sys/vm/drop_caches

# presumably a start-delay helper, then launch distributed training on 2 GPUs
python sleep.py --base 0 --subbase 0 --iter 0
python -m torch.distributed.launch --nproc_per_node 2 train.py --config-file ./config/standard.yaml --gpus '0,1'
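
For context, a script started with `torch.distributed.launch` receives a `--local_rank` argument (plus the usual rendezvous environment variables) from the launcher. The sketch below shows that generic contract only; it is not the actual contents of `train.py`.

```python
import argparse
import torch
import torch.distributed as dist

parser = argparse.ArgumentParser()
parser.add_argument('--local_rank', type=int, default=-1)  # injected by torch.distributed.launch
args, _ = parser.parse_known_args()

dist.init_process_group(backend='nccl')  # rank/world size come from the launcher's env vars
torch.cuda.set_device(args.local_rank)   # bind this process to its assigned GPU

model = torch.nn.Linear(8, 1).cuda(args.local_rank)  # placeholder model for illustration
model = torch.nn.parallel.DistributedDataParallel(model, device_ids=[args.local_rank])
```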
1 change: 1 addition & 0 deletions config/__init__.py
@@ -0,0 +1 @@
from .params import _C as cfg
122 changes: 122 additions & 0 deletions config/params.py
@@ -0,0 +1,122 @@
import os
from yacs.config import CfgNode as CN

_C = CN()
_C.gpus='0,1'
_C.local_rank=-1
_C.config_file=''
_C.global_rank=-1
_C.world_size=1
_C.sync_bn=True
_C.seed=1000
_C.print_rate=100
_C.empty_cache=False
_C.BalancedData=False
_C.model=CN()
_C.model.mid_channel=256
_C.model.expand=[1.0,1.0,1.0]
_C.model.using_split=False
_C.model.neck=CN()
_C.model.neck.type='RFB_Conv'
_C.model.neck.using_PA=False
_C.model.neck.using_CA=False
_C.model.Edge_Ass=CN()
_C.model.Edge_Ass.type='EdgePro_Auxiliary'
_C.model.Edge_Ass.using_probability=True
_C.model.Edge_Ass.using_canny=False
_C.model.Edge_Ass.using=False
_C.model.Edge_Ass.using_PV=False
_C.model.Edge_Ass.using_edge_SC=False
_C.model.Edge_Ass.inchannel=4 if _C.model.Edge_Ass.using_PV else 3  # evaluated once at import time; a later YAML override of using_PV does not update this value
_C.model.Edge_Ass.PV_reduction='max'
_C.model.Edge_Ass.PV_patch_size=3
_C.model.Edge_Ass.PV_normal=False
_C.model.Edge_Ass.mid_channel=4
_C.model.Edge_Ass.gain=2.0


_C.model.PQ=CN()
_C.model.PQ.type='patch_query2_swin_channel_last.prior_query'
_C.model.PQ.using=False
_C.model.PQ.patch_size=2
_C.model.PQ.map_size=16
_C.model.PQ.num_head=4
_C.model.PQ.using_pretrain=False
_C.model.PQ.position=[True,False,False]
_C.model.PQ.pretrain_path='./models/pretrained/prior_query2_channel_last_L_attention_rm_SA/pretrain/model_199.pth'
_C.model.PQ.using_LA=True
_C.model.PQ.using_prior=True

_C.model.SR=CN() #salient object reconstruction
_C.model.SR.using=False
_C.model.header='my_aggregation_lite_CSP'

_C.model.EQ=CN()
_C.model.EQ.using=False
_C.model.EQ.map_size=88
_C.model.EQ.win_size=4
_C.model.EQ.num_heads=4

_C.model.AF=CN()
_C.model.AF.using=False
_C.model.AF.position=[True,False,False]


_C.dataloader=CN()
_C.dataloader.batch_size=16
_C.dataloader.num_work=8
_C.dataloader.train_size=352
_C.dataloader.test_size=352
_C.dataloader.using_random_size=True
_C.dataloader.train_path=['../SOD_Data/DUTS-TR/DUTS-TR-Image/', '../SOD_Data/DUTS-TR/DUTS-TR-Mask/']
_C.dataloader.test_path=['../SOD_Data/DUTS-TE/DUTS-TE-Image/', '../SOD_Data/DUTS-TE/DUTS-TE-Mask/']


_C.solver=CN()
_C.solver.type='AdamW'
_C.solver.epoch=70
_C.solver.momen=0.9
_C.solver.weight_decay=1e-5
_C.solver.lr=1e-4
_C.solver.base_batchsize=12
_C.solver.min_lr=_C.solver.lr*0.001
_C.solver.warmup_lr=_C.solver.lr*0.0001
_C.solver.init_epoch=10
_C.solver.warmup_epoch = 3
_C.solver.t_mul=2
_C.solver.using_clip=True
_C.solver.clip=0.5
_C.solver.lr_rate=[0.1,1,1]
_C.solver.decay_rate=0.5
_C.solver.lr_step=[10,30,70]


_C.loss=CN()
_C.loss.combination=['DICE','SSIM', 'WBCE']
_C.loss.combination_p=[[], [], []]
_C.loss.loss_weight=[1, 1, 1, 1, 1, 1, 1, 1, 1, 1]
_C.loss.loss_scale=CN()
_C.loss.loss_scale.number=4
_C.loss.loss_scale.combination=[[1],[1,2],[1,2,3],[1,2,3]]
_C.loss.using_dice=True
_C.loss.edge=CN()
_C.loss.edge.using=_C.model.Edge_Ass.using  # copied at definition time; not re-derived after a YAML merge
_C.loss.edge.gamma=1.5
_C.loss.edge.gains=[1,8,64]
_C.loss.edge.weights=[1,1/4,1/16]
_C.loss.edge.stage=[10,20,40]
_C.loss.edge.max_rate=0.9
_C.loss.edge.base_size=288
_C.loss.edge.loss_type='mse'
_C.loss.using_Deepest_loss=False
_C.loss.using_normal=True
_C.loss.using_filter_interpolate=False
_C.loss.mask_binary=False
_C.loss.AFloss=CN()
_C.loss.AFloss.weight=0.1
_C.loss.AFloss.Four=False
_C.loss.using_Rweight=True
_C.loss.Rweight=0.5
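
The `_C` tree above only declares defaults. Given the `--config-file config/standard.yaml` flag used in `cmd.sh` and `cmd_train.sh`, the run-time configuration is presumably built with the usual yacs pattern sketched below; the exact call site in `train.py`/`predict.py` may differ.

```python
from config import cfg  # the _C defaults defined above (see config/__init__.py)

run_cfg = cfg.clone()                               # keep the module-level defaults untouched
run_cfg.merge_from_file('config/standard.yaml')     # YAML values override matching default keys
run_cfg.merge_from_list(['gpus', '0,1'])            # optional key/value overrides, e.g. from the CLI
run_cfg.freeze()                                    # make the config read-only for the run

print(run_cfg.model.neck.type)  # 'RFB_Lite' after the merge (the default above is 'RFB_Conv')
```

Note that derived defaults such as `model.Edge_Ass.inchannel` and `loss.edge.using` are computed only when `params.py` is imported, which is presumably why `standard.yaml` re-states them explicitly.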



57 changes: 57 additions & 0 deletions config/standard.yaml
@@ -0,0 +1,57 @@
seed: 5555
model:
  mid_channel: 196
  expand: [2.25, 1.75, 1.25]
  neck:
    type: RFB_Lite
  Edge_Ass:
    using: True
    mid_channel: 4
    using_PV: False
    inchannel: 3
    using_edge_SC: True
    PV_reduction: None
    PV_normal: True
    gain: 1.0
  header: my_aggregation_lite_CSP
  PQ:
    using: True
    patch_size: 4
    map_size: 16
    num_head: 8
    type: prior_query2_channel_last_L_attention_rm_SA.prior_query
    using_pretrain: True
    pretrain_path: models/pretrained/prior_query2_channel_last_L_attention_rm_SA/pretrain/model_199.pth
    position: [True,True,True]


loss:
  combination: ['DICE','SSIM', 'FocalLoss','MSELoss','WBCE','Contrast_Loss','Contrast_Loss','Contrast_Loss','Contrast_Loss']
  using_Deepest_loss: True
  combination_p: [[],[],[2],[],[],[0.95],[0.8],[0.65],[0.5]]
  loss_weight: [0.0625, 0.125,0.125, 0.25,0.25,0.25, 0.5,0.5,0.5, 0.5,0] #sum(combination)+edge_loss
  loss_scale:
    number: 4 #11,22,44
    combination: [[1],[1,2],[1,2,3],[1,2,3]]
  edge:
    using: True
    gamma: 1.5
    gains: [1,48,64]
    weights: [1, 0.25, 0.0625] #1,1/4,1/16
    stage: [10,20,40]
    max_rate: 0.9
    base_size: 288
    loss_type: bce
  using_normal: True


dataloader:
  batch_size: 12

solver:
  init_epoch: 10
  epoch: 70
  lr_rate: [0.1,1.5,1]
  decay_rate: 1.0
  lr_step: [10,30,70]