0 parents
commit 5339319
Showing 39 changed files with 5,649 additions and 0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
./models |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,36 @@ | ||
# MEPNet | ||
|
||
#### Description | ||
{**When you're done, you can delete the content in this README and update the file with details for others getting started with your repository**} | ||
|
||
#### Software Architecture | ||
Software architecture description | ||
|
||
#### Installation | ||
|
||
1. xxxx | ||
2. xxxx | ||
3. xxxx | ||
|
||
#### Instructions | ||
|
||
1. xxxx | ||
2. xxxx | ||
3. xxxx | ||
|
||
#### Contribution | ||
|
||
1. Fork the repository | ||
2. Create Feat_xxx branch | ||
3. Commit your code | ||
4. Create Pull Request | ||
|
||
|
||
#### Gitee Feature | ||
|
||
1. You can use Readme\_XXX.md to support different languages, such as Readme\_en.md, Readme\_zh.md | ||
2. Gitee blog [blog.gitee.com](https://blog.gitee.com) | ||
3. Explore open source project [https://gitee.com/explore](https://gitee.com/explore) | ||
4. The most valuable open source project [GVP](https://gitee.com/gvp) | ||
5. The manual of Gitee [https://gitee.com/help](https://gitee.com/help) | ||
6. The most popular members [https://gitee.com/gitee-stars/](https://gitee.com/gitee-stars/) |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,64 @@ | ||
# [MEPNet: Mask Prior-distribution Interaction and Edge Probability Estimation for Salient Object Detection] | ||
|
||
by XX | ||
|
||
## Introduction | ||
Salient Object Detection (SOD) has been researched extensively and has achieved impressive performance. However, current methods still produce defective results in complex scenes, because a deceptive background hinders them from distinguishing the salient subject and producing discriminative edges. To address this issue, we propose a method named MEPNet that performs two tasks: interacting the mask prior-distribution with multi-scale features to suppress background disturbance, and estimating the edge probability of salient objects to improve the detail of the decoding results. MEPNet is mainly composed of four kinds of modules: the Lite Receptive Field Block (RFB-Lite) module, the Prior Query (PQ) module, the Full-scale sub-Decoder (FD) module, and the Edge Auxiliary (EA) module. RFB-Lite adopts multi-scale convolution with grouped branches to efficiently reduce channel redundancy and enhance the diversity of semantics. PQ introduces the mask prior-distribution into the fused feature through multi-head cross-attention. To resolve the conflict between the number of attention heads and the computational complexity of cross-attention, Multi-head L-Cross Attention (MLAC) is proposed, which self-weights the feature while calculating the attention score matrix and global attention. FD realizes bi-directional decoding with a feature pyramid network (FPN) and a reversed feature pyramid network (FPN-R). The three FDs adopted in MEPNet provide multi-basis decoding to fully utilize multi-scale features. The alternating use of PQ and FD ensures the suppression of background disturbance. EA, the last part of MEPNet, refines the decoding results with an edge filter that estimates the edge probability and an edge refine step that corrects the up-sampled decoding results. 
The SOD experimental results on the DUTS-TE, HKU-IS, PASCAL-S, ECSSD, and DUT-OMRON datasets demonstrate that the proposed MEPNet is more robust in complex scenes than some state-of-the-art (SOTA) methods. | ||
|
||
|
||
## Prerequisites | ||
- [Python 3.6](https://www.python.org/) | ||
- [PyTorch 1.10](http://pytorch.org/) | ||
- [OpenCV 4.5.5.64](https://opencv.org/) | ||
- [Numpy 1.19.5](https://numpy.org/) | ||
- [Pillow 8.4.0](https://pypi.org/project/Pillow/) | ||
- [timm 0.4.12](https://pypi.org/project/timm/0.4.12/) | ||
- [tqdm 4.64.0](https://pypi.org/project/tqdm/4.64.0/) | ||
|
||
|
||
## Clone repository | ||
|
||
```shell | ||
git clone https://github.com/ZEROICEWANG/MEPNet.git | ||
cd MEPNet/ | ||
``` | ||
|
||
## Download dataset | ||
|
||
Download the following datasets and unzip them into the `../SOD_Data` folder: | ||
|
||
- [PASCAL-S](http://cbi.gatech.edu/salobj/) | ||
- [ECSSD](http://www.cse.cuhk.edu.hk/leojia/projects/hsaliency/dataset.html) | ||
- [HKU-IS](https://i.cs.hku.hk/~gbli/deep_saliency.html) | ||
- [DUT-OMRON](http://saliencydetection.net/dut-omron/) | ||
- [DUTS](http://saliencydetection.net/duts/) | ||
|
||
|
||
## Download model | ||
|
||
- To test the performance of MEPNet, download the trained model ([Baidu](https://pan.baidu.com/s/1S_uwKUEUIoRMw-p9Ek28zA?pwd=6jyz) [Google](https://drive.google.com/file/d/1-2gtqk9M3Ex9Ou_YtOSFsQmvPWkcOySg/view?usp=sharing)) into the `models/RES_Model` folder, and download the pretrained model ([Baidu](https://pan.baidu.com/s/1Lh-MrKSLU1rG6DL45PiqtQ?pwd=pfvi) [Google](https://drive.google.com/file/d/1Y8d2cSZh71oKd4qQ56TTK_sYvhTIHe8G/view?usp=sharing)) into the `models/pretrained/prior_query2_channel_last_L_attention_rm_SA/pretrain` folder. | ||
|
||
|
||
## Training | ||
|
||
```shell | ||
python3 train.py # or using bash cmd_train.sh | ||
``` | ||
|
||
|
||
## Testing | ||
|
||
```shell | ||
python3 predict.py # or using bash cmd.sh | ||
``` | ||
- After testing, saliency maps for `PASCAL-S`, `ECSSD`, `HKU-IS`, `DUT-OMRON`, and `DUTS-TE` will be saved in the `predict_result/` folder. | ||
|
||
## Saliency maps & Pre-Trained model & Trained model | ||
- saliency maps: [Baidu](https://pan.baidu.com/s/1EEqGAK5KU-Frpsvx9GKFDw?pwd=by1m) [Google](https://drive.google.com/file/d/16WoEBpne1mpsa_NEQ1ffFG9v7ZnppqxR/view?usp=sharing) | ||
|
||
- pretrained model: [Baidu](https://pan.baidu.com/s/1Lh-MrKSLU1rG6DL45PiqtQ?pwd=pfvi) [Google](https://drive.google.com/file/d/1Y8d2cSZh71oKd4qQ56TTK_sYvhTIHe8G/view?usp=sharing) | ||
|
||
- trained model: [Baidu](https://pan.baidu.com/s/1S_uwKUEUIoRMw-p9Ek28zA?pwd=6jyz) [Google](https://drive.google.com/file/d/1-2gtqk9M3Ex9Ou_YtOSFsQmvPWkcOySg/view?usp=sharing) | ||
|
||
|
||
|
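The README asks for the datasets to be unzipped into `../SOD_Data`, and the default paths in `config/params.py` point at four DUTS sub-folders there. As a rough pre-flight check, the sketch below reports which of those folders are missing. The helper name `missing_dataset_dirs` is hypothetical and not part of the repository; folder names for the other datasets are not stated in the commit, so they are not checked.

```python
from pathlib import Path

# DUTS locations copied from the train_path / test_path defaults in config/params.py.
EXPECTED_DIRS = [
    "DUTS-TR/DUTS-TR-Image",
    "DUTS-TR/DUTS-TR-Mask",
    "DUTS-TE/DUTS-TE-Image",
    "DUTS-TE/DUTS-TE-Mask",
]


def missing_dataset_dirs(root="../SOD_Data", expected=EXPECTED_DIRS):
    """Return the expected sub-directories that do not exist under root.

    Hypothetical helper for illustration; it only checks that folders exist,
    not that they contain images.
    """
    root = Path(root)
    return [rel for rel in expected if not (root / rel).is_dir()]


if __name__ == "__main__":
    missing = missing_dataset_dirs()
    if missing:
        print("Missing dataset folders:", ", ".join(missing))
    else:
        print("DUTS folders found.")
```

Running it from the repository root before `train.py` or `predict.py` gives an early hint when the archive was unpacked one level too deep or too shallow.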
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,5 @@ | ||
|
||
python sleep.py --base 0 --subbase 7.8 --iter 0 | ||
dir=($(ls -A ./models/RES_Model/)) | ||
|
||
python predict.py --name ${dir[-1]} --model CPD_RES_PA --config-file 'config/standard.yaml' |
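The script above lists `./models/RES_Model/` into a Bash array and passes its last element to `predict.py` as `--name`. A rough Python equivalent of that selection step is sketched below. The function name `latest_run_name` is hypothetical, and whether the last entry is really the most recent run depends on how run folders are named, which this commit does not show.

```python
import os


def latest_run_name(model_dir="./models/RES_Model/"):
    """Return the last entry of model_dir in sorted order.

    Mirrors `${dir[-1]}` in cmd.sh. Like `ls -A`, os.listdir includes
    dotfiles. Sorting here is by code point, whereas `ls` follows the
    locale, so the two can disagree on mixed-case names.
    """
    entries = sorted(os.listdir(model_dir))
    if not entries:
        raise FileNotFoundError(f"no runs found in {model_dir}")
    return entries[-1]
```

Unlike the unquoted shell array, this version also copes with folder names that contain spaces.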
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,6 @@ | ||
# echo 1 > /proc/sys/vm/drop_caches | ||
# echo 2 > /proc/sys/vm/drop_caches | ||
# echo 3 > /proc/sys/vm/drop_caches | ||
|
||
python sleep.py --base 0 --subbase 0 --iter 0 | ||
python -m torch.distributed.launch --nproc_per_node 2 train.py --config-file ./config/standard.yaml --gpus '0,1' |
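With `--nproc_per_node 2`, `torch.distributed.launch` starts two copies of `train.py` and, by default, adds a `--local_rank=<n>` argument to each. The sketch below shows one way a training script could accept the arguments used in this command. It is an assumption about the shape of `train.py`, whose contents are not shown here; the defaults are taken from `config/params.py`.

```python
import argparse


def parse_args(argv=None):
    """Parse the flags seen in cmd_train.sh (hypothetical stand-in for train.py)."""
    parser = argparse.ArgumentParser(description="illustrative train.py arguments")
    # Injected by torch.distributed.launch, one value per spawned process.
    parser.add_argument("--local_rank", type=int, default=-1)
    # Path to a YAML file that overrides the defaults in config/params.py.
    parser.add_argument("--config-file", dest="config_file", default="")
    # Comma-separated GPU ids, e.g. '0,1'.
    parser.add_argument("--gpus", default="0,1")
    return parser.parse_args(argv)


if __name__ == "__main__":
    args = parse_args(["--local_rank=1", "--config-file", "./config/standard.yaml"])
    print(args.local_rank, args.config_file, args.gpus.split(","))
```

A `local_rank` of `-1` then signals a plain single-process run, matching the default in the config.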
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
from .params import _C as cfg |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,122 @@ | ||
import os | ||
from yacs.config import CfgNode as CN | ||
|
||
_C = CN() | ||
_C.gpus='0,1' | ||
_C.local_rank=-1 | ||
_C.config_file='' | ||
_C.global_rank=-1 | ||
_C.world_size=1 | ||
_C.sync_bn=True | ||
_C.seed=1000 | ||
_C.print_rate=100 | ||
_C.empty_cache=False | ||
_C.BalancedData=False | ||
_C.model=CN() | ||
_C.model.mid_channel=256 | ||
_C.model.expand=[1.0,1.0,1.0] | ||
_C.model.using_split=False | ||
_C.model.neck=CN() | ||
_C.model.neck.type='RFB_Conv' | ||
_C.model.neck.using_PA=False | ||
_C.model.neck.using_CA=False | ||
_C.model.Edge_Ass=CN() | ||
_C.model.Edge_Ass.type='EdgePro_Auxiliary' | ||
_C.model.Edge_Ass.using_probability=True | ||
_C.model.Edge_Ass.using_canny=False | ||
_C.model.Edge_Ass.using=False | ||
_C.model.Edge_Ass.using_PV=False | ||
_C.model.Edge_Ass.using_edge_SC=False | ||
_C.model.Edge_Ass.inchannel=4 if _C.model.Edge_Ass.using_PV else 3 | ||
_C.model.Edge_Ass.PV_reduction='max' | ||
_C.model.Edge_Ass.PV_patch_size=3 | ||
_C.model.Edge_Ass.PV_normal=False | ||
_C.model.Edge_Ass.mid_channel=4 | ||
_C.model.Edge_Ass.gain=2.0 | ||
|
||
|
||
_C.model.PQ=CN() | ||
_C.model.PQ.type='patch_query2_swin_channel_last.prior_query' | ||
_C.model.PQ.using=False | ||
_C.model.PQ.patch_size=2 | ||
_C.model.PQ.map_size=16 | ||
_C.model.PQ.num_head=4 | ||
_C.model.PQ.using_pretrain=False | ||
_C.model.PQ.position=[True,False,False] | ||
_C.model.PQ.pretrain_path='./models/pretrained/prior_query2_channel_last_L_attention_rm_SA/pretrain/model_199.pth' | ||
_C.model.PQ.using_LA=True | ||
_C.model.PQ.using_prior=True | ||
|
||
_C.model.SR=CN() #salient object reconstruction | ||
_C.model.SR.using=False | ||
_C.model.header='my_aggregation_lite_CSP' | ||
|
||
_C.model.EQ=CN() | ||
_C.model.EQ.using=False | ||
_C.model.EQ.map_size=88 | ||
_C.model.EQ.win_size=4 | ||
_C.model.EQ.num_heads=4 | ||
|
||
_C.model.AF=CN() | ||
_C.model.AF.using=False | ||
_C.model.AF.position=[True,False,False] | ||
|
||
|
||
_C.dataloader=CN() | ||
_C.dataloader.batch_size=16 | ||
_C.dataloader.num_work=8 | ||
_C.dataloader.train_size=352 | ||
_C.dataloader.test_size=352 | ||
_C.dataloader.using_random_size=True | ||
_C.dataloader.train_path=['../SOD_Data/DUTS-TR/DUTS-TR-Image/', '../SOD_Data/DUTS-TR/DUTS-TR-Mask/'] | ||
_C.dataloader.test_path=['../SOD_Data/DUTS-TE/DUTS-TE-Image/', '../SOD_Data/DUTS-TE/DUTS-TE-Mask/'] | ||
|
||
|
||
_C.solver=CN() | ||
_C.solver.type='AdamW' | ||
_C.solver.epoch=70 | ||
_C.solver.momen=0.9 | ||
_C.solver.weight_decay=1e-5 | ||
_C.solver.lr=1e-4 | ||
_C.solver.base_batchsize=12 | ||
_C.solver.min_lr=_C.solver.lr*0.001 | ||
_C.solver.warmup_lr=_C.solver.lr*0.0001 | ||
_C.solver.init_epoch=10 | ||
_C.solver.warmup_epoch = 3 | ||
_C.solver.t_mul=2 | ||
_C.solver.using_clip=True | ||
_C.solver.clip=0.5 | ||
_C.solver.lr_rate=[0.1,1,1] | ||
_C.solver.decay_rate=0.5 | ||
_C.solver.lr_step=[10,30,70] | ||
|
||
|
||
_C.loss=CN() | ||
_C.loss.combination=['DICE','SSIM', 'WBCE'] | ||
_C.loss.combination_p=[[], [], []] | ||
_C.loss.loss_weight=[1, 1, 1, 1, 1, 1, 1, 1, 1, 1] | ||
_C.loss.loss_scale=CN() | ||
_C.loss.loss_scale.number=4 | ||
_C.loss.loss_scale.combination=[[1],[1,2],[1,2,3],[1,2,3]] | ||
_C.loss.using_dice=True | ||
_C.loss.edge=CN() | ||
_C.loss.edge.using=_C.model.Edge_Ass.using | ||
_C.loss.edge.gamma=1.5 | ||
_C.loss.edge.gains=[1,8,64] | ||
_C.loss.edge.weights=[1,1/4,1/16] | ||
_C.loss.edge.stage=[10,20,40] | ||
_C.loss.edge.max_rate=0.9 | ||
_C.loss.edge.base_size=288 | ||
_C.loss.edge.loss_type='mse' | ||
_C.loss.using_Deepest_loss=False | ||
_C.loss.using_normal=True | ||
_C.loss.using_filter_interpolate=False | ||
_C.loss.mask_binary=False | ||
_C.loss.AFloss=CN() | ||
_C.loss.AFloss.weight=0.1 | ||
_C.loss.AFloss.Four=False | ||
_C.loss.using_Rweight=True | ||
_C.loss.Rweight=0.5 | ||
|
||
|
||
|
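Several defaults above are computed from other keys when the module is imported: `Edge_Ass.inchannel` from `using_PV`, `solver.min_lr` and `solver.warmup_lr` from `solver.lr`, and `loss.edge.using` from `model.Edge_Ass.using`. A later YAML override of the source key therefore does not update the derived one, which may be why `config/standard.yaml` sets `inchannel` and `loss.edge.using` explicitly. The sketch below reproduces the effect with plain dictionaries instead of yacs; `merge` and `refresh_derived` are hypothetical helpers, not functions from the repository.

```python
import copy

# A small slice of the defaults, with derived values frozen as in params.py.
DEFAULTS = {
    "model": {"Edge_Ass": {"using": False, "using_PV": False, "inchannel": 3}},
    "solver": {"lr": 1e-4, "min_lr": 1e-4 * 0.001, "warmup_lr": 1e-4 * 0.0001},
    "loss": {"edge": {"using": False}},
}


def merge(base, override):
    """Recursively overlay `override` onto a deep copy of `base`."""
    out = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(out.get(key), dict):
            out[key] = merge(out[key], value)
        else:
            out[key] = value
    return out


def refresh_derived(cfg):
    """Recompute the derived keys in place (hypothetical helper).

    Note that this also overwrites any value the override file set
    explicitly for a derived key.
    """
    edge = cfg["model"]["Edge_Ass"]
    edge["inchannel"] = 4 if edge["using_PV"] else 3
    cfg["solver"]["min_lr"] = cfg["solver"]["lr"] * 0.001
    cfg["solver"]["warmup_lr"] = cfg["solver"]["lr"] * 0.0001
    cfg["loss"]["edge"]["using"] = edge["using"]
    return cfg


if __name__ == "__main__":
    override = {"model": {"Edge_Ass": {"using_PV": True}}, "solver": {"lr": 1e-3}}
    cfg = merge(DEFAULTS, override)
    print("stale inchannel:", cfg["model"]["Edge_Ass"]["inchannel"])
    refresh_derived(cfg)
    print("refreshed inchannel:", cfg["model"]["Edge_Ass"]["inchannel"])
```

Whether to refresh derived keys after merging or to set them explicitly in each YAML file is a design choice; the explicit route is the one the shipped YAML appears to take.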
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,57 @@ | ||
seed: 5555 | ||
model: | ||
  mid_channel: 196 | ||
  expand: [2.25, 1.75, 1.25] | ||
  neck: | ||
    type: RFB_Lite | ||
  Edge_Ass: | ||
    using: True | ||
    mid_channel: 4 | ||
    using_PV: False | ||
    inchannel: 3 | ||
    using_edge_SC: True | ||
    PV_reduction: None | ||
    PV_normal: True | ||
    gain: 1.0 | ||
  header: my_aggregation_lite_CSP | ||
  PQ: | ||
    using: True | ||
    patch_size: 4 | ||
    map_size: 16 | ||
    num_head: 8 | ||
    type: prior_query2_channel_last_L_attention_rm_SA.prior_query | ||
    using_pretrain: True | ||
    pretrain_path: models/pretrained/prior_query2_channel_last_L_attention_rm_SA/pretrain/model_199.pth | ||
    position: [True,True,True] | ||
|
||
|
||
|
||
loss: | ||
  combination: ['DICE','SSIM', 'FocalLoss','MSELoss','WBCE','Contrast_Loss','Contrast_Loss','Contrast_Loss','Contrast_Loss'] | ||
  using_Deepest_loss: True | ||
  combination_p: [[],[],[2],[],[],[0.95],[0.8],[0.65],[0.5]] | ||
  loss_weight: [0.0625, 0.125,0.125, 0.25,0.25,0.25, 0.5,0.5,0.5, 0.5,0] #sum(combination)+edge_loss | ||
  loss_scale: | ||
    number: 4 #11,22,44 | ||
    combination: [[1],[1,2],[1,2,3],[1,2,3]] | ||
  edge: | ||
    using: True | ||
    gamma: 1.5 | ||
    gains: [1,48,64] | ||
    weights: [1, 0.25, 0.0625] #1,1/4,1/16 | ||
    stage: [10,20,40] | ||
    max_rate: 0.9 | ||
    base_size: 288 | ||
    loss_type: bce | ||
  using_normal: True | ||
|
||
|
||
dataloader: | ||
  batch_size: 12 | ||
|
||
solver: | ||
  init_epoch: 10 | ||
  epoch: 70 | ||
  lr_rate: [0.1,1.5,1] | ||
  decay_rate: 1.0 | ||
  lr_step: [10,30,70] |
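The solver keys `init_epoch: 10`, `epoch: 70` and the default `t_mul=2` from `config/params.py` are consistent with a cosine schedule with warm restarts whose cycles last 10, 20 and 40 epochs, which would end at epochs 10, 30 and 70, the same values as `lr_step`. The scheduler itself is not part of this excerpt, so the sketch below is only an assumption about how these numbers could fit together; warm-up (`warmup_epoch`, `warmup_lr`), `lr_rate` and `decay_rate` are deliberately left out.

```python
import math


def restart_boundaries(init_epoch=10, t_mul=2, total_epochs=70):
    """Epochs at which each cosine cycle ends, with cycle lengths growing by t_mul."""
    boundaries, length, end = [], init_epoch, 0
    while end < total_epochs:
        end += length
        boundaries.append(end)
        length *= t_mul
    return boundaries


def lr_at(epoch, base_lr=1e-4, init_epoch=10, t_mul=2):
    """Cosine-annealed learning rate with warm restarts (illustrative only)."""
    min_lr = base_lr * 0.001  # same ratio as solver.min_lr in params.py
    start, length = 0, init_epoch
    # Walk forward to the cycle that contains `epoch`.
    while epoch >= start + length:
        start += length
        length *= t_mul
    progress = (epoch - start) / length
    return min_lr + 0.5 * (base_lr - min_lr) * (1 + math.cos(math.pi * progress))


if __name__ == "__main__":
    print(restart_boundaries())
    for epoch in (0, 9, 10, 29, 30, 69):
        print(epoch, f"{lr_at(epoch):.2e}")
```

Under these assumptions the learning rate returns to `base_lr` at epochs 10 and 30 and decays towards `min_lr` just before each boundary.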