<!DOCTYPE html>
<!-- saved from url=(0035)https://captain-whu.github.io/SCD/ -->
<html><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<title>SCD</title>
<!-- <link rel="stylesheet" href="http://cdn.static.runoob.com/libs/bootstrap/3.3.7/css/bootstrap.min.css"> -->
<link rel="stylesheet" href="./SCD_files/bootstrap.min.css">
<!-- <link rel="stylesheet" type="text/css" href="css/mystyle.css"> -->
<script src="./SCD_files/jquery.min.js"></script>
<script src="./SCD_files/bootstrap.min.js"></script>
<script type="text/javascript" src="http://cdn.mathjax.org/mathjax/latest/MathJax.js?config=AM_HTMLorMML-full"></script>
</head>
<body>
<div class="container">
<div class="content">
<h1 style="text-align:center; margin-top:60px; font-weight: bold">
Semantic Change Detection with Asymmetric Siamese Networks
</h1>
<p style="text-align:center; margin-bottom:15px; margin-top:20px; font-size: 18px">
<a href="http://www.captain-whu.com/yangkunping.html" target="_blank">Kunping Yang</a>,
<a href="http://www.captain-whu.com/xia_En.html" target="_blank">Gui-Song Xia</a>,
<a target="_blank">Zicheng Liu</a>,
<a href="http://cs.whu.edu.cn/teacherinfo.aspx?id=254" target="_blank">Bo Du</a>,
<a href="http://www.captain-whu.com/yangwen.html" target="_blank">Wen Yang</a>,
<a href="https://www.dsi.unive.it/~pelillo/" target="_blank">Marcello Pelillo</a>,
<a href="http://www.lmars.whu.edu.cn/prof_web/zhangliangpei/rs/index.html" target="_blank">Liangpei Zhang</a>.
</p>
</div>
<!-- <br><hr> -->
<div class="row">
<div class="span6 offset2">
<ul class="nav nav-tabs">
<br>
</ul>
</div>
</div>
<div class="row">
<div class="span12">
<h2>
- Introduction -
</h2>
<p style="text-align:justify; font-size: 16px; text-indent: 0em">
Given two multi-temporal aerial images, semantic change detection aims to locate the land-cover variations and identify their categories with pixel-wise boundaries. The problem has demonstrated promising potential in many earth-vision-related tasks, such as precise urban planning and natural resource management. Existing state-of-the-art algorithms mainly identify the changed pixels through symmetric modules, which suffer from categorical ambiguity caused by changes related to totally different land-cover distributions. In this paper, we present an <i> asymmetric siamese network </i> (ASN) to locate and identify semantic changes through feature pairs obtained from modules of widely different structures, which involve different spatial ranges and quantities of parameters to factor in the discrepancy across different land-cover distributions. To better train and evaluate our model, we create a large-scale, well-annotated <i> SEmantic Change detectiON Dataset </i> (SECOND), while an <i> adaptive threshold learning </i> (ATL) module and a <i> separated kappa </i> (SeK) coefficient are proposed to alleviate the influence of label imbalance in model training and evaluation. The experimental results demonstrate that the proposed model stably outperforms the state-of-the-art algorithms with different encoder backbones.
</p>
<p style="text-align:justify; font-size: 16px; text-indent: 0em">
In summary, our main contributions in this work are threefold.
<ul style="text-align:justify; font-size: 16px; text-indent: 0em">
<li> We propose an asymmetric siamese network, <i> i.e.</i> ASN, to factor in the discrepancy across different land-cover distributions in each multi-temporal image, which can alleviate the categorical ambiguity
caused by asymmetric changes through feature pairs from locally asymmetric architectures. </li>
<li>We create a large-scale semantic change detection dataset, <i> i.e.</i> SECOND, to better train deep models and to serve as a new benchmark for the SCD problem. SECOND also enables us to distinguish changed regions within the same land-cover category.</li>
<li>We design an adaptive threshold learning module and a separated kappa coefficient to alleviate the influence of label imbalance during training and evaluation; these can adaptively revise threshold deflections and fix the unreasonable scores computed with traditional metrics, <i> e.g.</i> OA and the kappa coefficient, respectively.</li>
</ul>
</p>
<br>
</div>
<div class="span12">
<table style="width:47%;" align="left">
<tbody>
<tr>
<td style="text-align: center;"><a href="./SCD_files/Semantic_Change_Detection.pdf"><img src="./SCD_files/pdf-icon.png" width=55px height=55px class="img-responsive center-block"> <br> Paper</a></td>
<td style="text-align: center;"><a href="https://drive.google.com/file/d/1mN8jzCKKK27p3ODGoDgepjiRYGQpB34u/view?usp=sharing"><img src="./SCD_files/no-sql-01-1.png" align="bottom" width=55px height=55px class="img-responsive center-block"> <br> SECOND Dataset </a></td>
<td style="text-align: center;"><a href=" "><img src="./SCD_files/codes.png" width=55px height=55px class="img-responsive center-block"> <br> Coming Soon</a></td>
</tr>
</tbody>
</table>
</div>
</div>
<br>
<div class="row">
<div class="span12">
<h2>
- Asymmetric Siamese Networks (ASN) - (<a href=" ">codes coming soon</a>)
</h2>
<p style="text-align:justify; font-size: 16px; text-indent: 0em">
To address the asymmetric properties of the SCD problem while exploiting siamese networks, we propose an <i> Asymmetric Siamese Network </i> (ASN) that extracts changed pixels through two modules, <i> i.e.,</i> the <i>asymmetric Spatial Pyramid</i> (aSP) and the <i>asymmetric Representation Pyramid</i> (aRP). Leveraging convolution sequences of different structures, aSP and aRP obtain features through several siamese feature pyramids computed from the input images.
</p>
<p style="text-align:justify; font-size: 16px; text-indent: 0em">
Specifically, we design weighted dense connected topological architectures, where each node is linked to a feature pair across siamese feature pyramids. Although the whole architecture is symmetric, most of these feature pairs are obtained by widely different structures, namely locally asymmetric, where dynamic branch weights further adjust the module structure according to each input. Containing designated receptive fields and representation capabilities, these asymmetric feature pairs are able to focus on various spatial ranges and depict scenes of diverse complexities.
</p>
<img src="./SCD_files/Process_raw7.png" width="450px" class="img-responsive center-block">
<p style="text-align:justify; font-size: 14px">
Fig.1 Illustration of ASN, where several convolutional sequences and squeeze gates are utilized to obtain asymmetric feature pairs generated from widely different structures.
</p>
</div>
</div>
<br>
<div class="row">
<div class="span12">
<h2>
- SEmantic Change detectiON Dataset (SECOND) - (<a href="https://drive.google.com/file/d/1QlAdzrHpfBIOZ6SK78yHF2i1u6tikmBc/view?usp=sharing">available at Google Drive</a>)
</h2>
<p style="text-align:justify; font-size: 16px; text-indent: 0em">
In order to set up a new benchmark for SCD problems with adequate quantity, sufficient categories and proper annotation, in this paper we present SECOND, a well-annotated semantic change detection dataset. To ensure data diversity, we first collect 4662 pairs of aerial images from several platforms and sensors. These image pairs are distributed over cities such as Hangzhou, Chengdu, and Shanghai. Each image has a size of 512 × 512 pixels and is annotated at the pixel level. The annotation of SECOND is carried out by an expert group in earth vision applications, which guarantees high label accuracy. For the change categories in the SECOND dataset, we focus on 6 main land-cover classes, <i> i.e. </i>, <i> non-vegetated ground surface, tree, low vegetation, water, buildings </i> and <i> playgrounds </i>, that are frequently involved in natural and man-made geographical changes. It is worth noting that, in the new dataset, non-vegetated ground surface (<i> n.v.g. surface </i> for short) mainly corresponds to <i> impervious surface </i> and <i> bare land</i>. In summary, these 6 selected land-cover categories result in 30 common change categories (including <i> non-change </i>). Through the random selection of image pairs, SECOND reflects the real distributions of land-cover categories when changes occur.
</p>
<img src="./SCD_files/data_samples.png" width="1100px" class="img-responsive center-block">
<p style="text-align:justify; font-size: 14px">
Fig.2 Several samples of our proposed SECOND dataset. White indicates <i>non-change</i> regions, while other colors indicate different land-cover categories. Ground truth for SCD can be obtained by comparing the annotated land-cover categories.
</p>
</div>
</div>
<br>
<div class="row">
<div class="span12">
<h2>
- Separated Kappa (SeK) - (<a href="./SCD_files/Metric.zip">Codes available</a>)
</h2>
<p style="text-align:justify; font-size: 16px; text-indent: 0em">
In order to alleviate the influence of label imbalance, we utilize Mean Intersection Over Union (mIOU) to evaluate BCD results and propose a Separated Kappa (SeK) coefficient to evaluate SCD results.
</p>
</div>
</div>
<div class="row">
<div class="span12">
<h4 style="text-align:left; margin-bottom:10px; margin-top:10px; font-weight: bold; font-style: italic">
- Mean Intersection Over Union
</h4>
<p style="text-align:justify; font-size: 16px; text-indent: 0em">
Specifically, given a confusion matrix <img src="http://latex.codecogs.com/gif.latex?Q=\{q_{ij}\}" width="60px"/>, we define the categorical IOUs and mIOU as:
</p>
<img src="http://latex.codecogs.com/gif.latex?\textrm{IOU}_1 = q_{11}/(\sum\limits^C_{i=1}{q_{i1}}+\sum\limits^C_{j=1}{q_{1j}}-q_{11})" width="225px" class="img-responsive center-block">
<img src="http://latex.codecogs.com/gif.latex?\textrm{IOU}_2 = \sum\limits^C_{i=2}\sum\limits^C_{j=2}{q_{ij}}/(\sum\limits^C_{i=1}\sum\limits^C_{j=1}{q_{ij}}-q_{11})" width="225px" class="img-responsive center-block">
and
<img src="http://latex.codecogs.com/gif.latex?\textrm{mIOU}=\frac{1}{2}(\textrm{IOU}_1+\textrm{IOU}_2)." width="170px" class="img-responsive center-block">
<p style="text-align:justify; font-size: 16px; text-indent: 2em">
The categorical IOUs measure the identification of <i> non-change </i> pixels and the extraction of changed regions, respectively. Compared with Overall Accuracy (OA), mIOU places more emphasis on changed regions.
</p>
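<p style="text-align:justify; font-size: 16px; text-indent: 0em">
The two categorical IOUs can be computed directly from the confusion matrix. The following is a minimal sketch (an illustrative reading of the formulas above, not the released evaluation code), assuming row/column 0 of the matrix corresponds to the <i>non-change</i> class (q<sub>11</sub> in the formulas):
</p>

```python
import numpy as np

def miou(Q):
    """mIOU from a C x C confusion matrix Q.

    Assumes row/column 0 is the non-change class (q_11 in the text).
    """
    Q = np.asarray(Q, dtype=np.float64)
    q11 = Q[0, 0]
    # IOU_1: IOU of the non-change class.
    iou1 = q11 / (Q[:, 0].sum() + Q[0, :].sum() - q11)
    # IOU_2: IOU of all changed classes taken together.
    iou2 = Q[1:, 1:].sum() / (Q.sum() - q11)
    return 0.5 * (iou1 + iou2)
```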
<h4 style="text-align:left; margin-bottom:10px; margin-top:10px; font-weight: bold; font-style: italic">
- Separated Kappa
</h4>
<p style="text-align:justify; font-size: 16px; text-indent: 0em">
On the other hand, the true positive of <i> non-change </i> pixels <span style="font-size:15px">`q_{11}`</span> always dominates the calculation of Kappa. Thus, we separate <span style="font-size:15px">`q_{11}`</span> in the calculation of SeK. We also utilize categorical IOU to further emphasize changed pixels.
Specifically, we define
</p>
<img src="http://latex.codecogs.com/gif.latex?\textrm{SeK} =e^{(\textrm{IOU}_2-1)} \cdot (\hat{\rho}-\hat{\eta})/(1-\hat{\eta})," width="235px" class="img-responsive center-block"> with
<img src="http://latex.codecogs.com/gif.latex?\begin{align}
\hat{\rho} &=\sum^C_{i=2}{q_{ii}}/(\sum\limits^C_{i=1}\sum^C_{j=1}{q_{ij}}-q_{11}), \nonumber\\
\hat{\eta} &=\sum^C_{j=1}{(\hat{q}_{j+} \cdot \hat{q}_{+j})}/(\sum^C_{i=1}\sum^C_{j=1}{q_{ij}}-q_{11})^{2}, \nonumber
\end{align}" width="250px" class="img-responsive center-block">
<p>
where the exponential form enlarges the discernibility among better models compared with simple multiplication. We collected visual scores between 0 and 1 <i>w.r.t.</i> each result from 11 remote sensing image interpretation experts. As illustrated in the following figure, compared with Kappa and OA, models with apparently poor performance on small change categories get low SeK scores no matter how good their performance on BCD is. Moreover, the Mean Square Error (MSE) between SeK and the human scores is 0.003, while the MSEs <i>w.r.t.</i> OA and Kappa are 0.212 and 0.028 respectively, which further validates the rationality of SeK.
</p>
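<p style="text-align:justify; font-size: 16px; text-indent: 0em">
A minimal SeK sketch under the same conventions (row/column 0 is <i>non-change</i>); this is an illustrative reading of the formulas above, not the released evaluation code, and it assumes the hatted counts are Q with q<sub>11</sub> set to zero:
</p>

```python
import numpy as np

def sek(Q):
    """Separated Kappa (SeK) from a C x C confusion matrix Q.

    Assumes row/column 0 is the non-change class, and that q_hat in the
    formulas is Q with q_11 set to zero.
    """
    Q = np.asarray(Q, dtype=np.float64)
    q11 = Q[0, 0]
    denom = Q.sum() - q11          # total pixels minus true-positive non-change
    iou2 = Q[1:, 1:].sum() / denom
    Qh = Q.copy()
    Qh[0, 0] = 0.0                 # separate q_11 from the Kappa terms
    rho = np.trace(Qh) / denom     # observed agreement on changed pixels
    eta = float(Qh.sum(axis=1) @ Qh.sum(axis=0)) / denom**2  # chance agreement
    return float(np.exp(iou2 - 1.0) * (rho - eta) / (1.0 - eta))
```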
<img src="./SCD_files/Label2.png" width="1100px" class="img-responsive center-block">
<p style="text-align:justify; font-size: 12px">
Fig.3 Given a change detection data sample, <i> i.e. </i> a pair of images and a sequence of change detection results, we collect visual scores between 0 and 1 <i> w.r.t. </i> each result from 11 remote sensing image interpretation experts. Meanwhile, we calculate evaluation scores of each result based on OA and Kappa. Compared with OA and Kappa, SeK is more in line with human scoring in SCD problem.
</p>
</div>
</div>
<div class="section bibtex" style="margin-left:-15px;">
<div class="span12">
<h3 style="text-align:left; margin-bottom:10px; margin-top:20px; font-weight: bold">
Citation
</h3>
<pre>@Misc{Yang2020SECOND,
title={Semantic Change Detection with Asymmetric Siamese Networks},
author={Kunping Yang and Gui-Song Xia and Zicheng Liu and Bo Du and Wen Yang and Marcello Pelillo and Liangpei Zhang},
year={2020},
eprint={arXiv:2010.05687},
}</pre>
</div>
</div>
<br>
<div class="row">
<div class="span12">
<h3 style="text-align:left; margin-bottom:10px; margin-top:20px; font-weight: bold">
Contact
</h3>
<p>
If you have any problems using the SECOND dataset or the Asymmetric Siamese Network model, please contact:
</p>
<ul>
<li>Kunping Yang at <strong>[email protected]</strong></li>
<li>Gui-Song Xia at <strong>[email protected]</strong></li>
</ul>
<br>
<br>
<br>
</div>
</div>
<!-- <div class="row">
<div style="text-align:center; margin-top:0; margin-bottom: 20px;">
<embed id="map" src="http://rf.revolvermaps.com/f/f.swf" type="application/x-shockwave-flash" pluginspage="http://www.macromedia.com/go/getflashplayer" wmode="transparent" allowScriptAccess="always" allowNetworking="all" width="150" height="75" flashvars="m=0&i=5dp1mfnunae&r=10&c=fffdc0" loop="true" autostart="False"></embed>
<img class="img-responsive center-block" src="http://rf.revolvermaps.com/js/c/5dp1mfnunae.gif" width="1" height="1" alt="" value="True"/>
<a style="font-size: x-small;"> Copyight@2020, Captain</a>
<a href="http://www.revolvermaps.com/livestats/5dp1mfnunae/"></a>
</div>
</div> -->
</div></body></html>