PraNet Paper Reading Notes
Abstract
We propose a parallel reverse attention network (PraNet) for accurate polyp segmentation in colonoscopy images. Specifically, we first aggregate the features in high-level layers using a parallel partial decoder (PPD). Based on the combined feature, we then generate a global map as the initial guidance area for the following components. In addition, we mine the boundary cues using the reverse attention (RA) module, which is able to establish the relationship between areas and boundary cues. Thanks to the recurrent cooperation mechanism between areas and boundaries, our PraNet is capable of calibrating some misaligned predictions, improving the segmentation accuracy. (Summary: 1. aggregate the high-level features to generate a global map; 2. use the reverse attention module to mine boundary cues and establish the relationship between areas and boundary cues.)
Introduction
We first predict coarse areas and then implicitly model the boundaries by means of reverse attention. There are three advantages to this strategy, including better learning ability, improved generalization capability, and higher training efficiency. (In short: predict the coarse areas first, then model the boundaries implicitly through reverse attention; the strategy learns better, generalizes better, and trains more efficiently.)
Contributions
(1) We present a novel deep neural network for real-time and accurate polyp segmentation. By aggregating features in high-level layers using a parallel partial decoder (PPD), the combined feature takes contextual information and generates a global map as the initial guidance area for the subsequent steps. To further mine the boundary cues, we leverage a set of recurrent reverse attention (RA) modules to establish the relationship between areas and boundary cues. Due to this recurrent cooperation mechanism between areas and boundaries, our model is capable of calibrating some misaligned predictions. (The PPD aggregates the high-level features to generate the global map; the RA modules establish the relationship between areas and boundary cues.)
(2) We introduce several novel evaluation metrics for polyp segmentation and present a comprehensive benchmark for existing SOTA models that are publicly available.
(3) Extensive experiments demonstrate that the proposed PraNet outperforms most cutting-edge models and advances the SOTAs by a large margin on five challenging datasets, with real-time inference and shorter training time.
Model Architecture
Code: PraNet (the lib directory on the master branch of DengPingFan/PraNet on GitHub)
Parallel Partial Decoder (PPD)
Corresponding class in the repository code: `class aggregation(nn.Module):`
Compared with high-level features, low-level features demand more computational resources due to their larger spatial resolutions, but contribute less to performance. Motivated by this observation, we propose to aggregate high-level features with a parallel partial decoder component. (Low-level features cost more computation for little gain, so they are poor value.) More specifically, for an input polyp image I with size h × w, five levels of features {f_i, i = 1, …, 5} with resolution [h/2^(k−1), w/2^(k−1)] at level k can be extracted from the Res2Net-based [12] backbone network. Then, we divide the f_i features into low-level features {f_i, i = 1, 2} and high-level features {f_i, i = 3, 4, 5}. We introduce the partial decoder p_d(·) [29], a new SOTA decoder component, to aggregate the high-level features with a parallel connection. The partial decoder feature is computed by PD = p_d(f_3, f_4, f_5), and we can obtain a global map S_g.
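As a rough illustration of the level split described above, the following sketch computes the spatial resolution of each feature level from the quoted formula and separates the low-level group from the high-level group. The function name `feature_resolutions` and the 352 × 352 input size are assumptions made for this example, not values taken from the text above.

```python
def feature_resolutions(h, w, num_levels=5):
    """Return {k: (height, width)} for levels k = 1..num_levels.

    Follows the formula quoted above: level k has resolution
    [h / 2^(k-1), w / 2^(k-1)]. Integer division is an assumption
    for inputs that are not divisible by the stride.
    """
    return {k: (h // 2 ** (k - 1), w // 2 ** (k - 1)) for k in range(1, num_levels + 1)}


# Hypothetical input size, chosen only for illustration.
res = feature_resolutions(352, 352)

# Split used by the text: levels 1-2 are low-level, levels 3-5 are high-level.
low_level = {k: v for k, v in res.items() if k <= 2}
high_level = {k: v for k, v in res.items() if k >= 3}

# Number of spatial positions per level, to show why low levels cost more.
positions = {k: hh * ww for k, (hh, ww) in res.items()}

for k in sorted(res):
    group = "low" if k in low_level else "high"
    print(f"f{k}: {res[k]} ({group}-level, {positions[k]} positions)")
```

With these assumed sizes, f1 alone has 16 times as many spatial positions as f3, which is the cost argument for restricting the decoder to f3, f4, and f5.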
Reverse Attention Module (RA)
Corresponding code in the repository starts at the comment `# ---- reverse attention branch_4 ----`.
The global map S_g can only capture a relatively rough location of the polyp tissues, without structural details (see Fig. 1). To address this issue, we propose a principled strategy to progressively mine discriminative polyp regions by erasing foreground objects [27,4]. Instead of aggregating features from all levels as in [4,13,36,33], we propose to adaptively learn the reverse attention in three parallel high-level features. In other words, our architecture can sequentially mine complementary regions and details by erasing the existing estimated polyp regions from high-level side-output features, where the existing estimation is up-sampled from the deeper layer.
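The erasing step can be sketched on a toy example. It assumes that the reverse attention weight is one minus the sigmoid of the up-sampled coarser prediction, so multiplying a feature map by that weight suppresses regions already predicted as polyp and leaves the rest for further mining. The helper names (`upsample_nearest`, `reverse_attention`), the nearest-neighbour up-sampling, and the plain-list, single-channel representation are illustrative choices, not the repository's code.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def upsample_nearest(grid, factor):
    """Nearest-neighbour up-sampling of a 2D list by an integer factor."""
    out = []
    for row in grid:
        wide = [v for v in row for _ in range(factor)]
        out.extend([list(wide) for _ in range(factor)])
    return out


def reverse_attention(feature, coarse_logits, factor=2):
    """Weight `feature` by 1 - sigmoid(up-sampled coarse prediction).

    feature:       2D list, a single-channel side-output feature map
    coarse_logits: 2D list of logits from the deeper, lower-resolution layer
    Both the weighting rule and the shapes are assumptions of this sketch.
    """
    up = upsample_nearest(coarse_logits, factor)
    weights = [[1.0 - sigmoid(v) for v in row] for row in up]
    erased = [[f * a for f, a in zip(f_row, a_row)]
              for f_row, a_row in zip(feature, weights)]
    return erased, weights


# Toy data: the deeper layer is confident about a polyp in the top-left cell only.
coarse = [[8.0, -8.0],
          [-8.0, -8.0]]
feature = [[1.0] * 4 for _ in range(4)]

erased, weights = reverse_attention(feature, coarse)
for row in erased:
    print([round(v, 3) for v in row])
```

In this toy case the top-left 2 × 2 block of the feature map is driven close to zero, while the remaining positions keep almost their full value, so a later stage would concentrate on the regions not yet explained by the coarser estimate.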