Attention U-Net:

Abstract

  • Proposes a novel attention gate (AG) model.

IHA-Net

Introduction

  • First of all, due to the complexity of brain structure, the features of ICH are very similar to those of the skull, which interferes with both feature extraction and segmentation.
  • Additionally, the location and size of cerebral hemorrhage lesions on CT images are variable, which further reduces segmentation accuracy; for example, an intraventricular hemorrhage region is a long strip of slightly low density, while a brainstem hemorrhage region is very small. Moreover, ICH comes in many types, making it hard to build one deep learning model that handles ICH at different scales.
  • Moreover, due to continuous convolutions, a tiny cerebral hematoma (early ICH) is easily lost and difficult to recover.
  • There is a large imbalance between foreground and background pixels, which further reduces segmentation accuracy; in the public dataset the authors use, most hematomas occupy no more than 3% of the image area.
  • To address the aforementioned issues, this paper develops a novel end-to-end model that works well on tiny intracerebral hemorrhage segmentation.
  • In response to the loss of information on tiny hematomas, the Residual Hybrid Atrous Convolution (RHAC) strategy is introduced to incorporate atrous convolution. Generally speaking, in shallower layers, the larger the receptive field, the sparser the convolution kernel; it cannot cover critical local information, so a large part of the detailed information is lost, which is very detrimental to tiny intracerebral hemorrhage segmentation. Thus, by configuring RHAC modules with different atrous rates at different encoder stages, multi-scale contextual information can be aggregated without loss of resolution. Specifically, shallow stages are given a smaller receptive field than deep ones. Besides, a multi-object optimization function is used as the training loss to handle the imbalance between foreground and background pixels. Furthermore, to facilitate better segmentation of intracranial hemorrhage, intermediate supervision with a lightweight head is also employed.
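The atrous-rate/receptive-field trade-off these bullets rely on can be checked with a small sketch. This is illustrative pure Python using the standard effective-kernel-size rule for dilated convolutions, not code from the paper:

```python
def effective_kernel(k, r):
    """Effective kernel size of a k x k convolution with atrous (dilation) rate r."""
    return k + (k - 1) * (r - 1)

def receptive_field(layers):
    """Receptive field after stacked conv layers, each given as (kernel, dilation, stride)."""
    rf, jump = 1, 1
    for k, r, s in layers:
        rf += (effective_kernel(k, r) - 1) * jump
        jump *= s
    return rf
```

For example, a 3x3 kernel with rate 4 covers a 9x9 area with the same nine parameters, which is why larger rates suit deep stages while small rates preserve local detail in shallow stages.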

Method

A modified U-shaped structure is employed to improve segmentation performance, with a pretrained ResNet-34 exploited as the encoder backbone.

Before information fusion, the Residual Hybrid Atrous Convolution (RHAC) strategy is adopted to gather multi-scale context information about the objects.

Except for the first stage of the encoder, RHAC modules with different configurations are placed at each stage of the encoder as the network deepens.

Then, a spatial context information extractor is utilized to capture spatial information of local and global features.

Ultimately, the final result is acquired by inserting an attention mechanism with intermediate supervision in the decoding stage to recover the resolution of the feature maps.

IHA-Net architecture: a modified U-shaped structure with a pretrained ResNet-34 encoder backbone; RHAC modules (with different configurations at every encoder stage except the first) gather multi-scale context before information fusion; a spatial context information extractor captures the spatial information of local and global features; and an attention mechanism with intermediate supervision in the decoding stage recovers the feature-map resolution to obtain the final result.

image-20230824175059643 Distribution of hematoma sizes, with each size group relatively balanced. The horizontal axis is the ratio of hematoma area to image area; the vertical axis is the proportion of hematoma samples of a given size among all samples.

==A. Residual Hybrid Atrous Convolution module==

Since the cerebral hemorrhage data used for training and testing may come from patients with different types of ICH at different stages, ICH regions may appear at different scales. **Considering that the continuously stacked down-sampling in the encoder stages leads to a severe loss of relevant detailed information, small-object information cannot be reconstructed.** Hence, how to reduce the loss of crucial information about small objects becomes an important issue. To mitigate these problems, the authors propose the Residual Hybrid Atrous Convolution (RHAC) module to further capture multi-scale context features for intracranial hemorrhages of different sizes. The RHAC block integrates several hybrid atrous convolutional layers, max-pooling layers, and residual connections.

image-20230824183747982

The RHAC block has four parallel branches: three equipped with convolutions with different dilation rates, and one with a max-pooling layer. A 1×1 convolution is added after each branch's convolution to integrate information, followed by the rectified linear unit (ReLU) as the activation function. Moreover, residual connections are utilized to optimize the network.

With the depth of the network increasing, the resolution of the feature maps gradually decreases, so the information they carry becomes sparser; therefore, a larger receptive field is needed to aggregate spatial information. **Compared with conventional convolution, the noteworthy property of atrous convolution is that it expands the receptive field without altering the feature-map resolution, which helps reduce the loss of semantic information to some extent.** However, using only convolutions with a large dilation rate is not beneficial to the segmentation of small objects. To reduce the loss of the internal data and spatial hierarchical information of small objects, RHAC modules are configured with different atrous rates at the different encoder stages. To be specific, a relatively small receptive field is formed by configuring a smaller atrous rate in the shallow layers, and a larger atrous rate is configured in the deep layers to yield a relatively large receptive field. In this way, the information of small targets in the shallow layers is less likely to be lost. With the RHAC block, more abstract features of different sizes are extracted for hemorrhage regions and multi-scale context information is fully captured, enhancing local-information prediction.
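The dilation mechanic described above can be illustrated with a minimal 1-D sketch (pure Python, not the paper's implementation): the kernel taps are spread `rate` samples apart, so coverage grows without adding parameters or reducing resolution.

```python
def dilated_conv1d(x, kernel, rate):
    """'Same'-padded 1-D dilated convolution (cross-correlation), pure Python.
    With rate > 1, kernel taps are spaced `rate` samples apart, widening coverage."""
    k = len(kernel)
    pad = (k - 1) * rate // 2  # half of the effective kernel extent
    out = []
    for i in range(len(x)):
        s = 0.0
        for j in range(k):
            idx = i - pad + j * rate
            if 0 <= idx < len(x):  # zero padding outside the signal
                s += kernel[j] * x[idx]
        out.append(s)
    return out
```

With rate 1 this is an ordinary convolution; with rate 2 the same three-tap kernel already spans five samples, mirroring how a deep-stage RHAC branch sees a wider context at no extra cost.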

==B. Spatial Context Information Extractor==

From the above modules, more semantic information is introduced into the shallow features. To further embed more spatial information into the deep features, a spatial-pyramid-pooling-like module borrowed from the CE-Net model is utilized for accurate ICH region segmentation. The same feature map is pooled in parallel at different sizes, so that feature maps of different sizes are obtained. Each low-resolution feature map is then upsampled directly via bilinear interpolation to the same resolution as the original feature map. Finally, the different levels of features are concatenated together as the final pyramid-pooling global feature. To reduce parameters, a 1×1 convolution follows each pooling; the receptive-field sizes used to capture spatial context are 2×2, 3×3, 5×5, and 6×6. This module fuses local and global features, enriching the expressive capability of the feature maps and facilitating the segmentation of hematoma regions of different sizes.
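The pool-then-upsample pipeline above can be sketched in pure Python. This is an illustrative simplification, not the CE-Net code: it uses nearest-neighbour upsampling as a stand-in for bilinear interpolation and omits the 1×1 convolutions.

```python
def adaptive_avg_pool(fm, n):
    """Average-pool a 2-D feature map (list of lists) into an n x n grid."""
    h, w = len(fm), len(fm[0])
    out = [[0.0] * n for _ in range(n)]
    for i in range(n):
        r0, r1 = i * h // n, (i + 1) * h // n
        for j in range(n):
            c0, c1 = j * w // n, (j + 1) * w // n
            vals = [fm[r][c] for r in range(r0, r1) for c in range(c0, c1)]
            out[i][j] = sum(vals) / len(vals)
    return out

def upsample_nearest(fm, h, w):
    """Nearest-neighbour upsampling back to h x w (stand-in for bilinear)."""
    n = len(fm)
    return [[fm[i * n // h][j * n // w] for j in range(w)] for i in range(h)]

def pyramid_features(fm, grid_sizes=(2, 3, 5, 6)):
    """One pooled-and-restored 'channel' per grid size, ready to concatenate."""
    h, w = len(fm), len(fm[0])
    return [upsample_nearest(adaptive_avg_pool(fm, n), h, w) for n in grid_sizes]
```

Each grid size summarizes the map at a different granularity; concatenating the restored maps is what lets one feature tensor carry both coarse global context and finer local context.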

==C. Attention Mechanism==

Apart from designing various network architectures, attention mechanisms, motivated by the fact that humans recognize what we see not from the full scene but by first focusing on the most important objects, have been widely applied in CNN architectures and achieved excellent performance. **As simple but effective structures that improve the representational power of relevant features by suppressing non-important ones, channel-wise attention and spatial-wise attention, among these attention mechanisms, inherit the capability of capturing inter-relationships, indicating which part of the features the CNN attends to more.** Likewise, these two mechanisms emphasize input features by separately placing channel-wise and spatial-wise attention modules, where the attention output is rescaled onto the input feature via element-wise multiplication.
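The channel-wise half of this idea can be sketched in pure Python. This is a squeeze-and-excitation-style simplification, not the paper's module: the gate here is a single per-channel scalar weight plus sigmoid, whereas real channel attention typically uses two fully connected layers.

```python
import math

def channel_attention(fm, weight, bias):
    """Sketch of channel attention: squeeze each channel to one number
    (global average pool), excite it into a 0-1 gate (sigmoid), then
    rescale the channel by its gate via element-wise multiplication."""
    gates = []
    for ch, w in zip(fm, weight):
        avg = sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))  # squeeze
        gates.append(1.0 / (1.0 + math.exp(-(w * avg + bias))))     # excite
    return [[[v * g for v in row] for row in ch] for ch, g in zip(fm, gates)]
```

Channels whose gate is near 1 pass through almost unchanged, while channels gated near 0 are suppressed, which is exactly the "emphasize important features, suppress non-important ones" behaviour described above.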

image-20230824221720373

image-20230824222015418

==D. Intermediate Supervision==

When training a deep network with gradient-descent parameter updates, the loss signal from the output layer is greatly attenuated as it propagates back through many layers, i.e., the gradient vanishes. **To solve this problem, the output loss is calculated at the specified decoder stages, and these loss values are summed when updating the weights.** This guarantees normal parameter updates and suppresses the vanishing-gradient problem. The losses calculated in these network layers serve as auxiliary losses. The purpose of intermediate supervision is to train the shallow layers more adequately, avoiding gradient vanishing and slow convergence; the additional intermediate supervision signals help intensify tiny hematoma features and prevent information loss in deep layers.
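The loss-summing step can be sketched as follows. This is a generic deep-supervision pattern, not the paper's exact formulation; the auxiliary weights are illustrative (the paper only says the losses are summed):

```python
def deep_supervision_loss(stage_losses, aux_weights):
    """Total training loss = final-output loss + weighted auxiliary losses
    computed at intermediate decoder stages (weights are an assumption)."""
    main, *aux = stage_losses
    return main + sum(w * l for w, l in zip(aux_weights, aux))
```

Because each auxiliary term is attached directly to a shallow decoder stage, its gradient reaches the early layers without being attenuated by the full depth of the network.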

==E. Multi-object function for joint optimization==

In our dataset, because the proportion of hematoma pixels in the images is much smaller than that of the background, the positive and negative sample classes are extremely unbalanced. In this situation, the ordinarily used pixel-by-pixel cross-entropy loss function may give rise to a training procedure dominated by the negative class, thereby reducing the effectiveness of the network. The focal optimization function was originally used in object detection to solve the sample-imbalance problem by reducing the weights of the large number of negative samples during training.

image-20230825124612217
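For reference, the standard focal loss from object detection mentioned above can be sketched per pixel as follows. This is the common formulation (Lin et al.), assumed here only as background; the paper's multi-object function may combine it with other terms:

```python
import math

def focal_loss(p, y, alpha=0.25, gamma=2.0):
    """Per-pixel focal loss: FL(pt) = -alpha_t * (1 - pt)**gamma * log(pt).
    p: predicted foreground probability; y: ground-truth label (0 or 1).
    The (1 - pt)**gamma factor down-weights easy (well-classified) pixels."""
    pt = p if y == 1 else 1.0 - p
    at = alpha if y == 1 else 1.0 - alpha
    return -at * (1.0 - pt) ** gamma * math.log(max(pt, 1e-12))
```

An abundant, easily classified background pixel contributes almost nothing, so the huge negative class no longer dominates the gradient the way it does under plain cross-entropy.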

Directional Connectivity-based Segmentation of Medical Images

Abstract

Key claims: anatomical consistency in biomarker segmentation is critical for many medical image analysis tasks, and a useful way to achieve it is to incorporate pixel connectivity into the inter-pixel relationship model. Previous work on connectivity modeling overlooked the rich channel-wise directional information in the latent space; the authors show that effectively disentangling the directional subspace from the shared latent space significantly enhances feature representation in connectivity-based networks. They propose a directional connectivity modeling scheme for segmentation that can decouple, track, and utilize directional information throughout the network. Code: https://github.com/Zyun-Y/DconnNet

Introduction

Anatomical consistency in images can be expressed by topological properties such as pixel connectivity and adjacency. Graph-based methods, which directly model the mutual information between pixels or regions, have long been used to correct topological and geometric errors, but these techniques typically rely on manually defined priors and do not generalize easily across applications.

Typical segmentation networks model the problem as a pure pixel-wise classification task, using the segmentation mask as the only label. However, this pixel-wise modeling scheme is suboptimal because it does not directly exploit inter-pixel relationships or geometric properties. As a result, these models may produce predictions with low spatial coherence (i.e., inconsistent predictions for neighboring pixels with similar spatial features). In particular, when applied to medical data with heavy noise/artifacts, low spatial coherence can lead to topological errors. The notion of pixel connectivity has long been used to ensure the basic topological duality of separation and connectedness in digital images. Using connectivity masks as training labels has several advantages. In terms of problem modeling, it essentially changes the task from pixel-wise classification to connectivity prediction, thereby modeling and enhancing the topological representation among the pixels of interest. In terms of label representation, connectivity masks carry more information in three respects: first, they store categorical information in the connections between pixels and are aware of inter-pixel relationships; second, they represent edge pixels sparsely; third, they contain rich channel-wise directional information. Consequently, a network trained with connectivity masks has both categorical features (reflected through connectivity) and directional features in its latent space, each forming a specific subspace.

In previous studies, these two groups of features were learned simultaneously through shared network paths, which can leave the latent space highly coupled and introduce redundancy. On the other hand, effectively disentangling meaningful subspaces from a shared latent space has been shown to help explain the dependence/independence between features.

Inspired by the idea of latent-space disentanglement, the authors propose a novel directional-connectivity-based segmentation network that disentangles the directional subspace from the shared latent space and exploits the extracted directional features to enhance the overall data representation. The disentanglement is performed by a sub-path slicing-based module called Sub-path Direction Excitation (SDE). An Interactive Feature-space Decoder (IFD) with two top-down interactive decoding streams applies direction-based feature enhancement in a coarse-to-fine manner. Finally, they propose a novel label-size-distribution-based weighting scheme that alleviates the data imbalance common in medical datasets. Experiments on various public medical image analysis benchmarks demonstrate the superiority of DconnNet over other state-of-the-art methods.

Related Work

Pixel Connectivity

In topology, pixel connectivity describes the mutual relationship between adjacent pixels. In deep-learning-based image segmentation, connectivity-based segmentation networks use a connectivity mask as the label, defined as an 8-channel mask in which each channel indicates whether a pixel in the original image belongs to the same class of interest as its neighbor in a specific direction.
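The 8-channel label described above can be built from an ordinary binary mask with a short sketch. This is a hypothetical pure-Python illustration of the definition, not the DconnNet code:

```python
# One offset per channel: the 8 neighbours of a pixel, row by row.
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, -1), (0, 1), (1, -1), (1, 0), (1, 1)]

def connectivity_mask(seg):
    """Turn a binary segmentation mask (list of lists of 0/1) into an
    8-channel connectivity mask: channel d is 1 at (y, x) iff the pixel and
    its neighbour in direction d both belong to the class of interest."""
    h, w = len(seg), len(seg[0])
    channels = []
    for dy, dx in OFFSETS:
        ch = [[0] * w for _ in range(h)]
        for y in range(h):
            for x in range(w):
                ny, nx = y + dy, x + dx
                if 0 <= ny < h and 0 <= nx < w:  # out-of-image neighbours stay 0
                    ch[y][x] = 1 if seg[y][x] == 1 and seg[ny][nx] == 1 else 0
        channels.append(ch)
    return channels
```

Note how each channel is tied to one direction: this is precisely the channel-wise directional information that SDE later disentangles from the latent space.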

Latent Space

In this work, the authors exploit the inherent properties of connectivity masks and propose a simple yet effective sub-path slicing-based method to disentangle the directional subspace from the shared latent space; they then use t-SNE visualizations to demonstrate the effectiveness of the disentanglement process.

Method

Because connectivity spans different pixel classes and directions, two groups of features exist in the latent space of a connectivity-based network: categorical and directional. Each group forms its specific subspace in the latent space. In a single-path connectivity network, these two subspaces are highly coupled (Fig. 2), producing low-discriminative features. The authors show that effective disentanglement and exploitation of the directional space can enhance the overall feature representation in connectivity models.

In a connectivity mask, different channels represent different directions of pixel connectivity. Therefore, as the network deepens, it naturally stores directional information across channels. Based on this property, directional features can be captured and manipulated through channel operations. Specifically, SDE is proposed to disentangle channel-wise directional features from the latent space; IFD then extracts directional embeddings from different layers and uses them to enhance the overall feature representation in a self-attention manner.

The overall structure of DconnNet is shown in the figure.