Image segmentation technology based on Convolutional Neural Networks (CNNs) has proven effective at segmenting different objects or specific regions within images. Existing CNN-based segmentation techniques generally perform well on conventional targets and on targets with clear boundaries, but they fall short when dealing with small targets or targets with fuzzy boundaries. This is particularly evident in medical image segmentation, where several challenges remain. First, CNN-based methods struggle with targets that have indistinct edges and with extracting features from small targets. Second, the complex backgrounds and fuzzy target boundaries of medical images lead to low segmentation accuracy. Third, CNN-based segmentation techniques lack global modeling capability, resulting in suboptimal performance on many types of targets. To address these issues, this paper proposes three methods for perceiving and enhancing key features. The contributions of this paper are outlined as follows:

To address indistinct target edges and insufficient feature extraction for small objects, this paper proposes a Sensitive Feature Selection Module (SFSM) that enhances the perception of critical features. Using the features produced by the convolutional layers of the previous stage, SFSM learns, for each spatial position, how values are distributed across channels. Each pixel is then re-weighted across channels with the learned weights, so that the network can better perceive the pixel-level characteristics of object boundaries and small objects during feature extraction. Finally, the information obtained by SFSM is fused with the original features to further improve the feature representation, which helps to produce more accurate segmentation results (a minimal illustrative sketch of this re-weighting idea is given below, after the EFEN summary). SFSM is integrated into the FCN, DeepLabv3, and Double U-Net frameworks and validated on medical image datasets such as the ISIC skin disease dataset. Experimental results show that embedding SFSM into Double U-Net yields a 0.16% improvement in accuracy on the ISIC dataset, demonstrating that SFSM effectively enhances the segmentation performance of the network.

To address the low segmentation accuracy caused by complex backgrounds and fuzzy boundaries in medical images, this paper proposes the Enhanced Feature Extraction Network (EFEN) to improve the network's ability to perceive critical features. EFEN is built on the U-Net architecture and incorporates a feature re-extraction structure to strengthen feature extraction. In addition, during decoding, the method improves the skip connections with positional encoding and a cross-attention mechanism to reduce interference from noise and irrelevant information. By embedding positional information, EFEN can capture both the absolute positions of targets and their relative relationships; the cross-attention mechanism strengthens useful information while suppressing irrelevant information, allowing the network to perceive critical features accurately at each skip connection. This yields clearer feature representations during decoding and reduces the impact of fuzzy boundary information in medical images. Experimental results on the CVC-ClinicDB, ISIC task 1, and Data Science Bowl challenge datasets show that EFEN outperforms U-Net and several commonly used methods. For instance, compared to U-Net, EFEN improves the DSC metric by 5.23% and 2.46% on the CVC-ClinicDB and ISIC datasets, respectively; compared to Double U-Net, the improvements are 0.65% and 0.3%.
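A minimal PyTorch-style sketch of the per-pixel channel re-weighting idea behind SFSM, referenced above, could look as follows; the 1x1-convolution weight predictor, the softmax normalization over channels, and the residual fusion are illustrative assumptions rather than the paper's exact design.

import torch
import torch.nn as nn

class SensitiveFeatureSelection(nn.Module):
    """Sketch of an SFSM-style block: for every spatial position, predict a
    weight for each channel from the previous-stage features, re-weight the
    channels at that position, and fuse the result with the original
    features. Illustrative only, not the paper's exact design."""

    def __init__(self, channels: int):
        super().__init__()
        # A 1x1 convolution looks at the channel vector of each pixel
        # independently and predicts one score per channel at that position.
        self.weight_predictor = nn.Conv2d(channels, channels, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (N, C, H, W) features from the previous convolutional stage.
        # Normalize the scores over the channel dimension so they describe
        # a per-pixel distribution across channels.
        weights = torch.softmax(self.weight_predictor(x), dim=1)  # (N, C, H, W)
        reweighted = x * weights                                  # channel-wise re-weighting per pixel
        return x + reweighted                                     # fuse with the original features

# Usage: the block keeps the feature shape, so it can be dropped after an
# encoder stage of FCN, DeepLabv3, or Double U-Net.
features = torch.randn(2, 64, 56, 56)
out = SensitiveFeatureSelection(64)(features)  # same shape as the input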
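One way to sketch a cross-attention skip connection with positional encoding, in the spirit of EFEN's improved skip connections, is shown below; the learnable positional embedding, the use of nn.MultiheadAttention, and all names and shapes are assumptions for illustration, not the paper's implementation.

import torch
import torch.nn as nn

class CrossAttentionSkip(nn.Module):
    """Sketch of a skip connection in which decoder features attend to the
    corresponding encoder features, with a learnable positional encoding
    added to both. Illustrative assumptions only."""

    def __init__(self, channels: int, height: int, width: int, heads: int = 4):
        super().__init__()
        # Positional encoding shared by encoder and decoder tokens, so the
        # attention can exploit absolute and relative position information.
        self.pos = nn.Parameter(torch.zeros(1, height * width, channels))
        self.attn = nn.MultiheadAttention(channels, heads, batch_first=True)
        self.norm = nn.LayerNorm(channels)

    def forward(self, decoder_feat: torch.Tensor, encoder_feat: torch.Tensor) -> torch.Tensor:
        n, c, h, w = decoder_feat.shape
        q = decoder_feat.flatten(2).transpose(1, 2) + self.pos   # (N, H*W, C) decoder queries
        kv = encoder_feat.flatten(2).transpose(1, 2) + self.pos  # (N, H*W, C) encoder keys/values
        # Decoder queries attend to the encoder skip features: useful encoder
        # information is strengthened, irrelevant information is suppressed.
        attended, _ = self.attn(q, kv, kv)
        out = self.norm(q + attended)
        return out.transpose(1, 2).reshape(n, c, h, w)

# Usage at one decoder stage, in place of plain concatenation of the skip features.
dec = torch.randn(2, 64, 28, 28)
enc = torch.randn(2, 64, 28, 28)
fused = CrossAttentionSkip(64, 28, 28)(dec, enc)  # same shape as the decoder features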
To address the complementary limitations of CNN-based methods, which have a strong inductive bias but lack global modeling capability, and Transformer-based methods, which excel at global feature extraction but have an insufficient inductive bias, this paper proposes Swin-IBNet, a method that combines CNN and Transformer architectures. Swin-IBNet balances local and global feature extraction, leveraging the strengths of both approaches. Its decoder reuses the decoder of Swin-Unet, while its encoder integrates a Feature Fusion Block (FFB) and a novel Multi-Scale Feature Aggregation (MSFA) module to facilitate information interaction. The method is validated on public datasets including Synapse, ISIC, and the Automated Cardiac Diagnosis Challenge (ACDC). Experimental results demonstrate that Swin-IBNet outperforms Swin-Unet and several commonly used methods. Notably, on the Synapse dataset, Swin-IBNet achieves a DSC score 3.45% higher than that of Swin-Unet, together with an HD score of 17.46, indicating that the predicted segmentations are closer to the true shapes.
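To make the hybrid design more concrete, the following minimal sketch fuses a convolutional branch (local detail, strong inductive bias) with a Transformer branch (global modeling). It only illustrates the general idea behind Swin-IBNet's encoder and does not reproduce the actual FFB, MSFA, or Swin windowed attention; every layer choice here is an assumption.

import torch
import torch.nn as nn

class ConvTransformerFusion(nn.Module):
    """Sketch of a hybrid block: a convolutional branch captures local
    structure, a self-attention branch models global interactions, and a
    1x1 convolution mixes the two. Illustrative assumptions only."""

    def __init__(self, channels: int, heads: int = 4):
        super().__init__()
        # Local branch: ordinary convolution with a strong inductive bias.
        self.local = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
        )
        # Global branch: self-attention over all spatial positions.
        self.global_attn = nn.TransformerEncoderLayer(
            d_model=channels, nhead=heads, dim_feedforward=2 * channels,
            batch_first=True, norm_first=True,
        )
        # Fusion: concatenate both branches and mix them with a 1x1 convolution.
        self.fuse = nn.Conv2d(2 * channels, channels, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        n, c, h, w = x.shape
        local = self.local(x)
        tokens = x.flatten(2).transpose(1, 2)            # (N, H*W, C)
        global_feat = self.global_attn(tokens)           # global interaction between positions
        global_feat = global_feat.transpose(1, 2).reshape(n, c, h, w)
        return self.fuse(torch.cat([local, global_feat], dim=1))

# Usage inside one encoder stage:
x = torch.randn(2, 96, 56, 56)
y = ConvTransformerFusion(96)(x)  # same spatial size, fused local and global features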