| Brain tumors are common, heterogeneous intracranial tumors with high incidence and extremely high mortality, and they seriously endanger human life and health. Early diagnosis and treatment are therefore crucial to improving patients’ survival expectations. Magnetic Resonance Imaging (MRI) has become one of the main methods for brain tumor detection and diagnosis because it involves no ionizing radiation and provides high soft-tissue contrast. However, the shape, size, and structure of brain tumors vary widely, and locating them with traditional manual segmentation is difficult. Automatic and accurate brain tumor segmentation therefore remains a challenging task. With the continuous development of deep learning, deep learning-based brain tumor image segmentation methods have demonstrated strong performance. However, these methods still have several problems: (1) Most models based on convolutional neural networks simply stack the four MRI modalities and feed them directly into the network without considering the relationships between modalities. (2) Methods based on convolutional neural networks and Transformers have large parameter counts and high computational costs when processing 3D medical images, and the resulting demands on the training environment make them difficult to apply at scale. (3) Most deep learning-based models are designed and refined for specific tasks and have complex structures, which results in poor generalization. To address these problems, this paper proposes two convolutional neural network-based brain tumor MRI image segmentation models, namely BEA-UNet and MPTr-UNet. The effectiveness of the proposed models was validated on a 3D brain tumor MRI dataset. The main research content of this paper is as follows: (1) We propose a convolutional attention network model, BEA-UNet, based on the 3D U-Net with a dual-encoder structure. The model 
adds a second encoding path to the 3D U-Net, forming a dual-encoder structure. Each encoder receives two MRI modalities for feature extraction, allowing it to focus on the features of different tumor regions. A Feature Aggregation (FA) module then aggregates the deep semantic information of the different modalities. To limit the loss of local information during convolutional downsampling, a Space Channel Shortcut Attention (SCSA) module is embedded in each skip connection. This allows the network to focus more on the shallow details lost during downsampling and enhances the interaction between shallow details and deep semantic information during decoding. Extensive experiments were conducted on the BraTS 2019, 2020, and 2021 datasets to evaluate the segmentation performance of the proposed model. The results show that, compared with most convolutional neural network-based models, BEA-UNet performs better in brain tumor MRI image segmentation tasks. (2) We propose a convolutional neural network model, MPTr-UNet, with a multi-path Transformer encoder structure. The model adopts a U-shaped encoder-decoder structure. First, the encoder uses successive downsampling convolutional layers to extract local information from the image. Then, two consecutive 3D MultiPath Transformer (3D MPTr) modules are used at the bottom of the encoder to further extract global information. At the end of the 3D MPTr module, feature interaction is used to fully fuse the convolutional local information with the global contextual information and obtain more comprehensive feature representations. Finally, the decoder combines the encoded feature maps of the same resolution from the skip connections and produces the final full-resolution segmentation result through layer-by-layer upsampling. Extensive experimental results on the BraTS 2019, 2020, and 2021 datasets show that MPTr-UNet has significant advantages over convolutional neural network-based models and is also highly competitive compared 
to models that combine convolutional neural networks and Transformers. |
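
The dual-encoder layout described for BEA-UNet can be illustrated with a small shape-bookkeeping sketch. It is not the authors' implementation. The modality pairing (T1 with T1ce, T2 with FLAIR), the 128×128×128 input size, the base width of 16 channels, the four-stage depth, and the use of channel concatenation as a stand-in for the FA module are all assumptions made for illustration, and the function names are hypothetical.

```python
# Illustrative shape bookkeeping only; every name and size here is an assumption,
# not a value taken from the paper.

MODALITIES = ["T1", "T1ce", "T2", "FLAIR"]  # the four BraTS MRI modalities


def split_modalities(modalities, first_group):
    """Partition the modalities into two groups, one per encoder (order preserved)."""
    group_a = [m for m in modalities if m in first_group]
    group_b = [m for m in modalities if m not in first_group]
    return group_a, group_b


def encoder_shapes(spatial, base_width=16, depth=4):
    """Return the (channels, D, H, W) shape produced by each encoder stage.

    Assumes every stage after the first halves each spatial dimension
    and doubles the channel count.
    """
    shapes = []
    channels, size = base_width, tuple(spatial)
    for stage in range(depth):
        if stage > 0:
            size = tuple(s // 2 for s in size)
            channels *= 2
        shapes.append((channels, *size))
    return shapes


def aggregate_bottleneck(shape_a, shape_b):
    """Stand-in for feature aggregation: concatenate along the channel axis."""
    if shape_a[1:] != shape_b[1:]:
        raise ValueError("spatial sizes must match before aggregation")
    return (shape_a[0] + shape_b[0], *shape_a[1:])


group_a, group_b = split_modalities(MODALITIES, {"T1", "T1ce"})  # hypothetical pairing
shapes_a = encoder_shapes((128, 128, 128))
shapes_b = encoder_shapes((128, 128, 128))
bottleneck = aggregate_bottleneck(shapes_a[-1], shapes_b[-1])
print(group_a, group_b, bottleneck)
```

Under these assumptions each encoder sees only its own pair of modalities, and the two deepest feature maps are merged once at the bottleneck. The shallower stage shapes in `shapes_a` and `shapes_b` are the ones that skip connections would carry to the decoder.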