| Cracks are a common type of distress in asphalt road surfaces. Research on automated crack identification for asphalt pavements has considerable practical significance, because it provides the groundwork for road maintenance. Traditional digital image processing and recognition methods are limited in most complex asphalt pavement crack identification tasks, and classical convolutional neural networks still leave considerable room for improvement in identification performance. This thesis therefore addresses crack classification filtering and semantic segmentation using Transformer networks as the foundation. The main research work is summarized as follows: (1) This thesis investigates asphalt pavement crack image classification filtering models based on Transformer networks, covering Vision Transformer (ViT) and its variants. The model consists of five parts: a pavement image input layer, a linear mapping layer, an embedding layer, a Transformer computation layer, and a classification result output layer. First, the asphalt pavement images, preprocessed with grayscale conversion, data augmentation, and other techniques, are split into patches and flattened in the pavement image input layer. The linear mapping layer then projects each flattened patch to a vector, and the embedding layer adds positional encoding and special tokens so that the image information can be concatenated into one sequence. After normalization and multi-head attention in the Transformer computation layer, the multi-layer perceptron heads in the classification result output layer judge whether cracks are present in the input asphalt pavement image. Experimental results on the self-collected SA dataset show that ViT and its variants filter asphalt pavement crack images better than the ResNet50 network, with a higher true negative rate, accuracy, and F1 score. (2) To address the low accuracy of classical convolutional networks in crack segmentation, an asphalt pavement crack image segmentation model based on TISU_Net is proposed. TISU_Net is an improved version of TransUNet, a U-shaped network with an encoder-decoder structure. The encoder of TISU_Net combines Inception V3 with Transformer blocks to extract crack image features at a finer level. To mitigate overfitting and vanishing gradients during training, ResNet skip connections and SE attention modules are concatenated between the encoder and decoder, allowing the network to focus more on crack features. To address insufficient training of shallow layers and slow convergence, a deep supervision mechanism is introduced in the decoder to assist pixel-level classification. Experimental results on the Gaps384 dataset and the SA dataset show that TISU_Net achieves better segmentation performance, with higher mIoU values than the compared methods, and adapts to complex pavement images containing interference and noise. |
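
The first three stages of the classification model described in (1), namely patch splitting and flattening, linear mapping, and embedding, can be illustrated with a minimal pure-Python sketch. It is not the thesis implementation: the function names `split_into_patches`, `linear_map`, and `embed_image` are hypothetical, the randomly drawn weights stand in for parameters that would be learned during training, and the special token is assumed to be the class token used by the standard ViT.

```python
import random


def split_into_patches(image, patch):
    """Split an H x W grayscale image (list of rows) into flattened patches."""
    height, width = len(image), len(image[0])
    if height % patch or width % patch:
        raise ValueError("image size must be a multiple of the patch size")
    patches = []
    for top in range(0, height, patch):
        for left in range(0, width, patch):
            # Flatten one patch row by row into a single vector.
            flat = [image[r][c]
                    for r in range(top, top + patch)
                    for c in range(left, left + patch)]
            patches.append(flat)
    return patches


def linear_map(vector, weights):
    """Project a flattened patch with a (dim x patch*patch) weight matrix."""
    return [sum(w * x for w, x in zip(row, vector)) for row in weights]


def embed_image(image, patch, dim, seed=0):
    """Return the token sequence that would enter the Transformer layers.

    Assumption: random values replace the learned projection weights,
    class token, and positional encoding.
    """
    rng = random.Random(seed)
    in_dim = patch * patch
    weights = [[rng.gauss(0.0, 0.02) for _ in range(in_dim)]
               for _ in range(dim)]
    class_token = [rng.gauss(0.0, 0.02) for _ in range(dim)]

    # Linear mapping layer: one vector per flattened patch.
    tokens = [linear_map(p, weights) for p in split_into_patches(image, patch)]
    # Embedding layer: prepend the special token, then add positions.
    tokens = [class_token] + tokens
    positions = [[rng.gauss(0.0, 0.02) for _ in range(dim)] for _ in tokens]
    return [[t + p for t, p in zip(tok, pos)]
            for tok, pos in zip(tokens, positions)]
```

For a 4 x 4 image with 2 x 2 patches, `embed_image(image, patch=2, dim=3)` yields five tokens of length three: one special token plus four patch tokens. In the full model this sequence would then pass through normalization and multi-head attention before the classification head.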