As a permanent joining technology, welding is widely used in rail transportation, aerospace, machinery manufacturing and other fields, and weld quality directly affects the performance of the welded structure. At present, weld inspection is still mainly performed manually. With the national push toward intelligence, automation and flexibility, research is moving toward automatic identification of welding defects. Existing methods often detect defects from the weld pool image alone and ignore its relationship with welding current, voltage and arc sound. Considering the correlation between the different kinds of information in the welding process, a multi-modal welding defect recognition method is proposed. A convolutional neural network with three branches is constructed to process the weld pool image, the welding sound, and the welding current and voltage respectively, and their complementarity is used to improve recognition performance. On the basis of the convolutional neural network, a two-channel attention mechanism is added, and a cross-modal attention (CMA) mechanism is proposed, which uses the sound and current-voltage features to enhance the key areas of the weld pool image. To verify the stability and reliability of the model, weld pool images, sound, current and voltage data were collected in an actual high-speed train body production workshop, and a data set containing 10 kinds of welding defects and actions was built. Experiments on this data set show that the model can effectively detect the different welding defects and actions. Firstly, the influence of welding current, voltage and sound on the network is verified experimentally: when identifying swing, slag inclusion and welding deviation defects, the model using all three modalities improves by 6.6%, 6.6% and 12.2% respectively over the model using the weld pool image alone. Secondly, in the convolutional neural network without any attention mechanism, the F value of defect recognition is above 91.6%. Furthermore, when the output of the shallow layers (those near the input layer) is fed into the CBAM and the output of the deep layers (those near the output layer) is fed into the CMA, both achieve better recognition performance and are highly complementary to each other. Finally, to explore how the attention mechanism focuses on key positions in the weld pool, the model was analyzed visually, and a heat map was overlaid on the weld pool image to show the molten pool region more clearly. In summary, the precision P, recall R and F value satisfy industrial requirements, so the method provides a reference for real-time welding classification and identification and is helpful for welding quality assessment.
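
The P, R and F values reported above can be illustrated with a minimal sketch. It assumes that the F value is the usual harmonic mean of precision and recall, which the abstract does not define explicitly. The function name `per_class_prf`, the class labels and the sample predictions are hypothetical and are not taken from the data set.

```python
from collections import Counter


def per_class_prf(y_true, y_pred):
    """Return {label: (precision, recall, f_value)} for every label seen.

    Assumption: F is the harmonic mean of precision and recall (F1).
    """
    tp, fp, fn = Counter(), Counter(), Counter()
    for true_label, pred_label in zip(y_true, y_pred):
        if true_label == pred_label:
            tp[true_label] += 1
        else:
            fp[pred_label] += 1  # predicted this class, but it was wrong
            fn[true_label] += 1  # missed the true class

    scores = {}
    for label in sorted(set(y_true) | set(y_pred)):
        p_den = tp[label] + fp[label]
        r_den = tp[label] + fn[label]
        precision = tp[label] / p_den if p_den else 0.0
        recall = tp[label] / r_den if r_den else 0.0
        pr_sum = precision + recall
        f_value = 2 * precision * recall / pr_sum if pr_sum else 0.0
        scores[label] = (precision, recall, f_value)
    return scores


# Hypothetical labels and predictions, for illustration only.
y_true = ["swing", "swing", "slag", "slag", "deviation", "deviation"]
y_pred = ["swing", "slag", "slag", "slag", "deviation", "swing"]

scores = per_class_prf(y_true, y_pred)
for label, (p, r, f) in scores.items():
    print(f"{label}: P={p:.3f} R={r:.3f} F={f:.3f}")
```

Computing the scores per class rather than as one overall figure mirrors the way the abstract reports separate improvements for individual defects such as swing, slag inclusion and welding deviation.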