| Since the outbreak of COVID-19 in 2020, colleges and universities have had to move teaching online for epidemic prevention and control. A large number of mathematical expressions appear in this teaching process as non-editable images, which causes considerable trouble for teachers and students. At the same time, more and more records are kept as electronic documents. However, because of issues such as device compatibility, most expressions in these documents exist in non-editable forms such as images, and readers cannot directly edit or copy them. Research on mathematical expression recognition is therefore important for supporting online education and for making mathematical expressions easier to reuse. The main work of this paper covers the following three aspects. (1) A printed mathematical expression character recognition model based on object detection, SE_YOLOv4, is proposed. A printed mathematical expression dataset (PME) was constructed from expressions in everyday use. Because traditional expression recognition involves cumbersome steps, this paper uses object detection to merge the two major steps of character segmentation and character recognition, simplifying the recognition pipeline. To improve character recognition accuracy, first, an SE module is embedded in the backbone network of the model so that the input expression images are used more efficiently; second, to obtain more semantic information and improve feature utilization, dense blocks are added to the second half of the backbone network; finally, the activation function of the backbone network is improved, with Hard-Swish adopted as the activation function of SE_YOLOv4 after experimental verification. The SE_YOLOv4 model achieves a recognition accuracy of 98.96% on the PME dataset, which is 1.86% higher than the baseline YOLOv4 model and 9.29% and 4.84% higher than the Up-Detr and Bidet models, respectively. (2) A printed mathematical expression recognition model based on an encoder-decoder architecture, DDFT (Mathematical Expression Recognition with Decoupled Dynamic Filter and Transformer), is proposed. The self-built PME dataset is re-annotated for training and testing printed mathematical expression recognition models. Decoders based on recurrent neural networks (RNNs) are prone to vanishing gradients when recognizing expression sequences; to address this, this paper adopts a Transformer-based decoder, which effectively alleviates the problem. Ordinary convolution is content-agnostic, which can leave the model unable to extract features specific to an individual mathematical character. To address this, a decoupled dynamic filter (DDF) replaces the filter of the standard convolution. The resulting operation, called DDF convolution, is content-adaptive without additional computational burden. When an expression is recognized from left to right, recognition quality tends to be unbalanced between the beginning and the end of the expression; a bidirectional training method is adopted to alleviate this imbalance. The DDFT model achieves a recognition accuracy of 94.00% on the self-built PME dataset, which is 2.75% and 2.40% higher than the BTTR and SAN models, respectively. (3) A handwritten mathematical expression recognition model based on optimized Transformer coverage attention, HMCO (Handwritten Mathematical Expression Recognition with Coverage Message), is proposed. Handwritten mathematical expressions are slightly harder to recognize than printed ones because of individual writing styles, but the recognition approach is basically the same. The printed-expression model DDFT is therefore transferred to handwritten expression recognition and adapted accordingly. First, for the encoder, the DDFT encoder is retained, since its content-adaptive convolution helps extract distinguishing features of similar-looking characters in handwritten mathematical expressions. Second, for the decoder, an attention refinement block (ARB) is added to the Transformer model. The ARB refines the original attention mechanism of the Transformer under various coverage mechanisms, so that the refined model allocates more attention to characters that have not yet been parsed, which effectively alleviates the insufficient coverage of the Transformer when recognizing handwritten expressions. Experiments show that, compared with Dense WAP, Dense WAP-TD, BTTR, and other models, the HMCO model achieves the best performance, with recognition accuracies of 59.88%, 60.26%, and 63.58% on the CROHME 2014, CROHME 2016, and CROHME 2019 test sets, respectively. |
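
The two backbone changes named in aspect (1), the SE module and the Hard-Swish activation, can be illustrated with a minimal sketch. It assumes that "SE" refers to the standard squeeze-and-excitation block (global average pooling, a two-layer bottleneck, and a sigmoid gate per channel) and uses the usual definition Hard-Swish(x) = x * ReLU6(x + 3) / 6. The function names, the nested-list feature-map layout, and the weights are hypothetical and chosen for illustration; this is not the SE_YOLOv4 implementation.

```python
import math


def hard_swish(x):
    """Hard-Swish activation: x * ReLU6(x + 3) / 6."""
    return x * min(max(x + 3.0, 0.0), 6.0) / 6.0


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def dense(vec, weights, bias):
    """Fully connected layer; `weights` holds one row per output unit."""
    return [sum(w * v for w, v in zip(row, vec)) + b
            for row, b in zip(weights, bias)]


def se_block(feature_maps, w1, b1, w2, b2):
    """Reweight the channels of a C x H x W feature map (nested lists).

    Assumed standard squeeze-and-excitation layout, not taken from the source.
    Returns the rescaled feature map and the per-channel gates.
    """
    # Squeeze: global average pooling gives one scalar per channel.
    squeezed = [sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
                for channel in feature_maps]
    # Excitation: bottleneck layer with ReLU, then a sigmoid gate per channel.
    hidden = [max(h, 0.0) for h in dense(squeezed, w1, b1)]
    gates = [sigmoid(g) for g in dense(hidden, w2, b2)]
    # Scale: multiply every value in a channel by that channel's gate.
    scaled = [[[gate * value for value in row] for row in channel]
              for channel, gate in zip(feature_maps, gates)]
    return scaled, gates


# Illustrative call: two 2x2 channels, one hidden unit, hand-picked weights.
example_maps = [[[1.0, 2.0], [3.0, 4.0]], [[2.0, 2.0], [2.0, 2.0]]]
example_scaled, example_gates = se_block(
    example_maps,
    w1=[[0.5, -0.5]], b1=[0.0],
    w2=[[1.0], [-1.0]], b2=[0.0, 0.0],
)
```

The gates lie strictly between 0 and 1, so the block only rescales channels relative to one another; in a trained network the bottleneck weights decide which channels are emphasized for a given input image.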