| Mathematical expressions (MEs), as a unique type of text, are widely prevalent in educational and office scenarios. Handwritten mathematical expression recognition (HMER), an important problem in the field of pattern recognition, has broad application prospects in intelligent education, office automation, and human-computer interaction. With the development of deep learning, breakthroughs have been made in text recognition. However, HMER has not been completely solved. Compared with general text, MEs have complex structures and two-dimensional positional relationships between symbols. Symbols at different positions in an ME vary in scale, which makes multi-scale recognition difficult. Meanwhile, some symbols in handwritten mathematical expressions (HMEs) are similar in shape, which makes similar symbols hard to distinguish. In addition, the positional relationships between symbols carry specific mathematical meanings, so recognizing an ME requires parsing its tree structure to determine these relationships. To address these issues, this thesis focuses on multi-scale recognition, similar symbol distinguishing, and tree structure parsing, with specific content and contributions as follows: (1) For the multi-scale recognition problem in HMEs, this thesis proposes a scale augmentation method and a drop attention method. The scale augmentation method enriches the scale diversity of the training data by randomly scaling it, which trains the model to recognize ME symbols at different scales. Because multi-scale symbols reduce the accuracy of the attention masks generated by the model, which in turn degrades symbol recognition, the drop attention method randomly adjusts the weights of the generated attention masks during training so that the model recognizes symbols more reliably when the attention masks are biased. Experimental results show that the proposed method
improves the robustness of the model and effectively addresses the multi-scale recognition problem. (2) For the similar symbol distinguishing problem in HMEs, this thesis proposes a path signature feature extraction method, a language model correction method, and a model ensemble method. The path signature feature extraction method extracts local features of symbol writing trajectories to distinguish similar symbols; the language model correction method refines the recognition results of easily confused symbols based on the context symbols predicted by the recognition model; the model ensemble method combines the recognition results for similar symbols from multiple models, which alleviates model fitting bias, and introduces dynamic time warping to handle the inconsistent lengths of the prediction sequences produced by different models. Experimental results show that the proposed method effectively addresses the similar symbol distinguishing problem and significantly improves recognition performance. (3) For the tree structure parsing problem in HMEs, this thesis proposes a tree-based model with branch-parallel decoding. This method models MEs as tree-structured data objects, recognizes symbol categories and the positional relationships between symbols through a tree model, and constructs the nodes and edges of the ME tree. Meanwhile, the method exploits the independence of tree branches to decode the branches of the ME tree in parallel, which reduces the number of decoding steps and alleviates the long-sequence decoding problem of the attention mechanism. Experimental results show that the proposed method effectively addresses the tree structure parsing problem and improves both the structure parsing performance and the recognition performance of the model. This thesis proposes new methods and ideas for HMER in the era of deep learning, in the hope that they will promote further development in ME recognition and in the recognition of other structured text. |
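
The role of dynamic time warping in the model ensemble can be illustrated with a minimal sketch. It is not the thesis's implementation: the function name `dtw_align`, the 0/1 mismatch cost between symbols, and the example token sequences are all assumptions made for illustration, and the abstract does not specify how aligned predictions are subsequently combined. The sketch only shows how two prediction sequences of different lengths can be put into correspondence so that their symbols can be compared position by position.

```python
def dtw_align(a, b):
    """Align two non-empty symbol sequences with dynamic time warping.

    Assumption: the local cost is 0 for identical symbols and 1 otherwise.
    Returns (total_cost, path), where path is a list of (i, j) index pairs
    meaning that a[i] is aligned with b[j].
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = 0.0 if a[i - 1] == b[j - 1] else 1.0
            cost[i][j] = local + min(
                cost[i - 1][j - 1],  # advance in both sequences
                cost[i - 1][j],      # advance in a only
                cost[i][j - 1],      # advance in b only
            )

    # Backtrack from the end to recover one optimal alignment path.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        candidates = [
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        ]
        _, i, j = min(candidates, key=lambda c: c[0])
    path.reverse()
    return cost[n][m], path


# Hypothetical predictions from two models for the same expression;
# the second model emitted one extra "2".
pred_a = ["x", "^", "2", "+", "1"]
pred_b = ["x", "^", "2", "2", "+", "1"]
total_cost, alignment = dtw_align(pred_a, pred_b)
for i, j in alignment:
    print(pred_a[i], "<->", pred_b[j])
```

Once the sequences are aligned, each symbol in one prediction has at least one counterpart in the other, so per-symbol decisions (for example, a vote over easily confused symbols) become possible even though the raw sequences differ in length.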