| As an important tool for communication and information transmission, text is an essential element of work and daily life. Today, text exists in many forms, including images, documents, speech, and video. Image text recognition is an important topic in computer vision and has a wide range of applications. It consists of two stages, text detection and text recognition: a text detection model first locates the text by predicting the coordinates of each text bounding box in the image, and a text recognition model then identifies the text content in each box. With the development of deep learning, many advanced detection and recognition models have been proposed, including models for the separate detection and recognition tasks and end-to-end text recognition models, and these support the detection and recognition of text in arbitrary orientations. However, their performance in practical application scenarios is often unsatisfactory. In certificate image information recognition, the certificate images themselves suffer from low resolution, complex backgrounds, and text that is difficult to detect, and the text in the images is subject to perspective distortion, blurring, and illegibility, so completely and correctly identifying all certificate information remains a challenge. To address these problems, the main research contents of this paper are as follows: (1) To address the low resolution, complex backgrounds, and difficult text detection of certificate images, this paper studies and implements an image text detection algorithm based on multi-scale feature cross fusion. A multi-scale feature cross fusion module is added to the feature extraction network; it connects high-resolution and low-resolution feature extraction branches in parallel and fuses multi-scale features in stages, so that a high-resolution feature map representation is maintained throughout and information lost during propagation is effectively supplemented. A richer high-resolution feature representation is thereby obtained before the adaptive region proposal network of the base detection model. Experiments on benchmark datasets and on self-collected datasets of ID cards, graduation certificates, and degree certificates show that the proposed algorithm can fuse spatial structure features with semantic features, maintain the high-resolution representation of feature maps, compensate for lost information, and improve the detection model's ability to detect text boxes.
(2) To address the perspective distortion, blurring, and illegibility of certificate image text, this paper studies and implements an image text recognition algorithm based on Chinese semantic information extraction and channel attention. A semantic information extraction module is added to the base recognition model, and the model's feature extraction network is improved. In the dataset preprocessing stage, a language model pre-trained on corpora of Chinese text, English text, and Arabic numerals is introduced to generate a sentence vector for each label text as an additional semantic feature. A sentence vector is also predicted from the sequence features produced by the recognition model's encoder, a semantic loss is then calculated, and the semantic information is added to the recognition model's decoder. In addition, an improved channel attention mechanism is added to the recognition model's encoder, so that the model pays more attention to the text regions in the image and suppresses background interference. Experiments on self-collected datasets of ID cards, graduation certificates, and degree certificates show that the proposed algorithm can effectively correct errors in the recognition results and improve the model's ability to recognize blurred characters.
(3) To address the poor recognition performance in certificate information recognition applications, this paper designs and implements an end-to-end certificate information recognition system. The system performs real-time or batch text detection and recognition on images of personal ID cards, graduation certificates, and degree certificates, and its structured output facilitates the automatic identification and filing of identity and academic degree information. The input is either a single image of an ID card, graduation certificate, or degree certificate, captured or scanned with a mobile phone, or a compressed archive containing multiple certificate image files; the output is structured text generated according to the corresponding rules, or an Excel file containing the information from multiple certificates. The system consists mainly of a certificate image upload module, an image preprocessing module, an image text detection module, an image text recognition module, and an information structuring module. The image preprocessing module detects the certificate in the image and corrects its rotation angle. The image text detection and image text recognition modules apply the detection and recognition algorithms proposed above, respectively. The information structuring module structures the recognized text and outputs it according to the rules. Functional and performance tests verify the functional completeness and effectiveness of the system for certificate image text recognition and show that it can be applied to real personal certificate information recognition and filing projects. |
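The semantic loss mentioned in contribution (2) can be illustrated with a minimal sketch. The abstract does not state the form of the loss, so the cosine-distance formulation below is an assumption, as are the function names `semantic_loss` and `total_loss` and the `semantic_weight` parameter. The idea shown is only that the sentence vector predicted from the encoder features is pulled toward the sentence vector the pre-trained language model produced for the label text.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def semantic_loss(predicted: Sequence[float], target: Sequence[float]) -> float:
    """Assumed form of the semantic loss: cosine distance, in the range [0, 2].

    predicted: sentence vector predicted from the encoder's sequence features.
    target:    sentence vector of the label text from the pre-trained language model.
    """
    return 1.0 - cosine_similarity(predicted, target)


def total_loss(recognition_loss: float,
               predicted: Sequence[float],
               target: Sequence[float],
               semantic_weight: float = 1.0) -> float:
    """Hypothetical combination: recognition loss plus a weighted semantic term."""
    return recognition_loss + semantic_weight * semantic_loss(predicted, target)
```

Under this assumed formulation, identical vectors give a semantic loss of 0 and orthogonal vectors give 1, so the extra term penalizes encoder features whose predicted semantics drift from the label text. In a real training setup the same computation would be done on batched tensors inside the deep learning framework rather than on Python lists.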