
Research and Implementation of Multi-Style Table Detection and Model Lightweighting

Posted on: 2022-05-11
Degree: Master
Type: Thesis
Country: China
Candidate: Z X Ning
Full Text: PDF
GTID: 2518306482989459
Subject: Computer Science and Technology
Abstract/Summary:
As an important part of optical character recognition (OCR), table detection plays an irreplaceable role in information extraction. With the popularization of office software, table styles are becoming increasingly varied, which poses a major challenge for table detection. At the same time, today's fast-paced working style pushes developers of artificial intelligence products toward convenient, efficient services and toward deploying those products on mobile devices so that more people can use them. This thesis targets multi-style table detection and explores a lightweight solution that can run on edge devices with almost no loss in performance. The main research contents are as follows:

(1) A multi-style table detection method. To address the drop in model performance caused by multi-style tables, this thesis proposes an improved YOLOv5 model based on adaptive anchors and an attention mechanism (A-YOLOv5). First, an adaptive anchor box generation algorithm that processes image texture features is proposed to generate anchor boxes. Next, the training sample selection strategy is optimized using the intersection between the anchor boxes and the tables, which improves the quality of the training samples. Finally, a channel attention mechanism is used to improve the backbone network so that the model makes better use of the high-quality training samples and detects tables of different styles more reliably. Compared with the baseline model, the F1 score of A-YOLOv5 on the ICDAR 2013 and ICDAR 2019 datasets increases by 2.3% and 2.7%, respectively. In addition, this thesis introduces a manually constructed multi-style table detection dataset, Finance Open Table (FOT), which contains richer table styles than other datasets. Experiments on this dataset also show the effectiveness of the proposed A-YOLOv5 model for multi-style table detection.

(2) A set of model lightweighting schemes suitable for multi-style table detection. Although the multi-style table detection method achieves good results, the model has a large number of parameters and is difficult to deploy on mobile devices. To optimize the model structure, this thesis first prunes the model with a channel-based network pruning method. Pruning, however, weakens the model's feature extraction ability, and performance suffers to some extent. To compensate for the lost feature extraction ability, this thesis proposes a knowledge distillation method based on a regional attention mechanism. With this method, the F1 score on FOT and ICDAR 2019 increases by 1.3% and 2.1%, respectively, compared with the original model. The results show that the method significantly reduces the performance loss caused by pruning.

(3) A mobile system for multi-style table detection. To further verify the solution, this thesis implements a table detection and recognition system, based on TorchScript, that can be deployed on mobile devices. In addition to table detection based on the A-YOLOv5 model, the system integrates table information extraction and personalized model upgrade functions. The system can be deployed on current mainstream mobile devices and reaches a processing speed of about 20 FPS on general-purpose devices.
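The abstract states that training samples are selected using the intersection between anchor boxes and tables, but it does not give the exact rule. The following Python sketch illustrates one common way such a selection can work, using intersection over union (IoU) with a fixed threshold. The function names, the `(x_min, y_min, x_max, y_max)` box layout, and the `0.5` threshold are assumptions made for illustration and are not taken from the thesis.

```python
from typing import List, Tuple

# Assumed box layout: (x_min, y_min, x_max, y_max) in pixels.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    # Width and height of the overlapping region (zero if the boxes are disjoint).
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h

    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def select_positive_anchors(
    anchors: List[Box], table: Box, threshold: float = 0.5
) -> List[int]:
    """Return indices of anchors whose IoU with the table box meets the threshold.

    The threshold value is a hypothetical choice, not one reported in the thesis.
    """
    return [i for i, anchor in enumerate(anchors) if iou(anchor, table) >= threshold]


if __name__ == "__main__":
    table_box = (0.0, 0.0, 10.0, 10.0)
    candidate_anchors = [
        (0.0, 0.0, 10.0, 10.0),    # identical to the table box
        (5.0, 0.0, 15.0, 10.0),    # overlaps half of the table box
        (20.0, 20.0, 30.0, 30.0),  # no overlap
    ]
    print(select_positive_anchors(candidate_anchors, table_box))
```

Raising the threshold keeps only anchors that align closely with a table, which gives fewer but cleaner positive samples; lowering it admits more loosely matching anchors.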
Keywords/Search Tags: Table Detection, YOLOv5, Model Lightweighting, Attention Mechanism