
Research On Deep Learning Based Object Detection For Text And Aerial Image

Posted on: 2021-03-31
Degree: Master
Type: Thesis
Country: China
Candidate: Y X Zhu
Full Text: PDF
GTID: 2392330602994315
Subject: Information and Communication Engineering
Abstract/Summary:
Object detection addresses how a machine can locate target objects in an image so that the result can be used in later processing steps. Its applications are broad and include smart city construction, express parcel sorting, intelligent remote sensing systems, and photo translation. Within this field, detecting quadrilateral and curved polygon objects is the harder problem. Horizontal object detection only needs to output the coordinates of the upper-left and lower-right vertices, whereas quadrilateral and curved polygon detection must output the coordinates of multiple vertices, which greatly increases the difficulty. In text and aerial images, labeling and detecting objects with horizontal bounding boxes includes a large amount of background, which makes accurate position information hard to obtain. These tasks therefore need quadrilaterals or curved polygons to represent object positions, and research on quadrilateral and curved polygon detection is of great significance for the practical value of text and aerial image object detection.

With the development of deep learning, deep learning based methods have been applied in many fields, and text and aerial image object detection has also made great progress. However, the shapes and angles of quadrilaterals and curved polygons are very complex, and the relationship between an object's vertices cannot be described with simple rules. This greatly limits detector performance on quadrilateral and curved polygon detection tasks. In view of the characteristics of quadrilaterals and curved polygons, this thesis improves previous algorithms built on the three mainstream representations of quadrilaterals or curved polygons, and proposes representations of position information that do not produce an ambiguous model loss in complex scenes. To improve the performance and efficiency of text and aerial image object detection models, this thesis makes the following contributions.

First, to address the difficulty of accurately locating quadrilaterals and curved polygons, this thesis proposes, within a deep learning based object detection method, to compute an object's position from coordinate points on its contour. It first analyzes theoretically the ambiguity of quadrilateral vertices, the redundancy of vertex coordinate information on the object contour, and the disadvantages of using horizontal IoU (intersection over union) to compute the IoU of quadrilateral objects. The vertex ambiguity and the coordinate redundancy are then resolved by locally sliding line-based point regression, which improves the model's localization and makes the output contour more accurate. In addition, a further classification and regression stage is added after the model generates rotated rectangular boxes in the first step, so that the model's final output is computed with quadrilateral IoU. This corrects classification errors caused by the inaccuracy of horizontal IoU.

Second, regressing multiple vertices on the object contour adds a large amount of computation, and some applications do not need the exact contour because a rotated rectangular bounding box is enough. To reduce the computational complexity of such tasks, this thesis proposes a representation of rotated rectangular boxes that reflects the periodicity of the angle. To resolve the ambiguity caused by the variability of the rotated rectangle's angle, it analyzes the angle's periodic variation and proposes an adaptive period embedding method that encodes the angle into two vectors with different periods, which improves the accuracy of the output angle. Further, to improve the model's recall for long objects, the proposed method computes IoU against a truncated part of the target box whose length equals that of the candidate box, so that long objects receive corresponding positive samples. This improves the model's performance on long objects.

Finally, although the improved IoU method raises performance on long objects, the gain is limited by the limited receptive field of convolutional neural networks. Instance segmentation methods based on fully convolutional networks do not need the receptive field to cover the entire object, but their major difficulty is how to correctly group the pixels that belong to the same object. For long object datasets, this thesis proposes to model the region between the border and the center of an object as a probability map: the border is 0, the center is 1, and the values transition smoothly from border to center. The border of the probability map accurately represents the contour position of the object. The growth direction of the probability map points toward the center of the object, and this direction can be used to group pixels belonging to the same object. This thesis also proposes a parallel computing method to improve the efficiency of the grouping process.
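The gap between horizontal IoU and quadrilateral IoU noted in the first contribution can be illustrated with a small sketch. This is not the thesis's implementation: it assumes convex quadrilaterals, uses standard Sutherland-Hodgman clipping for the intersection, and the names `signed_area`, `clip_convex`, `convex_iou`, `axis_aligned_box`, `strip_a`, and `strip_b` are hypothetical, chosen only for this example.

```python
from typing import List, Tuple

Point = Tuple[float, float]
Polygon = List[Point]


def signed_area(poly: Polygon) -> float:
    """Shoelace formula; positive when vertices run counter-clockwise."""
    total = 0.0
    for i, (x1, y1) in enumerate(poly):
        x2, y2 = poly[(i + 1) % len(poly)]
        total += x1 * y2 - x2 * y1
    return total / 2.0


def clip_convex(subject: Polygon, clipper: Polygon) -> Polygon:
    """Sutherland-Hodgman clipping of `subject` against a convex `clipper`."""
    if signed_area(clipper) < 0:  # the inside test below assumes CCW order
        clipper = clipper[::-1]

    def inside(p: Point, a: Point, b: Point) -> bool:
        # True when p lies on or to the left of the directed edge a -> b.
        return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0]) >= 0

    def intersect(p: Point, q: Point, a: Point, b: Point) -> Point:
        # Intersection of segment p-q with the infinite line through a and b.
        d1 = (q[0] - p[0], q[1] - p[1])
        d2 = (b[0] - a[0], b[1] - a[1])
        denom = d1[0] * d2[1] - d1[1] * d2[0]
        t = ((a[0] - p[0]) * d2[1] - (a[1] - p[1]) * d2[0]) / denom
        return (p[0] + t * d1[0], p[1] + t * d1[1])

    output = list(subject)
    for i in range(len(clipper)):
        if not output:
            break  # nothing left to clip: the polygons do not overlap
        a, b = clipper[i], clipper[(i + 1) % len(clipper)]
        points, output = output, []
        prev = points[-1]
        for cur in points:
            if inside(cur, a, b):
                if not inside(prev, a, b):
                    output.append(intersect(prev, cur, a, b))
                output.append(cur)
            elif inside(prev, a, b):
                output.append(intersect(prev, cur, a, b))
            prev = cur
    return output


def convex_iou(a: Polygon, b: Polygon) -> float:
    """IoU of two convex polygons (assumption: both inputs are convex)."""
    clipped = clip_convex(a, b)
    inter = abs(signed_area(clipped)) if len(clipped) >= 3 else 0.0
    union = abs(signed_area(a)) + abs(signed_area(b)) - inter
    return inter / union if union > 0 else 0.0


def axis_aligned_box(poly: Polygon) -> Polygon:
    """Horizontal bounding box of a polygon, as four CCW corners."""
    xs, ys = [p[0] for p in poly], [p[1] for p in poly]
    return [(min(xs), min(ys)), (max(xs), min(ys)),
            (max(xs), max(ys)), (min(xs), max(ys))]


# Two parallel, thin strips rotated by 45 degrees that do not touch each other.
strip_a = [(1.0, 0.0), (10.0, 9.0), (9.0, 10.0), (0.0, 1.0)]
strip_b = [(3.0, -2.0), (12.0, 7.0), (11.0, 8.0), (2.0, -1.0)]

print("quadrilateral IoU:", convex_iou(strip_a, strip_b))
print("horizontal IoU:   ",
      convex_iou(axis_aligned_box(strip_a), axis_aligned_box(strip_b)))
```

For these two strips the quadrilateral IoU is 0, while the IoU of their horizontal bounding boxes is 64/136 (about 0.47). Under a typical IoU threshold, a horizontal measure could therefore treat two disjoint long, rotated objects as a match, which is the kind of error the abstract attributes to horizontal IoU.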
Keywords/Search Tags: object detection in aerial images, text detection, quadrangle object detection, deep learning