Deep Learning-based Multi-oriented Object Detection

Posted on: 2022-04-24
Degree: Master
Type: Thesis
Country: China
Candidate: M T Fu
Full Text: PDF
GTID: 2492306572980269
Subject: Automation Technology
Abstract/Summary:
Object detection is a fundamental research topic in computer vision. It aims to find all objects of interest in an image and to output their locations and categories. Unlike generic object detection, which describes object locations with horizontal bounding boxes, multi-oriented object detection must also recover each object's orientation. With the development of deep learning, generic object detection has made substantial progress. However, horizontal bounding box detectors are not well suited to objects with a pronounced orientation, such as vehicles in aerial images and scene text. Detecting such objects accurately plays an important role in fields such as urban planning, security, traffic forecasting, and automatic reading.

Existing multi-oriented object detection algorithms are generally built on one of two regression schemes: vertex regression, whose regression targets are the vertex coordinates of a quadrilateral bounding box, and rotated rectangle regression, whose regression target is a horizontal bounding box together with a rotation angle. Both have inherent problems. Vertex regression must impose an explicit ordering on the vertices, which does not reflect the fact that the vertices of a real object have no inherent order. Rotated rectangle regression must restrict the angle to an interval whose two endpoints differ greatly in value, and so cannot represent the periodic nature of the angle. The essence of the problem is that a single real-world case may correspond to multiple regression targets, a one-to-many ambiguity. On the one hand, this ambiguity confuses the regression target during training and makes training difficult; on the other hand, it makes predictions inaccurate near the ambiguity points at test time.

To address the training difficulty caused by regression target ambiguity, we propose a multi-oriented object detection algorithm based on gliding vertices on the horizontal bounding box. Compared with vertex regression, it reduces the number of ambiguous regression targets, which eases training and makes predictions more accurate and stable. To address the prediction inaccuracy near ambiguity points, we propose a divide-and-conquer strategy based on an obliquity factor. The obliquity factor estimates whether an object lies near an ambiguity point, and the divide-and-conquer strategy chooses the output according to this estimate, correcting predictions at ambiguity points so that objects in arbitrary orientations are detected accurately. Finally, we implement the proposed method on the classic Faster R-CNN framework and evaluate it on five datasets across three sub-fields: object detection in aerial images, scene text detection, and pedestrian detection in fisheye images. Without bells and whistles, the proposed method achieves superior performance on all benchmarks at high speed. The experiments show that the ambiguity issue is generic and that our method is simple and effective. We also carry out a detailed experimental analysis of each component to verify its contribution.
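The one-to-many ambiguity and the gliding-vertex idea can be illustrated with a small Python sketch. The abstract does not give the exact formulas, so everything here is an assumption made for illustration: each vertex is encoded as a ratio of how far it has slid along one side of the horizontal bounding box, the obliquity factor is taken as the ratio of the object's area to the area of its horizontal bounding box, and the threshold of 0.8 is a hypothetical value. All function names are likewise hypothetical.

```python
import math

def rotated_rect_corners(cx, cy, w, h, angle_deg):
    """Corners of a w-by-h rectangle rotated by angle_deg about its centre (cx, cy)."""
    t = math.radians(angle_deg)
    c, s = math.cos(t), math.sin(t)
    half = [(-w / 2, -h / 2), (w / 2, -h / 2), (w / 2, h / 2), (-w / 2, h / 2)]
    return [(cx + x * c - y * s, cy + x * s + y * c) for x, y in half]

def same_box(a, b, tol=1e-6):
    """True if every corner in a coincides with some corner in b (order ignored)."""
    return all(any(math.dist(p, q) < tol for q in b) for p in a)

def polygon_area(pts):
    """Shoelace formula for the area of a simple polygon."""
    n = len(pts)
    twice = sum(pts[i][0] * pts[(i + 1) % n][1] - pts[(i + 1) % n][0] * pts[i][1]
                for i in range(n))
    return abs(twice) / 2

def encode(quad):
    """Assumed encoding: horizontal box, four gliding ratios, obliquity factor."""
    xs = [p[0] for p in quad]
    ys = [p[1] for p in quad]
    xmin, ymin, xmax, ymax = min(xs), min(ys), max(xs), max(ys)
    w, h = xmax - xmin, ymax - ymin
    # Image coordinates: y grows downward, so the "top" vertex has the smallest y.
    top = min(quad, key=lambda p: p[1])
    right = max(quad, key=lambda p: p[0])
    bottom = max(quad, key=lambda p: p[1])
    left = min(quad, key=lambda p: p[0])
    alphas = ((top[0] - xmin) / w, (right[1] - ymin) / h,
              (xmax - bottom[0]) / w, (ymax - left[1]) / h)
    obliquity = polygon_area(quad) / (w * h)
    return (xmin, ymin, xmax, ymax), alphas, obliquity

def decode(box, alphas, obliquity, threshold=0.8):
    """Divide and conquer: nearly horizontal objects fall back to the horizontal box."""
    xmin, ymin, xmax, ymax = box
    if obliquity > threshold:  # hypothetical threshold, not taken from the thesis
        return [(xmin, ymin), (xmax, ymin), (xmax, ymax), (xmin, ymax)]
    w, h = xmax - xmin, ymax - ymin
    a1, a2, a3, a4 = alphas
    return [(xmin + a1 * w, ymin), (xmax, ymin + a2 * h),
            (xmax - a3 * w, ymax), (xmin, ymax - a4 * h)]

if __name__ == "__main__":
    # Two different (w, h, angle) targets describe the same physical box.
    a = rotated_rect_corners(50, 40, 40, 20, 30)
    b = rotated_rect_corners(50, 40, 20, 40, 120)
    print("same box under two rotated-rectangle targets:", same_box(a, b))
    box, alphas, r = encode(a)
    print("round trip recovers the box:", same_box(decode(box, alphas, r), a))
```

In this sketch a clearly tilted rectangle is recovered from its gliding ratios, while a rectangle rotated by only a degree or so has an obliquity factor close to 1 and is returned as its horizontal bounding box, which sidesteps the unstable vertex assignment near the ambiguity point.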
Keywords/Search Tags: Object Detection, Multi-oriented Objects, Aerial Images, Scene Texts, Fisheye Images