Research On 3D Lane Detection Algorithm Based On Spatial And Temporal Fusion

Posted on: 2024-04-02
Degree: Master
Type: Thesis
Country: China
Candidate: Y Wang
Full Text: PDF
GTID: 2542307064983419
Subject: Vehicle Engineering
Abstract/Summary:
This paper proposes a 3D lane detection method based on spatio-temporal information fusion (referred to as STLane3D). Compared with traditional 2D lane detection methods, STLane3D directly predicts lane positions in 3D space and leverages the strong spatio-temporal continuity between consecutive frames to enhance the perception of lane features. Compared with existing 3D lane detection methods, STLane3D performs 3D lane detection in the camera coordinate system, which simplifies the model and reduces its dependence on real-time camera pose information. To arrive at STLane3D, this paper "upgrades" the traditional 2D lane detection task step by step: from 2D front-view lane detection to 2.5D BEV lane detection, then to 3D lane detection, and finally to spatio-temporal 3D lane detection. The specific research contents are as follows:

(1) This paper studies conventional 2D lane detection methods, that is, lane detection models that operate on the original camera images (i.e., the front view). A pixel-level semantic segmentation model and a fixed-lane-width semantic segmentation model are compared experimentally on the Apollo simulation dataset, and they serve as the first-stage network of the subsequent 3D lane detection model.

(2) This paper explores several perspective transformation methods from the front view to the bird's-eye view (BEV), including the LSS algorithm, the inverse perspective transformation algorithm, a deformable attention module, and a multi-layer perceptron (MLP), and builds a corresponding BEV lane detection model for each. BEV ground truth is generated on the Apollo simulation dataset for verification, and the advantages and disadvantages of the different methods are compared in order to select a suitable perspective transformation method for the 3D lane detection model.

(3) Building on the BEV lane detection model, this paper constructs a multi-frame 3D lane detection model that fuses spatio-temporal information, using the real-world 3D lane datasets ONCE and OpenLane, and proposes a feature-level pre-alignment method, a spatio-temporal information fusion method, and a 3D lane loss function. The model is compared with state-of-the-art baseline models on the ONCE and OpenLane datasets, and extensive ablation experiments are conducted to verify the effectiveness of the proposed method. STLane3D achieves state-of-the-art results on both ONCE and OpenLane, with F1 scores of 77.53% and 50.55%, respectively. Owing to the simplified model, the algorithm also improves real-time performance, with an inference speed of 63 FPS (single frame) on a single RTX 3060 GPU.

(4) This paper explores an anchor-free road structure perception scheme based on local semantic maps. A local road structure annotation format derived from high-precision maps is adopted to address the difficulty of annotating data and the inherent defects of anchor priors in existing schemes. An anchor-free version of the detection scheme is designed on the basis of STLane3D and verified on the nuScenes benchmark, where it achieves state-of-the-art results: the average AP reaches 46% across three sub-categories (crosswalk, lane, and curb). Owing to the simplified model structure, the algorithm also improves real-time performance significantly, with an inference speed of 65 FPS on a single RTX 3060 GPU.

In summary, the proposed STLane3D method achieves excellent performance in the 3D lane detection task together with significant improvements in real-time performance, and has high practical value. This paper also explores the future development direction of data-driven perception schemes, providing new ideas and methods for related research fields.
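Item (2) of the abstract names the inverse perspective transformation as one of the front-view-to-BEV conversions that were compared. The abstract does not describe how it was implemented, so the following is only a minimal sketch of the textbook flat-ground version. The function names, the pinhole camera parameters (`fx`, `fy`, `cx`, `cy`), the camera height, and the pitch value are all assumptions chosen for illustration, not values from the thesis.

```python
import math

def ground_to_pixel(x_fwd, y_left, fx, fy, cx, cy, cam_height, pitch):
    """Project a flat-ground point (metres, x forward, y left) to a pixel (u, v)."""
    c, s = math.cos(pitch), math.sin(pitch)
    # Camera frame: x right, y down, z forward; the camera looks down by `pitch`.
    xc = -y_left
    yc = cam_height * c - x_fwd * s
    zc = cam_height * s + x_fwd * c
    if zc <= 0:
        raise ValueError("point is behind the camera")
    return cx + fx * xc / zc, cy + fy * yc / zc

def pixel_to_ground(u, v, fx, fy, cx, cy, cam_height, pitch):
    """Inverse perspective mapping: intersect the pixel's ray with the ground plane."""
    c, s = math.cos(pitch), math.sin(pitch)
    # Ray direction through the pixel, expressed in the pitched camera frame.
    dx, dy, dz = (u - cx) / fx, (v - cy) / fy, 1.0
    # Undo the pitch so that "down" is aligned with the ground normal.
    down = c * dy + s * dz
    fwd = -s * dy + c * dz
    if down <= 0:
        raise ValueError("pixel is at or above the horizon")
    t = cam_height / down  # scale at which the ray reaches the ground
    return t * fwd, -t * dx

# Hypothetical camera: 1000 px focal length, principal point (640, 360),
# mounted 1.5 m above the road and pitched 2 degrees downward.
cam = dict(fx=1000.0, fy=1000.0, cx=640.0, cy=360.0,
           cam_height=1.5, pitch=math.radians(2.0))
u, v = ground_to_pixel(20.0, 1.75, **cam)  # a lane point 20 m ahead, 1.75 m to the left
x, y = pixel_to_ground(u, v, **cam)        # recovers (20.0, 1.75) up to rounding
```

The mapping is exact only when the road is planar and the camera pose is known. On slopes or under pitch changes the recovered ground positions drift, which is one commonly cited reason for predicting lanes directly in 3D rather than relying on this transformation alone.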
Keywords/Search Tags:Lane detection, Spatio-temporal fusion, BEV perception, End-to-end, Visual 3D perception