| As an essential tool for intelligent traffic management, traffic scene analysis technology can provide precise data support and a reliable decision-making basis for the traffic management departments of public security authorities. However, current research on traffic scene parsing focuses primarily on target detection and lacks a comprehensive analysis of the dynamic information in a scene and of the relationships between scene targets, which limits its usefulness to traffic management departments when they deal with real traffic problems. In addition, road traffic scenes are complex and variable, so a single sensor can rarely provide a comprehensive and accurate perception of them. This paper therefore adopts a LiDAR and camera fusion strategy to study the following traffic scene parsing methods. First, given the large volume of traffic scene data, analysis requires traffic target detection that is fast, accurate, and light on computing resources. To this end, the paper proposes two vehicle detection algorithms: GS-YOLO, a lightweight vehicle detection algorithm based on RGB images, and GS-YOLO3D, a vehicle detection algorithm based on the fusion of camera and LiDAR data. GS-YOLO is built on YOLOX-s and improves it by using depthwise separable convolution, designing a Ghost CSP structure, and adding an attention mechanism. The model size is reduced by 46% while detection accuracy is maintained. GS-YOLO3D combines GS-YOLO with point cloud data and uses late fusion to merge the multimodal information. Experimental results show that the average precision of GS-YOLO3D is higher than that of GS-YOLO, and that fusing camera and LiDAR data allows vehicles in traffic scenes to be detected more accurately. Secondly, the paper proposes GS-YOLO3DMOT, a vehicle tracking algorithm based on LiDAR and camera fusion, which obtains the dynamic trajectories of vehicles and uses multimodal data to fuse 2D tracking with 3D tracking. The algorithm follows the Detection-Based Tracking strategy and updates the predicted trajectory state from the detection results of GS-YOLO3D. To establish a deep association between trajectories and detections, three data association steps are introduced. Furthermore, a novel track management strategy that takes the characteristics of multimodal data fusion into account is proposed to reduce false negatives and false positives. Experimental results show that GS-YOLO3DMOT performs well in multi-target tracking, with a higher HOTA score and a lower ID switch rate than other similar target tracking algorithms. Thirdly, the paper proposes an unbiased scene graph generation method based on causal inference to capture the relationships among objects in a scene, and combines it with the vehicle detection results from camera and LiDAR data fusion to obtain more accurate vehicle positions. The algorithm applies a counterfactual intervention to the causal graph, thereby removing the effects of dataset bias. To help the model better understand traffic scenes, data augmentation is performed on the traffic scene data in the VG dataset, and the detection results of GS-YOLO3D are used to update the vehicle positions under certain conditions. Experimental results show that the algorithm effectively addresses the long-tail effect of the VG dataset and performs well in traffic scene parsing. Finally, to validate the feasibility and effectiveness of the proposed methods, traffic monitoring scene parsing experiments were conducted on real intersection data. The results show that the proposed methods parse traffic scenes efficiently and have practical potential, providing a valuable reference for public security and traffic management departments tackling related issues. |