With continuing advances in computing power, sensor technology, and artificial intelligence, research on and application of autonomous vehicles have developed rapidly. To make appropriate decisions and ensure driving safety, an autonomous vehicle must accurately predict whether a pedestrian intends to cross the street, as well as the time and path of the crossing. This paper therefore studies a deep-learning-based method for recognizing pedestrian crossing intention on unsignalized road sections, combining pedestrian detection, tracking, and crossing intention determination. The main work is as follows.

First, to address target occlusion in multi-person scenes, we analyze the characteristics of occlusion among multiple targets in crowds and develop a target detection algorithm based on these characteristics. Specifically, we introduce two core components, a head feature module and a new classification criterion, into the Faster R-CNN algorithm, which significantly improves its detection performance under multi-person occlusion. This work has theoretical and practical implications for research on target detection in multi-person occlusion scenarios.

Second, building on the proposed occlusion-aware pedestrian detection algorithm, we design a pedestrian tracking algorithm based on multi-feature association to handle pedestrians who are easily occluded and highly similar to one another. Target appearance and motion features are fused, and the Hungarian algorithm is used to match targets and thereby track pedestrians. Test results on the MOT16 dataset show that the proposed algorithm effectively improves pedestrian tracking performance and has theoretical and practical application value.

Third, the global context of the
interaction between the target pedestrian and the scene is underutilized, and the optimal strategy for fusing data from different sensors has not been adequately addressed, which limits existing models for pedestrian crossing prediction. We therefore introduce a new neural network structure that fuses spatio-temporal features from detection and tracking with other multi-source data to predict pedestrian crossing intention. The structure employs an attention mechanism and a stack of recurrent neural networks to fuse heterogeneous inputs such as RGB image sequences, semantic segmentation masks, and ego-vehicle speed in an optimal way. Detailed ablation and comparative experiments yield the optimal architecture, which achieves state-of-the-art performance on the JAAD pedestrian action prediction benchmark, and extensive experiments demonstrate the effectiveness of the proposed approach.

By identifying and predicting the crossing intention of target pedestrians, the results of this paper can be applied to autonomous vehicles and lay a foundation for their driving behavior and safety decisions.
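
As a rough illustration of the multi-feature association step summarized above, the following minimal sketch fuses an appearance cost and a motion cost into one matrix and solves the resulting assignment with the Hungarian algorithm. Everything specific here is an assumption made for illustration and is not taken from the paper: the cosine distance as the appearance cost, one minus the IoU of bounding boxes as the motion cost, the weight `w_app`, the gating threshold `max_cost`, and the dictionary keys `"box"` and `"feat"`.

```python
import math

def cosine_distance(a, b):
    """Appearance cost (assumed): 1 - cosine similarity of two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm if norm > 0 else 1.0

def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0

def fused_cost(tracks, detections, w_app=0.5):
    """Weighted sum of appearance cost and motion cost (1 - IoU) per pair."""
    return [[w_app * cosine_distance(t["feat"], d["feat"])
             + (1.0 - w_app) * (1.0 - iou(t["box"], d["box"]))
             for d in detections] for t in tracks]

def hungarian(cost):
    """Minimum-cost assignment (Hungarian algorithm, O(n^2 m)).

    Requires len(cost) <= len(cost[0]); returns a list of (row, col) pairs.
    """
    n, m = len(cost), len(cost[0])
    inf = float("inf")
    u, v = [0.0] * (n + 1), [0.0] * (m + 1)   # row and column potentials
    p, way = [0] * (m + 1), [0] * (m + 1)     # p[j]: row matched to column j
    for i in range(1, n + 1):
        p[0], j0 = i, 0
        minv, used = [inf] * (m + 1), [False] * (m + 1)
        while True:
            used[j0] = True
            i0, delta, j1 = p[j0], inf, 0
            for j in range(1, m + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j], way[j] = cur, j0
                    if minv[j] < delta:
                        delta, j1 = minv[j], j
            for j in range(m + 1):
                if used[j]:
                    u[p[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if p[j0] == 0:
                break
        while j0 != 0:                        # augment along the found path
            j1 = way[j0]
            p[j0] = p[j1]
            j0 = j1
    return [(p[j] - 1, j - 1) for j in range(1, m + 1) if p[j] != 0]

def match(tracks, detections, w_app=0.5, max_cost=0.7):
    """Return sorted (track_index, detection_index) pairs below max_cost."""
    if not tracks or not detections:
        return []
    cost = fused_cost(tracks, detections, w_app)
    if len(tracks) <= len(detections):
        pairs = hungarian(cost)
    else:                                     # transpose so rows <= columns
        transposed = [list(col) for col in zip(*cost)]
        pairs = [(r, c) for c, r in hungarian(transposed)]
    return sorted((r, c) for r, c in pairs if cost[r][c] <= max_cost)
```

In this sketch, tracks or detections left unmatched after gating would be handled by whatever track creation and deletion policy the tracker uses, which is outside the scope of the example.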