| With the development of computer science and artificial intelligence, intelligent transportation technology has made great progress. The environment perception system is an important part of an intelligent transportation system: by integrating multiple sensors (such as cameras, lidar, radar and V2X), it detects and recognizes various targets and interprets traffic scene information. At present, mixed traffic and congestion cause vehicles and pedestrians to occlude one another, which increases the complexity of road scenes and poses great challenges to the environment perception system. As road users without any protection, pedestrians are especially vulnerable to injury in complex road scenes. It is therefore crucial to improve the ability of intelligent transportation systems to perceive pedestrians. In complex road environments, two factors limit this ability. The first is the diversity of pedestrian occlusion: pedestrians may be occluded by other pedestrians, vehicles or other obstacles. The second is the diversity of pedestrian behavior: pedestrians appear small and are hard to identify, and because the human body is flexible, their behavior is highly varied. How to accurately detect pedestrians, recognize their behavior and infer their intention is an urgent problem in intelligent transportation, especially in occluded scenes where pedestrians cross the street, where pedestrian activity is most frequent. At the same time, because an intelligent transportation system is built on network communication, it places high demands on the real-time performance of its algorithms. Developing algorithms that are both "accurate" and "fast" is a research hotspot across the fields of intelligent transportation. To address these problems, this paper studies pedestrian target detection and behavior recognition
technology based on human pose skeleton sequences, and conducts in-depth research on single-view 2D human pose estimation, multi-view 3D human pose estimation, and behavior recognition based on human pose skeleton sequences. The specific research contents are as follows: 1. Research on a lightweight 2D human pose estimation algorithm. To maintain accuracy while reducing the amount of computation and the number of parameters, this paper selects the high-resolution output network, which maintains a high-resolution output representation and performs multi-scale fusion, as the baseline for 2D human pose estimation, and proposes a lightweight backbone network, IDPNet. Its basic building block, the IDP module, places an identical residual block and a dense layer in parallel, which effectively reduces the number of parameters and the amount of computation. An intralevel block fusion representation head is introduced to fuse the output features of each stage in the high-resolution branch. Without using a pretrained model, detection accuracy is maintained while the number of parameters and the amount of computation are greatly reduced, which meets the accuracy and real-time requirements of intelligent driving scenes. To broaden the model's range of application, several versions of the model with different accuracy and speed trade-offs are also proposed. 2. Research on a multi-view 3D human pose estimation algorithm. To address the occlusion of pedestrian joints in single-view pose estimation, this paper adopts a multi-view 3D human pose estimation strategy and proposes a multi-view 3D human pose estimation model, the CTP network. The CTP model operates directly in 3D voxel space. The detection head projects the 2D joint features from all views into 3D space for voxelization, and then regresses the center point and 3D bounding box of
the pedestrian. The pose regression module treats the 3D bounding box as a new voxel space and regresses the center points of the joints to obtain a complete 3D pose. This method does not need to match the same pedestrian or associate different joints, and it does not use non-maximum suppression. The model is simple and efficient, effectively handles pedestrian pose estimation in occluded scenes, and is also suitable for multi-person scenes. It achieves competitive detection accuracy and outperforms previous methods in some respects. 3. Research on a skeleton-based multi-stream adaptive-attention subgraph convolutional network for behavior recognition. Building on graph convolutional networks (GCN), this paper studies skeleton-based behavior recognition. To better capture the correlations between different body parts of pedestrians in different behaviors, a subgraph method based on the depth-first tree traversal sequence is proposed to learn the correlations among body parts; strengthening attention to these correlations improves the performance of the algorithm. Based on adaptive graph convolution, the weights of each part of the graph model are learned adaptively: they are parameterized and optimized jointly with the other parameters during training. An ECA channel attention module is embedded into each graph convolutional layer to avoid the negative impact of feature dimensionality reduction on model performance. In addition, a multi-stream framework integrates the physical structure information of the human body with the motion information of each part to further improve accuracy. The proposed multi-stream adaptive-attention subgraph convolution method achieves competitive results on large public datasets and outperforms previous methods in some respects, improving the accuracy of behavior recognition. 4. Construction of
pedestrian behavior recognition datasets for intelligent traffic scenarios. Vehicle-mounted cameras are used to collect pedestrian behaviors in multiple road scenes, and pedestrian behavior samples are also extracted from several public pedestrian behavior datasets of traffic scenes. All RGB video samples are then merged, classified and labeled to construct a single-view pedestrian behavior recognition dataset based on vehicle-mounted cameras. At one intersection scene, multi-view cameras are used to collect pedestrian behaviors, and a multi-view pedestrian behavior recognition dataset is constructed to address pedestrian occlusion. The 3D human skeleton sequences corresponding to different behaviors are classified and labeled for training the behavior recognition algorithm, which is then used to identify and predict pedestrian behavior types in traffic scenes. |
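
The abstract mentions an ECA channel attention module embedded in each graph convolutional layer. As a rough illustration only, the sketch below follows the commonly described ECA idea: pool each channel to one descriptor, run a small 1D convolution across neighbouring channels with no dimensionality reduction, apply a sigmoid, and rescale the channels. The function names `eca_weights` and `eca`, the flat per-channel feature layout, and the fixed kernel values are assumptions made for this example; in a trained network the kernel would be learned, and the paper's actual implementation may differ.

```python
import math


def eca_weights(channel_means, kernel):
    """Return one attention weight per channel.

    channel_means: list of C floats (globally average-pooled features).
    kernel: odd-length list of 1D convolution weights shared by all channels
            (hypothetical fixed values here; learned in a real model).
    """
    k = len(kernel)
    if k % 2 != 1:
        raise ValueError("kernel length must be odd")
    pad = k // 2
    # Zero-pad so that every channel sees k neighbours.
    padded = [0.0] * pad + list(channel_means) + [0.0] * pad
    weights = []
    for c in range(len(channel_means)):
        # Local cross-channel interaction, no channel reduction.
        s = sum(kernel[j] * padded[c + j] for j in range(k))
        weights.append(1.0 / (1.0 + math.exp(-s)))  # sigmoid gate
    return weights


def eca(features, kernel):
    """Rescale each channel of `features` by its attention weight.

    features: list of C channels, each a flat list of values
              (for skeleton data, e.g. frames x joints flattened).
    """
    means = [sum(ch) / len(ch) for ch in features]
    w = eca_weights(means, kernel)
    return [[w_c * v for v in ch] for w_c, ch in zip(w, features)]


if __name__ == "__main__":
    feats = [[1.0, 3.0], [0.0, 0.0], [2.0, 2.0]]
    print(eca(feats, [0.25, 0.5, 0.25]))
```

The point of this design is that the gate for each channel depends only on a few neighbouring channel descriptors, so no bottleneck layer compresses the channel dimension before the attention weights are computed.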