Intelligent driving vehicles perceive their surroundings and potential hazards through a variety of sensors in order to make driving decisions. Compared with other signals, video contains rich semantic information, the acquisition equipment is inexpensive, and the acquisition process is convenient; however, video also suffers from unstable backgrounds, vehicle jolting, and object occlusion, which pose particular challenges for understanding traffic scenes. This paper addresses the problem of risk estimation in traffic scenes, puts forward a principle for classifying traffic risk, and designs a method for estimating the risk of the traffic scene captured in a video.

The paper first uses the YOLO and DeepSort algorithms to build a detection-and-tracking model that detects and tracks traffic targets, yielding their locations, categories, and trajectories, and analyzes how this information can support scene understanding and risk estimation in intelligent driving systems. On the basis of target detection and tracking, the concept of scene complexity is then proposed. Scene complexity comprises static complexity and dynamic complexity: static complexity is computed from the category, location, and size of traffic objects, while dynamic complexity is computed from the degree of order in the trajectories of traffic targets. Based on scene complexity and drivers' practical driving experience, a principle for classifying the risk of traffic scenes is proposed, dividing them into four levels: low risk, medium risk, high risk, and hazardous. The paper also collects videos captured by on-board cameras from a variety of sources, annotates them with risk levels, and produces a traffic risk estimation dataset.

Finally, the paper proposes a traffic risk estimation method based on a two-stream convolutional neural network. The method consists of two parts: video preprocessing and a traffic risk estimation network. Video preprocessing organizes the raw video into the inputs of the two network streams through sparse temporal sampling, optical flow computation, and related steps. The risk estimation network comprises a spatial stream and a temporal stream, which extract high-level appearance features and high-level motion features from the video, respectively; the output vectors of the two streams are fused and passed to a classifier to obtain the risk level of the traffic scene. The method also incorporates an objective-guided spatial attention module, a channel attention module, and a weighted temporal-shift module to help the network understand the traffic scene. The proposed method is validated on real datasets and performs well in both accuracy and speed, demonstrating that it is effective and feasible.
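
As a rough illustration of the detection-and-tracking stage described above, the following sketch pairs a YOLO detector with a DeepSort tracker to obtain per-frame boxes and track IDs. The specific packages (`ultralytics`, `deep-sort-realtime`), the `yolov8n.pt` weights, and the video path are assumptions for illustration; the paper does not commit to a particular implementation.

```python
# Minimal detection-and-tracking sketch (assumed libraries: opencv-python, ultralytics, deep-sort-realtime).
import cv2
from ultralytics import YOLO
from deep_sort_realtime.deepsort_tracker import DeepSort

detector = YOLO("yolov8n.pt")          # assumed pretrained detector weights
tracker = DeepSort(max_age=30)         # keep lost tracks alive for up to 30 frames

cap = cv2.VideoCapture("dashcam.mp4")  # hypothetical on-board camera video
while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break

    # Detect traffic targets in the current frame.
    result = detector(frame, verbose=False)[0]
    detections = []
    for box, conf, cls in zip(result.boxes.xyxy.tolist(),
                              result.boxes.conf.tolist(),
                              result.boxes.cls.tolist()):
        x1, y1, x2, y2 = box
        # DeepSort expects ([left, top, width, height], confidence, class).
        detections.append(([x1, y1, x2 - x1, y2 - y1], conf, int(cls)))

    # Associate detections across frames to build trajectories.
    for track in tracker.update_tracks(detections, frame=frame):
        if track.is_confirmed():
            print(track.track_id, track.to_ltrb())

cap.release()
```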
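The abstract does not give the complexity formulas, so the next sketch is only a plausible reading under stated assumptions: static complexity as a class-weighted sum over detected objects scaled by their apparent size, and dynamic complexity as the dispersion of heading changes along each trajectory, a simple proxy for how disordered the motion is. The class weights and both formulas are hypothetical.

```python
import numpy as np

# Hypothetical per-class weights: more safety-critical targets contribute more.
CLASS_WEIGHTS = {"car": 1.0, "truck": 1.5, "bus": 1.5, "pedestrian": 2.0, "cyclist": 2.0}

def static_complexity(objects, frame_area):
    """objects: list of dicts with 'cls' and 'box' = (x1, y1, x2, y2)."""
    score = 0.0
    for obj in objects:
        x1, y1, x2, y2 = obj["box"]
        rel_size = (x2 - x1) * (y2 - y1) / frame_area   # larger box ~ closer target
        score += CLASS_WEIGHTS.get(obj["cls"], 1.0) * rel_size
    return score

def dynamic_complexity(trajectories):
    """trajectories: list of (N, 2) arrays of box centers over time."""
    disorder = []
    for traj in trajectories:
        if len(traj) < 3:
            continue
        steps = np.diff(np.asarray(traj, dtype=float), axis=0)
        headings = np.arctan2(steps[:, 1], steps[:, 0])
        # Spread of heading changes: near zero for smooth, orderly motion.
        disorder.append(np.std(np.diff(headings)))
    return float(np.mean(disorder)) if disorder else 0.0
```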
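For the preprocessing stage, the following sketch shows the two steps the abstract names: sparse temporal sampling (split the clip into a few segments and take one frame from each) and stacked dense optical flow for the temporal stream. The use of OpenCV's Farneback method and the segment count are assumptions, since the abstract does not name a flow algorithm.

```python
import cv2
import numpy as np

def sparse_sample(frames, num_segments=8):
    """Sparse temporal sampling: pick roughly the middle frame of each segment."""
    idx = np.linspace(0, len(frames) - 1, num_segments * 2 + 1)[1::2].astype(int)
    return [frames[i] for i in idx]

def flow_stack(frames):
    """Stack dense optical flow (dx, dy) between consecutive sampled frames."""
    grays = [cv2.cvtColor(f, cv2.COLOR_BGR2GRAY) for f in frames]
    flows = []
    for prev, nxt in zip(grays[:-1], grays[1:]):
        flow = cv2.calcOpticalFlowFarneback(prev, nxt, None,
                                            0.5, 3, 15, 3, 5, 1.2, 0)
        flows.append(flow)                       # (H, W, 2)
    return np.concatenate(flows, axis=-1)        # (H, W, 2 * (T - 1))
```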
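A minimal two-stream layout consistent with the description might look like the sketch below: the spatial stream takes a sampled RGB frame, the temporal stream takes the stacked flow, and the two feature vectors are concatenated and classified into the four risk levels. The backbone choice (ResNet-18), feature sizes, and flow-channel count are assumptions, and the paper's attention and temporal-shift modules are omitted here.

```python
import torch
import torch.nn as nn
from torchvision import models

class TwoStreamRiskNet(nn.Module):
    """Spatial stream (RGB appearance) + temporal stream (stacked optical flow) -> 4 risk levels."""

    def __init__(self, flow_channels=14, num_classes=4):
        super().__init__()
        self.spatial = models.resnet18(weights=None)
        self.spatial.fc = nn.Identity()                      # 512-d appearance features

        self.temporal = models.resnet18(weights=None)
        # First conv must accept 2 * (T - 1) flow channels instead of 3 RGB channels.
        self.temporal.conv1 = nn.Conv2d(flow_channels, 64, kernel_size=7,
                                        stride=2, padding=3, bias=False)
        self.temporal.fc = nn.Identity()                     # 512-d motion features

        self.classifier = nn.Linear(512 + 512, num_classes)  # fused features -> risk level

    def forward(self, rgb, flow):
        fused = torch.cat([self.spatial(rgb), self.temporal(flow)], dim=1)
        return self.classifier(fused)

# Example shapes: one RGB frame and a 7-step flow stack per clip.
logits = TwoStreamRiskNet()(torch.randn(2, 3, 224, 224), torch.randn(2, 14, 224, 224))
```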
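The abstract names an objective-guided spatial attention module, a channel attention module, and a weighted temporal-shift module without describing their structure. As a stand-in for the channel attention idea only, the following shows a standard squeeze-and-excitation style block, which re-weights feature channels by global context so the network can emphasize risk-relevant ones; it is not the paper's module.

```python
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    """Squeeze-and-excitation style channel attention: re-weight channels by global context."""

    def __init__(self, channels, reduction=16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x):                      # x: (B, C, H, W)
        weights = self.fc(x.mean(dim=(2, 3)))  # squeeze spatial dims -> (B, C)
        return x * weights[:, :, None, None]   # excite: per-channel re-weighting
```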