| With deepening research in artificial intelligence, autonomous driving technology in the automotive industry has developed rapidly. Environmental perception is a prerequisite for fully autonomous driving. At present, the environmental perception of autonomous vehicles is realized mainly by machine vision technologies such as lidar, millimeter-wave radar, and visual sensors, and there are few applications of auditory perception. However, traffic sound carries important physical information about the driving environment: many traffic events are not only visible but also audible. Traffic sound event detection is the core technology for building a vehicle auditory perception system. This paper reviews the development of traffic sound event detection, introduces the main problems and development directions of the field, analyzes the advantages and disadvantages of traditional techniques, and proposes its own model. The main research contents are as follows: (1) A Traffic Sound Dataset (TSD) is constructed. The traffic sound data currently available online, at home and abroad, cannot meet the needs of the experiments. Therefore, this paper designs and carries out the collection of traffic sounds and traffic background sound. Traffic sounds collected in the laboratory at 44.1 kHz are combined with traffic background sound from a traffic intersection in Chongqing to simulate traffic sound in a real driving environment. The resulting dataset contains 40,924 audio clips covering classes such as siren and motorcycle sounds. (2) The Automobile Vehicle Sound Event Detection Baseline System (AVSEDBs) is established. The construction of convolutional neural networks and the extraction of the Spectrogram Image Feature (SIF) for traffic sound event detection are studied in depth. Based on the VGGnet model, a baseline system for sound event
detection in autonomous vehicle traffic environments is built, and its detection performance is preliminarily verified on the TSD dataset. The test shows that the detection accuracy of the baseline system is 92.54%, which leaves room for improvement. In addition, the test shows that the baseline system has a low recognition rate for the motorcycle, screaming, and truck classes, and that its neural network has a large number of parameters and a long training time, so the baseline system needs further optimization. (3) The neural network parameters, structure, and attention mechanism of the autonomous vehicle sound event detection system are optimized. To address the detection efficiency and speed of the baseline system, this paper develops an Inter-Block Channel Algorithm (IBCA) inspired by the residual block. The IBCA and the Convolutional Block Attention Module (CBAM) attention mechanism are integrated into the baseline system to address its low sensitivity to some traffic sound features, such as those of trucks and buses. Neural network pruning is used to further optimize hyperparameters such as the number of neurons. The optimized detection system (AVSEDs) offers high accuracy and strong adaptability. (4) The traffic sound event detection system is deployed on a Raspberry Pi 4B embedded device to complete the hardware implementation of the vehicle traffic sound event detection system. To verify how the proposed detection model performs in practice on a mobile terminal, the trained model is ported to and deployed on the Raspberry Pi development board to achieve real-time detection of traffic sound. The specific steps are as follows: the model is converted to the TensorFlow Lite format to make it lightweight, and the converted tflite model is made compatible with the Raspberry Pi; the model is then transferred to the Raspberry Pi; finally, the
model deployment is completed. The results show that, with the inter-block fusion channel optimization algorithm, hyperparameter optimization, and attention mechanism optimization, the average detection accuracy of the vehicle traffic sound event detection system on the TSD dataset increases from 92.54% to 97.18%, which is 4.64 percentage points higher than the unoptimized model and 6.38 percentage points higher than the traditional SVM machine learning method. The model deployed on the Raspberry Pi hardware reaches a detection accuracy of 90.23% with an inference time of 100-150 ms. This is 6.95 percentage points lower than the accuracy on the TSD dataset. The main reason for the gap is that the dataset cannot fully simulate traffic sound in a real driving environment, which reduces the accuracy of the model. The gap is still within the acceptable range, which verifies the feasibility of the autonomous vehicle sound event detection system (AVSEDs) constructed in this paper. In addition, the system is compact, practical, and stable in operation, and it provides a new solution for vehicle auditory perception. |
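
The accuracy figures reported above can be cross-checked with simple arithmetic. The following is a minimal sketch, not part of the original work: the constant and variable names are illustrative, and the SVM accuracy is not stated in the abstract but only implied by the reported 6.38-point gain, so it is derived here as an assumption.

```python
from decimal import Decimal

# Accuracies reported in the abstract, in percent.
BASELINE = Decimal("92.54")   # AVSEDBs (VGGnet-based baseline) on TSD
OPTIMIZED = Decimal("97.18")  # AVSEDs (IBCA + CBAM + pruning) on TSD
DEPLOYED = Decimal("90.23")   # AVSEDs running on the Raspberry Pi 4B
GAIN_OVER_SVM = Decimal("6.38")  # reported gain over the SVM method


def point_gap(higher: Decimal, lower: Decimal) -> Decimal:
    """Return the difference between two accuracies in percentage points."""
    return higher - lower


# Gain of the optimized system over the baseline system.
gain_over_baseline = point_gap(OPTIMIZED, BASELINE)

# Drop from TSD accuracy to accuracy on real-environment sound.
deployment_drop = point_gap(OPTIMIZED, DEPLOYED)

# Assumption: the SVM accuracy is not given directly; it is implied
# by the optimized accuracy minus the reported gain.
implied_svm_accuracy = OPTIMIZED - GAIN_OVER_SVM

print(f"Gain over baseline: {gain_over_baseline} points")
print(f"Drop after deployment: {deployment_drop} points")
print(f"Implied SVM accuracy: {implied_svm_accuracy}%")
```

`Decimal` is used instead of floats so that the two-decimal figures subtract exactly; the differences are expressed in percentage points rather than as relative percentages.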