| As the most typical mode of slow-moving traffic, walking is environmentally friendly, pollution-free, and improves quality of life. However, pedestrians are vulnerable road users in the traffic system, and accidents involving them can easily lead to serious injury or death. Based on more than 80,000 pedestrian accidents, this thesis uses data mining methods to identify the factors that influence pedestrian fatalities. Two problems must be overcome when modeling fatal accidents. First, pedestrian fatalities account for only a small proportion of all pedestrian accidents, so the data are imbalanced; modeling on such data is biased and tends to overlook valuable information, which makes it necessary to balance the data. Second, traditional regression models are limited by their assumptions of a predefined probability distribution and a linear additive relationship, which makes it difficult to reveal the human-vehicle-road-environment multi-factor interaction mechanisms that affect pedestrian fatalities. Emerging data mining techniques, by contrast, do not need to assume a model form and are therefore better suited to mining accident mechanisms. Following the idea of "balancing first, mining later", the main research work of this thesis is as follows: 1. To address the imbalance of the accident data, a data balancing method based on Kmeans-SMOTE is proposed. The Kmeans algorithm first clusters the data, which addresses the uneven sampling of minority-class samples in the synthetic minority oversampling technique (SMOTE). Experimental results show that models built on the balanced data predict fatal accidents more accurately than models built on the original data. 2. A combined method for factor mining and importance ranking of fatal pedestrian accidents, based on decision trees and improved association rules, is proposed. The
importance ranking of the influencing factors is obtained with decision trees; the interaction rules among human, vehicle, road, and environment factors are identified with association rules, and the chi-square test is introduced into association rule mining to solve the problem of redundant rules. 3. A case study was conducted on more than 80,000 pedestrian accidents. The results show that: (1) compared with the rules obtained by the decision tree, which relies on local search, the rules generated by association rule mining, which performs a global search, are more comprehensive; (2) the proposed Kmeans-SMOTE data balancing method effectively improves modeling performance on pedestrian accident data; (3) the proposed combined mining scheme of association rules and decision trees effectively improves the mining results and reduces the generation of redundant rules; (4) the factors with the greatest impact on pedestrian fatalities are pedestrian age, vehicle type, driver factors, and accident time; (5) the important association rules with a significant impact on pedestrian fatalities and injuries are: elderly pedestrians colliding with large vehicles while crossing the road, elderly pedestrian accidents after 10 p.m., collisions between elderly pedestrians and trucks at intersections, and elderly pedestrians struck by trucks on multi-lane roads. |
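
The abstract does not say exactly how the chi-square test is applied to the association rules. One plausible reading is sketched below: each candidate rule "antecedent → fatal" is scored on a 2x2 contingency table, and rules whose antecedent is statistically independent of the outcome are discarded. The function names, the 0.05 significance level, and the toy records are all assumptions made for illustration; none of them come from the thesis or its data.

```python
from typing import Dict, FrozenSet, List

# Chi-square critical value for 1 degree of freedom at the 0.05 level.
# The significance level itself is an assumption, not taken from the thesis.
CHI2_CRITICAL = 3.841


def chi_square_2x2(a: int, b: int, c: int, d: int) -> float:
    """Pearson chi-square statistic for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0  # a row or column is empty, so no association can be measured
    return n * (a * d - b * c) ** 2 / denom


def evaluate_rule(
    records: List[FrozenSet[str]],
    antecedent: FrozenSet[str],
    consequent: str,
) -> Dict[str, float]:
    """Support, confidence, and chi-square for the rule antecedent -> consequent."""
    a = b = c = d = 0
    for record in records:
        matches = antecedent <= record   # record contains every antecedent item
        outcome = consequent in record
        if matches and outcome:
            a += 1
        elif matches:
            b += 1
        elif outcome:
            c += 1
        else:
            d += 1
    n = a + b + c + d
    return {
        "support": a / n if n else 0.0,
        "confidence": a / (a + b) if (a + b) else 0.0,
        "chi2": chi_square_2x2(a, b, c, d),
    }


def is_significant(stats: Dict[str, float], critical: float = CHI2_CRITICAL) -> bool:
    """Keep a rule only if antecedent and consequent are not independent."""
    return stats["chi2"] > critical


# Hypothetical toy records, invented purely to exercise the functions.
toy_records: List[FrozenSet[str]] = (
    [frozenset({"elderly", "truck", "fatal", "night"})] * 4
    + [frozenset({"elderly", "truck", "fatal"})] * 4
    + [frozenset({"elderly", "truck", "night"})] * 1
    + [frozenset({"elderly", "truck"})] * 1
    + [frozenset({"adult", "car", "fatal", "night"})] * 1
    + [frozenset({"adult", "car", "fatal"})] * 1
    + [frozenset({"adult", "car", "night"})] * 4
    + [frozenset({"adult", "car"})] * 4
)

truck_rule = evaluate_rule(toy_records, frozenset({"elderly", "truck"}), "fatal")
night_rule = evaluate_rule(toy_records, frozenset({"night"}), "fatal")
print("elderly+truck -> fatal:", truck_rule, is_significant(truck_rule))
print("night -> fatal:", night_rule, is_significant(night_rule))
```

In the toy data the "night" item is spread evenly across fatal and non-fatal records, so its rule is filtered out even though its support is not small. That is the kind of uninformative rule a support and confidence threshold alone would let through.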