| Landslides, one of the most serious types of geological disaster in China, pose a great threat to people’s lives and property. In recent years in particular, rapid social development has brought frequent engineering activities, which have an increasing impact on the geological environment. In addition, extreme weather and climate events create conditions that favor landslide development. Regional landslide susceptibility prediction can reveal potential landslide zones in space, which helps strengthen the risk control of geological disasters; it has therefore become an important step in reducing and mitigating landslide risk. However, the modeling process of landslide susceptibility prediction involves many uncertainties. This study takes Huichang County, Ganzhou City, Jiangxi Province, as the research area. A total of 19 environmental factors were selected: elevation, slope, aspect, plan curvature, profile curvature, terrain relief, surface cutting depth, surface roughness, clay content, sand content, lithology, topographic wetness index, gully density, average annual rainfall, modified normalized difference water index, normalized difference vegetation index, normalized difference built-up index, total radiation, and road density. To address the problems of choosing the ratio of non-landslide to landslide samples and of labeling non-landslide samples in the sample data set, a semi-supervised imbalanced theory is proposed for landslide susceptibility prediction modeling. To account for errors introduced when collecting and preparing basic data such as the landslide inventory and the environmental factors, the analysis of these errors was changed from qualitative to quantitative. The errors in the basic data were quantified, and landslide susceptibility prediction modeling was then carried out by simulating these errors in combination with the semi-supervised imbalanced theory, in order to explore
the influence pattern of errors in the basic data. Finally, based on the relationship between the landslide susceptibility index and the landslide distribution, a new classification method, the frequency ratio threshold method, is proposed for landslide susceptibility zoning and mapping. The main research contents and results are as follows: (1) A traditional machine learning modeling method was used to predict landslide susceptibility, obtain the initial landslide susceptibility, and produce a zoning map. Non-landslide samples were selected from the extremely low and low risk areas at landslide to non-landslide ratios of 1:1, 1:5, 1:10, 1:15, 1:20, 1:25, and 1:30. The non-landslide samples were labeled with their initial susceptibility values and used for model training and testing. The results show that increasing the number of non-landslide samples improves the model's recognition performance, but the gains do not continue indefinitely. In this study, model performance tends to stabilize at a landslide to non-landslide ratio of 1:25. At this ratio, the prediction rate accuracy reached 0.886, and the mean and standard deviation of the landslide susceptibility index were 0.373 and 0.245, respectively. (2) After the ratio of landslide to non-landslide samples in the prediction model had been determined, errors in the basic data were considered: the landslide inventory was set to be missing 10%, 20%, or 30% of its landslides, or was expanded by 10%, 20%, or 30%, and the landslide surfaces were offset by 30 m. The environmental factors were low-pass filtered 1, 3, and 5 times, giving a total of 32 error cases. Frequency ratio analysis was then performed: the frequency ratio of each environmental factor under each error condition was compared with that under the original condition, and the error in the basic data was analyzed quantitatively. The error in the basic data is found to be largest
when the landslide inventory is expanded by 30% and the environmental factors are low-pass filtered 5 times. (3) After the errors in the basic data had been quantified, they were simulated and added to the original data to obtain basic data with errors, which were then used for landslide susceptibility prediction modeling. The results show that, when the basic data with errors are used for modeling, the mean landslide susceptibility index over the whole region fluctuates around 0.491, significantly higher than the mean of 0.373 obtained without added error, and the standard deviation fluctuates around 0.151, significantly lower than the standard deviation of 0.245 obtained without added error. The errors cause the distribution of the landslide susceptibility index to become more concentrated, which indicates that they interfere with the model's recognition ability and lead to underfitting of the prediction results. (4) For landslide susceptibility classification and mapping, the frequency ratio threshold method is proposed to classify the landslide susceptibility index and is compared with the traditional natural breaks, quantile, and geometric interval methods. The results show that, in terms of landslide ratio, the frequency ratio threshold method has the best classification performance, followed by the natural breaks method, while the quantile and geometric interval methods perform poorly. |
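
The frequency ratio analysis mentioned in item (2) is not spelled out in the abstract. A minimal sketch follows, assuming the commonly used definition: for each class of an environmental factor, the share of landslide cells falling in that class divided by the share of all cells falling in that class. The function name `frequency_ratio` and the toy inputs are hypothetical and are not taken from the study.

```python
from collections import Counter


def frequency_ratio(class_ids, is_landslide):
    """Frequency ratio per factor class (assumed standard definition).

    class_ids    : class label of one environmental factor for each grid cell
    is_landslide : truthy where the cell belongs to a mapped landslide
    """
    if len(class_ids) != len(is_landslide):
        raise ValueError("inputs must have the same length")

    total_cells = len(class_ids)
    total_landslides = sum(1 for flag in is_landslide if flag)
    if total_landslides == 0:
        raise ValueError("at least one landslide cell is required")

    cells_per_class = Counter(class_ids)
    landslides_per_class = Counter(
        c for c, flag in zip(class_ids, is_landslide) if flag
    )

    ratios = {}
    for c in sorted(cells_per_class):
        landslide_share = landslides_per_class[c] / total_landslides
        area_share = cells_per_class[c] / total_cells
        ratios[c] = landslide_share / area_share
    return ratios


# Toy example (hypothetical data): ten cells in three slope classes.
toy_classes = [0, 0, 0, 0, 1, 1, 1, 1, 2, 2]
toy_landslides = [0, 0, 0, 1, 0, 1, 1, 0, 1, 1]
toy_fr = frequency_ratio(toy_classes, toy_landslides)
```

Under this definition, a value above 1 means landslides are over-represented in that class relative to its area, and a value below 1 means they are under-represented. Comparing the per-class values computed from an error-perturbed inventory or filtered factor against those from the original data is one plausible way to quantify the effect of basic-data error, though the abstract does not state the exact comparison metric used.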