
Research On Geochemical Abnormity Identification Of Metric Learning And Random Forest

Posted on: 2021-03-05
Degree: Master
Type: Thesis
Country: China
Candidate: Y J Cheng
GTID: 2370330647464221
Subject: Mathematics
Abstract/Summary:
As the amount of available information grows, the accuracy of classification algorithms continues to improve. Machine learning is an interdisciplinary subject that draws on multiple fields. As a cutting-edge method of data analysis, it is widely used for classification, information extraction and related tasks, where a model acquires knowledge from data and learns to classify and judge new samples. All of these algorithms need to measure the similarity between samples. How to measure that similarity accurately, and so achieve better classification, is a direction worth exploring.

Mineral resources are non-renewable, and exploring for them is becoming more difficult year by year. Geochemical anomaly recognition is an important task in prospecting. It can be treated as a binary classification problem whose purpose is to distinguish anomalies from the background. Traditional methods of geochemical anomaly recognition are increasingly limited, and more comprehensive methods and techniques have become a new research direction.

Metric Learning (ML) makes it possible to learn distances for data with complex distributions from labeled data. In this study, a classifier based on Random Forest (RF) is used to support the distance function, and both the absolute position and the relative position are incorporated into the representation. Metric learning and random forest are combined, with the random forest serving as the underlying representation of metric learning, to establish a more appropriate distance for evaluating the similarity between samples and so make classification more accurate. In this thesis, the statistical information among geochemical features in a limited set of training samples is used to establish this distance. On this basis, random forest is used to separate geochemical anomalies from the complex background. The specific work covers the following aspects:

(1) A hybrid model combining random forest and metric learning (ML-RF) is introduced to explore the internal relationships between samples and to separate similar samples from dissimilar ones, addressing the problem that a single Mahalanobis distance cannot handle heterogeneous data well. A new mapping function replaces the traditional Mahalanobis matrix, which avoids the restriction that the Mahalanobis distance imposes on pairs of positions. The ML-RF classification model takes metric learning as the underlying framework and random forest as the underlying representation. Using random forest in this role avoids the defect of sensitivity to rare prior samples.

(2) Data sets are collected, and ITML (Information-Theoretic Metric Learning), LMNN (Large Margin Nearest Neighbor), the Mahalanobis distance, and the ML-RF model proposed in this thesis are applied to UCI data in classification simulation experiments. The classification accuracies of these methods are compared.

(3) Sampling and data collection were carried out in the research area. All geochemical composition data for 39 elements were processed. Factor analysis was used to process the data and to evaluate the element combinations related to the mineralization process.

(4) ML-RF was applied to the exploration geochemical data from the research area to identify geochemical anomalies. After training and testing, its classification performance was compared with that of RF. Comparison of the anomaly maps by ROC curve showed that the classification was significantly improved, and geochemical anomaly maps and metallogenic prospective areas were drawn.
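
The pair representation described in the abstract, which combines the relative and absolute position of two samples, can be sketched roughly as follows. The abstract does not give the mapping itself, so the form used here (element-wise absolute difference concatenated with the element-wise mean) is an assumption, as are the function names and the toy data. The sketch only builds labeled pairs. A binary classifier such as a random forest could then be trained on them, with its predicted probability of "dissimilar" read as a distance.

```python
from itertools import combinations


def pair_features(x_i, x_j):
    """Map a pair of samples to one feature vector.

    Assumed form (not stated in the abstract): the relative position
    |x_i - x_j| concatenated with the absolute position (x_i + x_j) / 2.
    Both parts are symmetric, so the order of the pair does not matter.
    """
    relative = [abs(a - b) for a, b in zip(x_i, x_j)]
    absolute = [(a + b) / 2 for a, b in zip(x_i, x_j)]
    return relative + absolute


def build_pair_dataset(samples, labels):
    """Turn labeled samples into labeled pairs.

    Pair label 0 = similar (same class), 1 = dissimilar (different class).
    """
    features, targets = [], []
    for i, j in combinations(range(len(samples)), 2):
        features.append(pair_features(samples[i], samples[j]))
        targets.append(0 if labels[i] == labels[j] else 1)
    return features, targets


# Hypothetical toy data: two element concentrations per sample,
# label 1 = anomaly, 0 = background.
samples = [[1.0, 2.0], [1.5, 2.5], [8.0, 9.0]]
labels = [0, 0, 1]
pair_X, pair_y = build_pair_dataset(samples, labels)
```

Including the mean alongside the difference lets a classifier treat the same difference differently in different regions of feature space, which a single global Mahalanobis matrix cannot do.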
Keywords/Search Tags: Machine learning, Random forest, Metric learning, Exploration geochemistry, Geochemical anomaly recognition