
The Application Of SVM Classifier Based On Parameter Optimization To Early Lung Cancer Diagnosis

Posted on: 2015-08-05
Degree: Master
Type: Thesis
Country: China
Candidate: D Shang
GTID: 2284330431993621
Subject: Microelectronics and Solid State Electronics
Abstract/Summary:
This paper first reviews the history and current state of lung cancer diagnosis. On this basis, it proposes combining the support vector machine (SVM) with early diagnosis of lung cancer, applying machine learning to the classification of real data. The SVM is a classifier with high classification accuracy, good fault tolerance, and good generalization ability. It can solve classification problems involving small samples, nonlinearity, and high dimensionality, which makes it highly practical. In applying an SVM, however, the choice of kernel function, kernel parameter, and penalty coefficient has a great influence on the result. The paper first loads the commonly used fisheriris data set to compare the classification performance of the polynomial kernel function and the RBF kernel function intuitively. The grid search (meshing) method is then used to find the best parameters c and g. To improve the classification performance, the genetic algorithm (GA) and particle swarm optimization (PSO) are each used to optimize the parameters. The optimized SVM algorithm is then applied to the classification of the lung cancer data. Finally, the results are compared with two currently popular classification methods, the C4.5 decision tree algorithm and the fuzzy neural network algorithm, and the classification performance of all the algorithms is displayed in ROC space.

Methods: On the basis of 5 clinical parameters and 21 radiological characteristics extracted from chest CT, 117 cases are randomly divided into a training set and a test set. The data are normalized and PCA is used to reduce the dimensionality. The SVM is then trained with the RBF kernel function, first using the commonly used grid search method to choose an appropriate kernel parameter g and penalty coefficient c. The test set is then used to test the ability of the SVM to distinguish lung cancer from non-lung cancer.
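
The grid-search step described above (normalize, reduce with PCA, train an RBF-kernel SVM, pick c and g by K-fold cross-validation) might look roughly like the following sketch. It is not the thesis code: scikit-learn is assumed as the toolbox, the iris (fisheriris) data stands in for the lung cancer data, and the split ratio, the number of PCA components, K = 5, and the grid ranges are all illustrative assumptions.

```python
# Sketch only: scikit-learn and the iris data are stand-ins chosen for this
# example; the thesis does not state which toolbox or settings were used.
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.model_selection import GridSearchCV, StratifiedKFold, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

# Random split into training and test sets (the 70/30 ratio is an assumption).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# Normalize -> PCA -> RBF-kernel SVM, fitted as one unit so that scaling and
# PCA are learned from the training folds only.
pipeline = Pipeline([
    ("scale", MinMaxScaler()),
    ("pca", PCA(n_components=2)),  # number of components is an assumption
    ("svm", SVC(kernel="rbf")),
])

# Grid over powers of two for c (SVC's C) and g (SVC's gamma); the ranges are
# illustrative, not the ones used in the thesis.
param_grid = {
    "svm__C": [2.0 ** k for k in range(-5, 11, 2)],
    "svm__gamma": [2.0 ** k for k in range(-10, 5, 2)],
}

search = GridSearchCV(
    pipeline,
    param_grid,
    cv=StratifiedKFold(n_splits=5, shuffle=True, random_state=0),  # K-CV, K = 5
    scoring="accuracy",
)
search.fit(X_train, y_train)

best_c = search.best_params_["svm__C"]
best_g = search.best_params_["svm__gamma"]
test_accuracy = search.score(X_test, y_test)  # accuracy of the refitted best model
print(f"best c={best_c}, best g={best_g}, test accuracy={test_accuracy:.3f}")
```

Putting the scaler and PCA inside the pipeline keeps the held-out fold from influencing the preprocessing, which would otherwise make the cross-validated accuracy look better than it is.
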
GA and PSO are then each used to optimize the parameters, and the preceding process is repeated. Every parameter selection method is carried out under K-fold cross-validation (K-CV). Finally, the results of the various methods are compared, including the C4.5 algorithm and the fuzzy neural network algorithm, which were tried during the exploration of algorithms, to find the advantages and disadvantages of each and choose the most appropriate one.

Results: The test set outputs show that the false alarm rate and the miss rate of the optimized SVM are lower than those of the SVM before optimization, and that the classification accuracy is improved. The PSO optimization method performs best: of 44 test samples, 3 are misclassified (a false positive on sample 5 and false negatives on samples 36 and 38), and its area under the ROC curve (AUC) is the largest. The GA optimization method follows, with 4 mistakes in 44 test samples. The TMF FNN makes 5 mistakes, the GMF FNN makes 4, and the C4.5 algorithm is the worst. Moreover, the PSO optimization method is not sensitive to changes in how the case samples are grouped, and it has better generalization ability and faster operation speed. The PSO-optimized SVM is therefore more suitable for lung cancer diagnosis and worthy of further research.
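
To make the PSO parameter search concrete, here is a minimal global-best PSO sketch over (log2 c, log2 g). It is an illustration under stated assumptions, not the thesis implementation: the swarm size, iteration count, inertia and acceleration coefficients, and search bounds are arbitrary choices, and `toy_cv_error` is a hypothetical stand-in for the real objective, which would be the K-fold cross-validation error of the RBF SVM.

```python
import random


def pso_minimize(objective, bounds, n_particles=20, n_iters=50,
                 inertia=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize objective(position) inside box bounds with a basic global-best PSO."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                  # best position seen by each particle
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=pbest_val.__getitem__)
    gbest, gbest_val = pbest[g][:], pbest_val[g]  # best position seen by the swarm

    for _ in range(n_iters):
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                # Velocity: inertia + pull toward personal best + pull toward global best.
                vel[i][d] = (inertia * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Move, then clamp the coordinate to the search range.
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val


def toy_cv_error(position):
    """Hypothetical stand-in for the K-fold CV error of an RBF SVM.

    A real objective would train the SVM with c = 2**log2_c and g = 2**log2_g
    and return 1 - cross-validated accuracy.
    """
    log2_c, log2_g = position
    return 0.05 + 0.01 * (log2_c - 3.0) ** 2 + 0.02 * (log2_g + 4.0) ** 2


bounds = [(-5.0, 10.0), (-10.0, 5.0)]  # assumed search ranges for log2(c), log2(g)
(best_log2_c, best_log2_g), best_err = pso_minimize(toy_cv_error, bounds)
print(f"c = {2 ** best_log2_c:.3f}, g = {2 ** best_log2_g:.5f}, error = {best_err:.4f}")
```

Searching in log2 space mirrors the power-of-two spacing commonly used for grid search, so the swarm treats small and large values of c and g evenly.
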
Keywords/Search Tags: support vector machine (SVM), lung cancer diagnosis, genetic algorithm, particle swarm optimization, kernel function, penalty parameter, fuzzy neural network, C4.5 algorithm, data classification