| Lung cancer has the highest incidence of all cancers, but early detection can effectively improve patient survival. Compared with traditional imaging and pathological diagnosis, electronic nose technology based on exhaled-breath analysis has attracted great attention because it is non-invasive, simple to operate, and low in cost. At present, most related studies use healthy subjects as the control group. However, few studies have examined separability for groups at high risk of lung cancer, such as long-term smokers and people with chronic obstructive pulmonary disease (COPD). Moreover, few studies have addressed the identification of lung cancer stages with an electronic nose system. Notably, the choice of data analysis method has a significant impact on the recognition performance of an electronic nose system, yet research on data analysis methods for electronic noses remains scarce. In this study, an electronic nose was used to measure 212 breath samples from 5 groups (healthy smokers, healthy non-smokers, lung cancer patients, COPD patients, and patients with other diseases). The data were analyzed with several traditional machine learning methods and one deep learning method, and the recognition results of the two kinds of methods were compared and analyzed. The main work of this paper is as follows: (1) Research on data analysis methods: two kinds of data analysis methods, i.e., traditional machine learning and deep learning, were adopted. The traditional machine learning approach combined 3 different feature extraction methods with 2 classification methods to perform pattern recognition on the data sets. For deep learning, a long short-term memory (LSTM) network was used. The recognition performance of the traditional machine learning methods was compared with that of the deep learning method. (2) This paper studied how well the electronic nose distinguishes lung cancer patients from the healthy control group. Among the methods tested, the LSTM algorithm
obtained the best recognition performance, with sensitivity, specificity, and accuracy of 95.21%, 90.77%, and 91.03%, respectively. Moreover, 84 lung cancer patients at different clinical stages (6 at stage II, 38 at stage III, and 40 at stage IV) were classified, and the KPCA-SVM algorithm achieved the highest accuracy, exceeding 82%. These results show that the system can recognize the stages of advanced lung cancer to a certain extent. In addition, because populations at high risk of lung cancer (such as smokers and people with chronic obstructive pulmonary disease) are key targets for lung cancer screening, this paper also carried out a preliminary study on distinguishing the high-risk population, lung cancer patients, and healthy subjects. In the classification of the high-risk population, lung cancer patients, and the healthy control group, the recognition accuracy for lung cancer exceeded 86%. |
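
The sensitivity, specificity, and accuracy reported above are standard binary classification metrics. As a minimal sketch of how they are usually computed, assuming lung cancer is coded as the positive class (1) and the control group as 0, the hypothetical helper `binary_metrics` below derives all three from true and predicted labels. The function name and the toy labels are illustrative assumptions and are not taken from the study or its data.

```python
from typing import Sequence, Tuple


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float, float]:
    """Return (sensitivity, specificity, accuracy) for binary labels.

    Assumed coding (not from the source): 1 = lung cancer, 0 = control.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # patients correctly flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # patients missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # controls correctly cleared
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # controls wrongly flagged

    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")

    sensitivity = tp / (tp + fn)            # true positive rate
    specificity = tn / (tn + fp)            # true negative rate
    accuracy = (tp + tn) / len(pairs)       # overall fraction correct
    return sensitivity, specificity, accuracy


if __name__ == "__main__":
    # Toy labels for illustration only: 4 patients and 6 controls.
    y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
    y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
    sens, spec, acc = binary_metrics(y_true, y_pred)
    print(f"sensitivity={sens:.2%} specificity={spec:.2%} accuracy={acc:.2%}")
```

Because accuracy weights each class by its sample count, it can differ noticeably from both sensitivity and specificity when the groups are unbalanced, which is why all three figures are reported together.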