Prompt early diagnosis of gastric cancer can effectively decrease its incidence and mortality. The currently available clinical methods for early diagnosis of gastric cancer have shortcomings such as high subjectivity, cumbersome and time-consuming test procedures, and a tendency toward missed diagnoses. Therefore, developing an objective and rapid method for accurate early diagnosis of gastric cancer has important clinical and scientific value. This paper investigates the feasibility of fluorescence hyperspectral imaging combined with machine learning for the early diagnosis of gastric cancer. In vitro gastric mucosal tissues with a diagnosis confirmed by histopathological examination were collected from 120 patients in the Department of Gastroenterology and Endocrinology, the 74th Group Army Hospital of the People’s Liberation Army, and the Department of Gastroenterology, Zhujiang Hospital of Southern Medical University, and were divided into three groups: a non-precancerous lesion group (n=53), a precancerous lesion group (n=35), and a gastric cancer group (n=32). The main research methods and ideas of this paper are as follows. First, fluorescence hyperspectral imaging was used to acquire fluorescence spectral images over the range 450-680 nm. The images were binarized before the region of interest was extracted. Then, the fluorescence spectra of 100 randomly selected pixels were collected, and the differences and changes in fluorescent components among the non-precancerous lesion, precancerous lesion, and gastric cancer groups were analyzed. Thereafter, conventional machine learning methods were combined with the spectral features to construct early diagnosis models of gastric cancer (covering the non-precancerous lesion, precancerous lesion, and gastric cancer groups). Second, a spectral-spatial classification method was used to combine spectral and spatial information. At the same time, ensemble learning methods were
combined with the manually extracted "spectral+spatial" features to construct an early diagnosis model of gastric cancer, which was compared with the model built by ensemble learning on the spectral features alone. Finally, deep learning methods were used to automatically identify and extract the "spectral+spatial" features to construct an early diagnosis model of gastric cancer. The results of spectral analysis and machine learning modeling were as follows: (1) Fluorescence spectral analysis results: The fluorescence peak at 496 nm was more intense in gastric cancer tissues, while the fluorescence peaks at 640 and 670 nm were weaker, which may be due to abnormal metabolism of the Schiff base of pyridoxal phosphate and of hematoporphyrin IX in malignant tissues. Fluorescence at 546 nm corresponds to emission from phospholipids in the body. The intensity of this emission peak followed the order non-precancerous lesion group > precancerous lesion group > gastric cancer group. Aberrant phospholipid metabolism reflects the severity of lesion progression. These differences show the changes in fluorophore composition in tissues during tumor transformation. The endogenous fluorescent substances can therefore be used as spectral biomarkers to distinguish non-precancerous lesions, precancerous lesions, and gastric cancer. However, this method is greatly affected by spatial distribution differences in tissue components, individual differences, and device noise, which aggravate the overlap of the fluorescence spectra and make misjudgment likely. (2) Model results based on conventional machine learning (Partial Least Squares-Discriminant Analysis, PLS-DA; Support Vector Machine, SVM): With PLS-DA, the accuracy, specificity, and sensitivity were 81.2%, 79.8%, and 86.3%, respectively, for the non-precancerous lesion group compared with the precancerous lesion group; 87.5%, 87.5%, and 89.9%, respectively, for the non-precancerous lesion group compared with the gastric cancer
group; and 82.1%, 95.3%, and 76.4%, respectively, for the precancerous lesion group compared with the gastric cancer group. With SVM, the accuracy, specificity, and sensitivity were 92.8%, 94.0%, and 90.9%, respectively, for the non-precancerous lesion group compared with the precancerous lesion group; 95.6%, 96.5%, and 94.3%, respectively, for the non-precancerous lesion group compared with the gastric cancer group; and 94.9%, 94.7%, and 95.8%, respectively, for the precancerous lesion group compared with the gastric cancer group. The accuracy of the early diagnosis models constructed by SVM was above 92% in all cases, significantly better than that of PLS-DA. (3) Model results based on ensemble learning (eXtreme Gradient Boosting, XGBoost): With XGBoost combined with the spectral-spatial classification method, the accuracy, specificity, and sensitivity were 92.0%, 90.6%, and 93.4%, respectively, for the non-precancerous lesion group compared with the precancerous lesion group; 91.5%, 90.3%, and 92.5%, respectively, for the non-precancerous lesion group compared with the gastric cancer group; and 91.3%, 90.5%, and 92.1%, respectively, for the precancerous lesion group compared with the gastric cancer group. With XGBoost combined with the spectral classification method, the accuracy, specificity, and sensitivity were 91.6%, 88.4%, and 90.4%, respectively, for the non-precancerous lesion group compared with the precancerous lesion group; 95.8%, 88.4%, and 92.7%, respectively, for the non-precancerous lesion group compared with the gastric cancer group; and 93.9%, 93.6%, and 93.7%, respectively, for the precancerous lesion group compared with the gastric cancer group. The accuracy of the early diagnosis models constructed by XGBoost was above 91% in all cases, but the spectral-spatial classification method did not show much improvement over the spectral classification method. (4) Model results based on the deep
learning (ResNet-34) algorithm: With ResNet-34 combined with the spectral-spatial classification method, the overall accuracy for the non-precancerous lesion, precancerous lesion, and gastric cancer groups was 96.5%, with specificities of 96.0%, 97.3%, and 96.7% and sensitivities of 97.0%, 96.3%, and 96.6%, respectively. The comparison showed that the specificity and sensitivity of the non-linear models (SVM, XGBoost, ResNet-34) were significantly better than those of the linear model (PLS-DA). Spectral and image data both contain significant non-linear effects caused by the device, the environment, and individual differences, and non-linear models can more effectively extract the category information related to the lesion. The deep learning model outperformed the conventional machine learning and ensemble learning models. Therefore, the early diagnosis model of gastric cancer constructed through fluorescence hyperspectral imaging combined with machine learning can provide better reference information for the early diagnosis of gastric cancer. In addition, deep learning combined with the spectral-spatial classification method can increase the diagnostic accuracy and is expected to become a new method for the early diagnosis of gastric cancer.
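
The three-class results above report one overall accuracy plus a specificity and a sensitivity for each group. The abstract does not state how these per-group values were derived; a common convention, assumed here, is a one-vs-rest reading of the confusion matrix. The following minimal sketch illustrates that convention only. The function name `one_vs_rest_metrics` and the counts in `confusion` are hypothetical and are not data from this study.

```python
def one_vs_rest_metrics(confusion, labels):
    """Compute overall accuracy and per-class sensitivity/specificity.

    confusion[i][j] is the number of samples whose true class is i
    and whose predicted class is j (one-vs-rest convention, assumed).
    """
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[k][k] for k in range(len(labels)))
    metrics = {"overall_accuracy": correct / total}
    for k, name in enumerate(labels):
        tp = confusion[k][k]                       # class k predicted as k
        fn = sum(confusion[k]) - tp                # class k predicted as other
        fp = sum(row[k] for row in confusion) - tp # other predicted as k
        tn = total - tp - fn - fp                  # other predicted as other
        metrics[name] = {
            "sensitivity": tp / (tp + fn),
            "specificity": tn / (tn + fp),
        }
    return metrics


# Hypothetical, purely illustrative counts (100 samples per true class).
labels = ["non-precancerous", "precancerous", "gastric cancer"]
confusion = [
    [90, 6, 4],
    [5, 90, 5],
    [2, 8, 90],
]
result = one_vs_rest_metrics(confusion, labels)
for name in labels:
    print(name, result[name])
print("overall accuracy:", result["overall_accuracy"])
```

Under this convention, sensitivity for a group is the fraction of its own samples that are recognized, while specificity is the fraction of samples from the other two groups that are correctly not assigned to it.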