| Colorectal cancer is one of the most common cancers worldwide. Because it progresses slowly, early diagnosis plays a key role in whether it can be cured by surgery. Endoscopy is currently the gold standard for detecting precancerous lesions, but its invasiveness and complicated procedure prevent it from serving as an early screening tool for large populations. Liquid biopsy based on Raman spectroscopy can identify the molecular and cellular changes in serum caused by the onset or early progression of cancer, and it is non-invasive, simple, and efficient. However, serum Raman spectra are complex in composition, have a low signal-to-noise ratio, and are poorly standardized, so existing algorithms suffer from low accuracy, insufficient interpretability, and questionable generalization ability. Therefore, building on an optimized deep learning classification model, this paper further improves model performance through transfer learning and validates it on a purpose-built dataset, achieving high-accuracy early diagnosis of colorectal cancer. The main research contents of this paper are as follows. (1) A Raman spectral dataset for colorectal cancer is constructed and preprocessed. Culture media from 41 groups of cell lines and serum from 153 patients were collected, and surface-enhanced Raman spectra (SERS) and Raman spectra (RS) were acquired from the cell and serum samples for subsequent study. To address the noise and fluorescence background common in Raman spectral data, the spectra are preprocessed by data cleaning, smoothing and denoising, baseline correction, cropping of the Raman shift range, and normalization. The preprocessing methods used require no prior knowledge of the spectra and reduce noise interference while retaining the original Raman spectral characteristics, thereby strengthening the effective information and improving model training efficiency. (2) A cell Raman spectrum classification method based on the self-attention mechanism is proposed. To address the insufficient modeling ability of existing methods, MS-Former, a general Raman spectral classification model based on the self-attention mechanism, is proposed; it can handle high-noise RS data, which reduces the data collection requirements for clinical applications. The proposed model achieved classification accuracies of 99.12% and 90.88% on the cellular SERS and RS datasets, respectively. To address the poor interpretability of deep learning models, Grad-CAM is used to visualize the contribution of each spectral region, which provides intuitive insight into the model's decisions, and the key spectral frequencies are extracted to verify that the spectral positions the model attends to are meaningful. (3) A serum Raman spectrum classification method based on domain adaptation is proposed. To address the complex composition of serum samples and the insufficient generalization ability of the model, an early diagnosis framework for colorectal cancer is proposed, which is intended to enable large-scale population screening through Raman spectroscopy-based serum testing. Within this framework, a transfer learning method based on an improved DANN effectively improves data utilization efficiency and model generalization ability, achieving a classification accuracy of 96.25% on the serum dataset. The experimental results further indicate that the early diagnosis method for colorectal cancer based on transfer learning and Raman spectroscopy has potential for clinical translation. |
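
The preprocessing chain named in item (1) (smoothing, baseline correction, Raman shift cropping, normalization) can be illustrated with a minimal sketch. The abstract does not say which algorithms were used, so everything concrete here is an assumption chosen for illustration: the moving-average smoother, the rolling-minimum baseline estimate, the 600 to 1800 cm^-1 window, min-max scaling, and all function names. The data-cleaning step is omitted, and the input is a synthetic spectrum rather than real data.

```python
import math
from statistics import fmean


def moving_average(y, window=5):
    """Smooth with a centred moving average; the window shrinks at the edges."""
    half = window // 2
    return [fmean(y[max(0, i - half): i + half + 1]) for i in range(len(y))]


def rolling_min_baseline(y, window=51):
    """Estimate a slowly varying background as a smoothed rolling minimum.

    Assumption: Raman peaks are much narrower than the window, so the local
    minimum tracks the fluorescence background rather than the peaks.
    """
    half = window // 2
    mins = [min(y[max(0, i - half): i + half + 1]) for i in range(len(y))]
    return moving_average(mins, window)


def crop(shifts, y, lo, hi):
    """Keep only the points whose Raman shift lies in [lo, hi]."""
    kept = [(s, v) for s, v in zip(shifts, y) if lo <= s <= hi]
    return [s for s, _ in kept], [v for _, v in kept]


def min_max(y):
    """Rescale intensities to [0, 1]; a flat spectrum maps to all zeros."""
    lo, hi = min(y), max(y)
    span = hi - lo
    return [(v - lo) / span if span else 0.0 for v in y]


def preprocess(shifts, intensities, lo=600.0, hi=1800.0):
    """Smooth, subtract the baseline, crop, and normalize one spectrum."""
    smoothed = moving_average(intensities)
    baseline = rolling_min_baseline(smoothed)
    corrected = [v - b for v, b in zip(smoothed, baseline)]
    shifts_c, corrected_c = crop(shifts, corrected, lo, hi)
    return shifts_c, min_max(corrected_c)


# Synthetic example: one Gaussian peak at 1000 cm^-1 on a sloping background.
shifts = [400 + 2 * i for i in range(801)]  # 400 .. 2000 cm^-1
intensities = [0.002 * s + 5.0 * math.exp(-((s - 1000) / 10.0) ** 2) for s in shifts]

shifts_out, spectrum_out = preprocess(shifts, intensities)
peak_shift = shifts_out[spectrum_out.index(max(spectrum_out))]
print(len(spectrum_out), peak_shift)
```

In practice the smoother and the baseline estimator would likely be replaced by methods tuned to the instrument, but the order of operations follows the sequence listed in the abstract.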