
The Research Of Colorectal Cancer Risk Prediction Model Based On Dimensionality Reduction And Regression Analysis

Posted on: 2018-11-29
Degree: Master
Type: Thesis
Country: China
Candidate: C Q Zheng
GTID: 2334330536473575
Subject: Computer application technology
Abstract/Summary:
Colorectal cancer (CRC) has become one of the most common malignant tumors in the world. About 1.2 million new cases are reported worldwide each year, and around 600,000 people die from the disease. For the past few decades the incidence of CRC in China remained low, but in recent years, as living standards have improved, diet and lifestyle have changed, and average life expectancy has lengthened significantly, CRC incidence and mortality in China have risen year by year. In addition to its high global morbidity, CRC also has a high mortality rate. The 5-year survival rate for patients with early-stage CRC is over 90%. However, because onset is relatively insidious, more than 60% of the patients seen in the clinic are already in the middle or late stages. Once local metastasis has been diagnosed, the 5-year survival rate drops to 68%, and CRC patients with distant metastasis have an annual survival rate of only 11%.

Researchers have done a great deal of work on the diagnosis and treatment of CRC, but its etiology and pathogenesis are still not completely clear. A large number of epidemiological studies have shown that the occurrence of CRC is a complicated process influenced by both environmental and genetic factors. However, it is not clear which environmental and genetic factors affect the development of CRC. Exploring the risk factors of CRC and predicting its incidence is therefore of great significance for early diagnosis and early treatment.

In this paper, a multi-level prediction model of CRC is established using biological classification, data dimensionality reduction, and regression analysis. In building this model, we propose the Generalized Kernel Recursive Maximum Correntropy (GKRMC) algorithm, a nonlinear regression method intended to improve the precision and accuracy of CRC prediction. The specific work of this paper includes:

(1) A GKRMC regression method. We propose a nonlinear regression method to predict the incidence of CRC in the regression analysis stage. We first introduce the basic concepts related to entropy, and then derive the GKRMC algorithm from basic maximum correntropy theory and kernel recursive least squares (KRLS), so that the prediction model can be trained on noisy samples and has strong resistance to noise.

(2) A multi-level prediction model for CRC. The whole process is presented as three modules: biological classification, data dimensionality reduction, and regression analysis. First, in the biological classification stage, in-depth biological knowledge is used to divide the experimental data into four categories: genetic information, demographic characteristics, lifestyle, and food. This division more closely reflects the real difference between gene polymorphisms and environmental factors. Then a dimensionality reduction model is established to select the features that show significant differences in relation to CRC. Finally, the regression stage is described: logistic regression, support vector machine, KRLS, and GKRMC are used as predictors, and their accuracy measures are analyzed to verify that the GKRMC algorithm has better predictive ability than the traditional algorithms.

(3) Experimental results and analysis. In the experimental part, the algorithm is implemented, the important experimental procedures and data are highlighted, and comparative experiments are carried out to show the superiority of the GKRMC algorithm. We first summarize the results of biological classification, then present the results of dimensionality reduction, and finally compare GKRMC with the traditional methods to demonstrate its advantage in the accuracy of CRC prediction.

Based on the above work, we explored the relationship between environmental factors, genetic factors, and the risk of CRC. The results showed that: (1) environmental and genetic factors play an important role in the pathogenesis of CRC; (2) with the screened biomarkers as inputs to the regression model, each individual's risk of CRC can be identified accurately and efficiently; (3) the proposed GKRMC algorithm has better predictive ability than the traditional regression methods.
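
The abstract does not give the GKRMC update equations, so the following is only a minimal sketch of the general idea it builds on: fitting a kernel regressor under a correntropy-style loss so that outlying samples are down-weighted. It is a batch, iteratively reweighted kernel ridge regression, not the recursive algorithm proposed in the thesis. The function names, the toy sine data, and the parameters `width`, `lam`, and `sigma` are all illustrative assumptions.

```python
import numpy as np

def gaussian_gram(x, width):
    """Gaussian (RBF) Gram matrix for 1-D inputs; `width` is an assumed bandwidth."""
    d = x[:, None] - x[None, :]
    return np.exp(-d**2 / (2.0 * width**2))

def fit_weighted_kernel_ridge(K, y, lam, weights=None):
    """Solve (W K + lam I) alpha = W y, the weighted kernel ridge system."""
    n = len(y)
    w = np.ones(n) if weights is None else weights
    return np.linalg.solve(w[:, None] * K + lam * np.eye(n), w * y)

def fit_correntropy(K, y, lam, sigma, n_iter=20):
    """Batch correntropy-style fit by iterative reweighting (hypothetical helper).

    Each sample gets weight exp(-e^2 / (2 sigma^2)), so samples with large
    residuals e contribute almost nothing to the next fit.
    """
    alpha = fit_weighted_kernel_ridge(K, y, lam)  # start from the plain fit
    for _ in range(n_iter):
        err = y - K @ alpha
        w = np.exp(-err**2 / (2.0 * sigma**2))
        alpha = fit_weighted_kernel_ridge(K, y, lam, w)
    return alpha

# Toy data (an assumption, unrelated to the CRC data set): a sine curve
# with every tenth sample corrupted by a large positive outlier.
x = np.linspace(0.0, 6.0, 100)
clean = np.sin(x)
y = clean.copy()
y[np.arange(5, 100, 10)] += 10.0

K = gaussian_gram(x, width=1.0)
alpha_ridge = fit_weighted_kernel_ridge(K, y, lam=0.1)
alpha_mcc = fit_correntropy(K, y, lam=0.1, sigma=1.0)

def rmse(pred):
    """Root-mean-square error against the uncorrupted curve."""
    return float(np.sqrt(np.mean((pred - clean) ** 2)))

ridge_rmse = rmse(K @ alpha_ridge)
mcc_rmse = rmse(K @ alpha_mcc)
print("plain kernel ridge RMSE:", ridge_rmse)
print("correntropy-weighted RMSE:", mcc_rmse)
```

On this toy problem the squared-error fit is pulled toward the outliers, while the reweighted fit largely ignores them. That is the kind of noise resistance the abstract attributes to a correntropy-based criterion. A recursive variant in the spirit of KRLS would update the solution one sample at a time instead of re-solving the full system.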
Keywords/Search Tags:Colorectal cancer, Environment factor, Gene polymorphism, Dimensionality reduction, Predictive model