
An Integrated Prediction Method For Cancer Classification Based On MiRNA Expression Profiles And LncRNA Expression Profiles

Posted on: 2020-06-15  Degree: Master  Type: Thesis
Country: China  Candidate: J Huang
GTID: 2404330623951402  Subject: Computer technology
Abstract/Summary:
In recent years, cancer mortality has continued to rise, and cancer has become one of the diseases most threatening to human life. A growing number of studies show that the occurrence of cancer is closely tied to miRNA and lncRNA. At the molecular level, cancer is caused by abnormal gene expression, which results in abnormal cell growth, and gene expression is involved in controlling the whole course of the disease. Finding and treating cancer as early as possible can alleviate pain and improve the cure rate. Traditional cancer diagnosis is generally based on morphology. Morphological assessment is highly subjective, and it has difficulty detecting cell malignancy at the early molecular level, which leads to a large number of missed diagnoses and misdiagnoses. With the rapid development of gene chip technology, rapid acquisition of gene expression profile data has become a reality. Applying machine learning to the gene expression data of human cancer cells and normal cells makes it possible to build a classification model that identifies cancer cells and thereby predicts the occurrence of cancer.

This thesis proposes a new integrated prediction method for cancer classification based on miRNA expression profiles and lncRNA expression profiles. First, miRNA samples and lncRNA samples with the same sample name are merged, and the data set is divided into different training sets and test sets. Feature selection then combines several techniques. In the filter stage, lncRNA-miRNA relationship data is introduced to associate features, and mutual information is used to remove irrelevant genes. In the wrapper stage, a wrapper method built on a genetic algorithm and an ensemble model is proposed to refine the selection, removing redundancy and searching for the best feature subset. Next, for each algorithm, multiple models are trained on multiple sampling spaces, and the classification scores of those models on the validation set are computed as the posterior information of that algorithm. The posterior information of the various algorithms and their corresponding predicted outputs are combined to make the final decision. Finally, the overall prediction model predicts the test set, and its classification ability is evaluated based on the results of the ten-fold test.

Together, the data construction layer, the feature selection layer, and the prediction model construction layer form the integrated prediction method for cancer classification. Three typical cancers in the TCGA database were studied: breast cancer, liver cancer, and gastric cancer. The classification accuracy for all three cancers on the combined miRNA and lncRNA expression profiles was over 98%. The results show the following. In terms of data fusion, classifying on the combined miRNA and lncRNA expression profiles gives a clear improvement over using either type of data alone. In terms of feature selection, introducing the lncRNA-miRNA relationship data and combining the filter method with the wrapper method allows the selected feature genes to represent the key classification information of the full gene set well. In terms of the prediction model, using multiple models from multiple algorithms combines the advantages of the different algorithms and improves the generalization ability of the overall model.
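
Two of the steps described above, the mutual-information filter and the score-weighted final decision, can be illustrated with a minimal Python sketch. It is not the thesis implementation. It assumes that expression values have already been discretized, that a fixed threshold decides which features to keep, and that the "posterior information" of each algorithm is a single validation score used as a voting weight. The function names, the feature names (`miR-A`, `lnc-B`), the algorithm names, and all numbers are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def mutual_information(feature, labels):
    """Mutual information (in nats) between two discrete sequences."""
    n = len(labels)
    joint = Counter(zip(feature, labels))
    px = Counter(feature)
    py = Counter(labels)
    mi = 0.0
    for (x, y), count in joint.items():
        p_xy = count / n
        # p(x, y) / (p(x) * p(y)) simplifies to count * n / (px[x] * py[y])
        mi += p_xy * math.log(count * n / (px[x] * py[y]))
    return mi


def filter_features(expression, labels, threshold):
    """Keep features whose mutual information with the label exceeds threshold.

    `expression` maps a feature name to its discretized value per sample.
    The fixed threshold is an assumption of this sketch.
    """
    return [name for name, values in expression.items()
            if mutual_information(values, labels) > threshold]


def weighted_vote(predictions, validation_scores):
    """Pick the label with the largest total validation-score weight.

    Assumption: each algorithm contributes one predicted label and one
    validation score; the thesis may combine them differently.
    """
    totals = {}
    for algorithm, label in predictions.items():
        totals[label] = totals.get(label, 0.0) + validation_scores[algorithm]
    return max(totals, key=totals.get)


# Hypothetical toy data: four samples, two discretized features.
labels = ["tumor", "tumor", "normal", "normal"]
expression = {
    "miR-A": ["high", "high", "low", "low"],   # tracks the label exactly
    "lnc-B": ["high", "low", "high", "low"],   # independent of the label
}
kept = filter_features(expression, labels, threshold=0.1)

# Hypothetical per-algorithm outputs for one test sample.
decision = weighted_vote(
    {"svm": "tumor", "rf": "normal", "knn": "normal"},
    {"svm": 0.95, "rf": 0.40, "knn": 0.45},
)
```

In this toy data `miR-A` is kept and `lnc-B` is dropped, because only the former carries information about the label. The vote goes to "tumor" even though two of the three algorithms predict "normal", since the single high-scoring algorithm outweighs the two weaker ones. That is the effect weighting by validation score is meant to produce.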
Keywords/Search Tags:Cancer classification, miRNA expression profile, lncRNA expression profile, Feature selection, Mutual information, Genetic algorithm, Multi-sample set multi-algorithm prediction model