Prediction Of Primary Sites Of Metastatic Cervical Carcinoma From Unknown Primary Using Machine Learning Method

Posted on: 2020-07-01
Degree: Master
Type: Thesis
Country: China
Candidate: J J Jiang
GTID: 2404330575989584
Subject: Surgery
Abstract/Summary:
Metastatic cervical carcinoma from unknown primary (MCCUP) is defined as metastatic disease in the lymph nodes of the neck without any evidence of a primary tumor after appropriate investigation. It is a type of cancer of unknown primary and accounts for 1-4% of cases. Squamous cell histology is the predominant pathological type, accounting for 75-90%. However, despite a comprehensive diagnostic work-up including fibroscopy, computed tomography, magnetic resonance imaging, positron emission tomography, fine-needle aspiration, and panendoscopy, the primary site remains difficult to identify in cases of MCCUP. A new and effective method to determine the primary site in MCCUP is therefore urgently needed.

The development of high-throughput and next-generation sequencing technologies has improved our understanding of the molecular landscape of cancer, providing a basis for discovering predictive biomarkers for cancer diagnosis. Relevant high-throughput studies indicate that squamous cell carcinomas (SCC) share certain common histological characteristics and molecular signatures. This makes it more difficult to identify the primary site of MCCUP, as its pathologic type is primarily SCC. Another high-throughput study showed that esophageal squamous cell carcinoma (ESCC) and head and neck squamous cell carcinoma (HNSCC) are strongly similar, and both are important potential primary sites of MCCUP. In this study, we therefore investigated a new method to distinguish these two squamous cell carcinomas in order to assist in diagnosing the primary site of MCCUP.

We downloaded microarray data for ESCC and HNSCC from public databases, imported them into R, and obtained the expression matrices after pre-processing. The limma R package was used to identify differentially expressed genes (DEGs) of ESCC and HNSCC separately. The intersect function in R was used to identify the common DEGs across the ESCC datasets (GSE20347, GSE23400, and GSE38129) and across the HNSCC datasets (GSE9844 and GSE23036). The Venn function in R was then used to identify the common and difference sets between the common DEGs of ESCC and the common DEGs of HNSCC (the common set represents the DEGs shared by ESCC and HNSCC, while the difference sets represent the DEGs unique to each). Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analyses and protein-protein interaction (PPI) network analysis were performed on both the common and the unique DEGs of the two squamous cell carcinomas. Finally, based on the DEGs unique to HNSCC and to ESCC, random selection was used for feature selection, and models were trained with the K-nearest neighbor, random forest, and support vector machine (SVM) algorithms to predict the type of tumor tissue.

We found that both the common and the specific genes of these two squamous cell carcinomas showed many similarities and differences in the molecular functions and pathways enriched in the GO and KEGG analyses, as well as in the PPI network analysis. When the models built with the three machine learning algorithms were validated on an independent dataset, the SVM model composed of five genes had the highest accuracy.

In this study, we explored the similarities and differences between the DEGs of ESCC and HNSCC in terms of enriched GO functions, KEGG pathways, and PPI networks. The SVM model consisting of five genes can effectively distinguish the two squamous cell carcinomas, which may benefit the accurate diagnosis of MCCUP patients.
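
The set logic described in the abstract (common DEGs within each cancer type, then shared and unique DEGs between the two types) can be illustrated with a minimal sketch. This is written in Python rather than the R functions the study used, and it is not the author's code. The gene symbols (`GENE_A` and so on) and the helper `common_degs` are hypothetical placeholders, not results from the thesis; only the GEO accession numbers come from the abstract.

```python
# Hypothetical per-dataset DEG lists. Gene symbols are placeholders,
# NOT genes reported in the thesis; only the GSE accessions are from the abstract.
escc_degs = {
    "GSE20347": {"GENE_A", "GENE_B", "GENE_C", "GENE_D"},
    "GSE23400": {"GENE_A", "GENE_B", "GENE_C", "GENE_E"},
    "GSE38129": {"GENE_A", "GENE_B", "GENE_C", "GENE_F"},
}
hnscc_degs = {
    "GSE9844": {"GENE_A", "GENE_G", "GENE_H", "GENE_I"},
    "GSE23036": {"GENE_A", "GENE_G", "GENE_H", "GENE_J"},
}


def common_degs(per_dataset):
    """Return the genes that are DEGs in every dataset of one cancer type."""
    return set.intersection(*per_dataset.values())


# Step 1: common DEGs within each cancer type.
escc_common = common_degs(escc_degs)
hnscc_common = common_degs(hnscc_degs)

# Step 2: common set and difference sets between the two cancer types.
shared = escc_common & hnscc_common      # DEGs common to ESCC and HNSCC
escc_only = escc_common - hnscc_common   # DEGs unique to ESCC
hnscc_only = hnscc_common - escc_common  # DEGs unique to HNSCC

# Step 3: the unique DEGs form the candidate feature pool for the classifiers.
candidate_features = sorted(escc_only | hnscc_only)
```

With these placeholder inputs, `shared` contains only `GENE_A`, while `candidate_features` holds the four genes unique to one cancer type. In the study, a pool of this kind was the starting point for random feature selection and model training.
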
Keywords/Search Tags: metastatic cervical carcinoma from unknown primary, Gene Ontology, Kyoto Encyclopedia of Genes and Genomes pathway analysis, protein-protein interaction network, machine learning, support vector machine