
The Applications Of Machine Learning Algorithms In Bioinformatics Analysis Of Lung Cancers And Fault Detection And Diagnosis

Posted on: 2020-03-04
Degree: Master
Type: Thesis
Country: China
Candidate: M Rui Felizardo
GTID: 2480306131964719
Subject: Chemical Engineering and Technology
Abstract/Summary:
While tobacco exposure is the cause of the vast majority of lung cancers, an important percentage arises in lifetime never-smokers. Documenting the precise extent of tobacco-induced molecular changes may be of importance. The contribution of environmental tobacco smoke (ETS) is also difficult to assess. We developed and validated a quantitative method to assess the extent of tobacco-related molecular damage by combining the most characteristic changes associated with tobacco smoke: the tumor mutation burden (TMB) and the type of molecular changes present in lung cancers. Using maximum entropy (MaxEnt) as a classifier, we developed an F score. F score values >0 were considered to show evidence of tobacco-related molecular damage, while values ≤0 were considered to lack such evidence. Our main contribution to this work, however, was data downloading and pre-processing.

Process monitoring using kernel principal component analysis (KPCA) has been reported. The goal of this research is to improve KPCA-based fault detection and diagnosis, with confidence limits obtained through kernel density estimation (KDE). Conventional KPCA depends on the width parameter of the Gaussian kernel function, which is selected empirically. A single kernel function with a fixed width may not be effective enough to detect different types of faults, so a poorly chosen Gaussian kernel may degrade detection performance, and different faults may need different width values to maximize their monitoring performance. To address these issues, we incorporate an ensemble learning approach with Bayesian inference into KPCA. This approach overcomes the drawbacks of a single KPCA model: it is more robust to the choice of width parameter, and it significantly improves monitoring performance, as shown in the figures presented in the thesis. The contribution plot method was implemented to diagnose the root cause after a fault was detected by ensemble kernel principal component analysis with Bayesian inference (EKPCA-Bayes), and the root-cause variables were successfully identified.
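
The ensemble idea described above can be illustrated with a minimal NumPy sketch. It is not the thesis implementation, and everything specific in it is an assumption made for illustration: the class and function names, the synthetic data, the candidate widths, the use of the T² statistic alone, the Silverman-rule KDE bandwidth, and the exponential likelihoods in the Bayesian step (one formulation used in the process-monitoring literature).

```python
import numpy as np

def rbf_kernel(A, B, width):
    # Gaussian kernel k(a, b) = exp(-||a - b||^2 / width)
    d2 = (A**2).sum(1)[:, None] + (B**2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-np.maximum(d2, 0.0) / width)

def kde_limit(values, alpha=0.01, grid_size=2000):
    # Control limit = (1 - alpha) quantile of a Gaussian KDE of the training
    # statistic (bandwidth from Silverman's rule of thumb, an assumption).
    values = np.asarray(values, dtype=float)
    h = 1.06 * values.std(ddof=1) * values.size ** (-0.2)
    grid = np.linspace(values.min() - 3 * h, values.max() + 3 * h, grid_size)
    pdf = np.exp(-0.5 * ((grid[:, None] - values[None, :]) / h) ** 2).sum(axis=1)
    cdf = np.cumsum(pdf)
    cdf /= cdf[-1]  # normalise the numerically integrated density
    return grid[np.searchsorted(cdf, 1.0 - alpha)]

class KPCAMonitor:
    """One KPCA base model with a single kernel width (hypothetical helper)."""

    def __init__(self, width, n_components=3, alpha=0.01):
        self.width, self.n_components, self.alpha = width, n_components, alpha

    def _center(self, Kt):
        # Centre a kernel matrix in feature space using training statistics.
        return Kt - self.col_mean[None, :] - Kt.mean(axis=1, keepdims=True) + self.all_mean

    def fit(self, X):
        self.X = X
        K = rbf_kernel(X, X, self.width)
        self.col_mean, self.all_mean = K.mean(axis=0), K.mean()
        eigval, eigvec = np.linalg.eigh(self._center(K))
        top = np.argsort(eigval)[::-1][: self.n_components]
        lam = np.maximum(eigval[top], 1e-12)
        self.coef = eigvec[:, top] / np.sqrt(lam)  # unit-norm feature-space directions
        self.score_var = lam / X.shape[0]          # variance of each score
        self.limit = kde_limit(self.t2(X), self.alpha)
        return self

    def t2(self, Xnew):
        scores = self._center(rbf_kernel(Xnew, self.X, self.width)) @ self.coef
        return (scores**2 / self.score_var).sum(axis=1)

def fault_probability(t2, limit, alpha=0.01):
    # Posterior P(fault | x) for one base model, with assumed likelihoods
    # P(x | normal) = exp(-T2 / limit) and P(x | fault) = exp(-limit / T2).
    t2 = np.maximum(np.asarray(t2, dtype=float), 1e-12)
    p_x_normal, p_x_fault = np.exp(-t2 / limit), np.exp(-limit / t2)
    posterior = p_x_fault * alpha / (p_x_normal * (1 - alpha) + p_x_fault * alpha)
    return posterior, p_x_fault

def ensemble_bic(models, Xnew, alpha=0.01):
    # Combine the base models: posteriors weighted by P(x | fault).
    post, weight = zip(*(fault_probability(m.t2(Xnew), m.limit, alpha) for m in models))
    post, weight = np.array(post), np.array(weight)
    return (weight * post).sum(axis=0) / np.maximum(weight.sum(axis=0), np.finfo(float).tiny)

# Synthetic fault-free data (illustrative only), standardised before modelling.
rng = np.random.default_rng(0)
t = rng.uniform(-1, 1, size=(200, 1))
X_train = np.hstack([t, t**2, t**3]) + 0.05 * rng.standard_normal((200, 3))
Z = (X_train - X_train.mean(axis=0)) / X_train.std(axis=0)

# One base model per candidate width; the multipliers are arbitrary choices.
models = [KPCAMonitor(width=c * Z.shape[1]).fit(Z) for c in (1, 5, 25)]
bic = ensemble_bic(models, Z[:10])
alarms = bic > 0.01  # a sample is flagged when the combined index exceeds alpha
```

Under the assumed likelihoods, a sample whose T² equals a base model's limit has posterior fault probability exactly alpha, which is why the combined index is compared against alpha. A fuller monitor would typically track a residual statistic such as SPE alongside T².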
Keywords/Search Tags:Machine learning, lung cancer, pre-processing, Fault detection and diagnosis, KPCA-KDE, Ensemble learning, Bayesian Inference, Contribution plots