
Research On Weighted Clustering Algorithm Based On Tumor Gene Expression Data

Posted on: 2022-04-23   Degree: Master   Type: Thesis
Country: China   Candidate: N Ma   Full Text: PDF
GTID: 2504306341986649   Subject: Software engineering
Abstract/Summary:
In recent years, distinguishing tumor types by clustering tumor gene expression data has been a focus of bioinformatics. As the number of known cancer-related genes grows, exploring the correlations between cancers has become an open problem. Clustering algorithms can play an auxiliary role in precision medicine, where accurate classification is important for adjusting treatment plans. Clustering algorithms still have shortcomings, however. Fuzzy clustering has developed rapidly and attracted growing research interest, and the fuzzy c-means clustering algorithm (FCM) can express the characteristics of tumor gene expression data in a dynamic living system. Addressing the problems FCM exhibits on data sets, this thesis studies and applies it to general data sets, gene expression data sets, and tumor gene expression data sets. The main contents are as follows:

(1) To address the problem of local consistency between data points in similarity measures based on Euclidean distance, the Jeffrey divergence similarity measure is combined with the FCM algorithm, and a weighted FCM clustering algorithm based on the Jeffrey divergence similarity measure (JW-FCM) is proposed. The improved objective function is derived through a theorem and its corollary. In the experiments, the algorithm is compared with the classical k-means, DPC, and FCM algorithms on artificial data sets; the comparison of clustering results and accuracy shows that it improves the clustering effect. To better reflect the algorithm's role in practical applications, convergence analysis is carried out on three cancer data sets. The results show that the proposed algorithm achieves better clustering effect, accuracy, and convergence.

(2) On high-dimensional gene expression data sets, FCM cannot distinguish the different contributions that different attributes of the data make to clustering. To address this, a feature-weighted FCM algorithm based on attribute reduction of preprocessing results (RW-FCM) is proposed. Feature weights are obtained through data preprocessing and feature weighting, the ReliefF technique is combined with the FCM algorithm, and the algorithm is verified by comparing convergence and clustering accuracy on the data sets.

In brief, this thesis proposes two improved algorithms, JW-FCM and RW-FCM, carries out clustering experiments on different data sets, and applies feature weighting according to the differences in tumor gene expression data. The clustering quality of the algorithms is evaluated on these data sets. In the application part, the proposed algorithms are applied to tumor gene expression data sets, which demonstrates that the methods proposed in this thesis are effective and feasible, and shows that JW-FCM and RW-FCM have practical significance for biological classification.
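The idea shared by both algorithms, replacing the Euclidean distance in FCM with a feature-weighted dissimilarity, can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the abstract does not give the JW-FCM objective function or the RW-FCM weight derivation, so everything below is an assumption. The sketch assumes the symmetric Kullback-Leibler form of the Jeffrey divergence, J(p, q) = sum_j w_j (p_j - q_j) ln(p_j / q_j), assumes non-negative input data, and carries over the standard FCM weighted-mean center update as a heuristic. The function names `jeffrey_divergence` and `weighted_fcm` and the weight vector `w` are hypothetical; in an RW-FCM-style setting `w` might be supplied by a feature-scoring step such as ReliefF, which is not implemented here.

```python
import numpy as np

def jeffrey_divergence(X, V, w, eps=1e-12):
    """Feature-weighted Jeffrey divergence between rows of X (n, d) and V (c, d).

    Assumed form: sum_j w_j * (p_j - q_j) * ln(p_j / q_j); inputs must be non-negative.
    Returns an (n, c) matrix of dissimilarities.
    """
    P = X[:, None, :] + eps   # (n, 1, d)
    Q = V[None, :, :] + eps   # (1, c, d)
    return np.sum(w * (P - Q) * np.log(P / Q), axis=2)

def weighted_fcm(X, c, w=None, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Fuzzy c-means with the Euclidean distance swapped for the divergence above.

    w : per-feature weights (hypothetical; uniform if not given).
    m : fuzzifier, must be > 1.
    Returns the membership matrix U (n, c) and the centers V (c, d).
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.ones(d) / d if w is None else np.asarray(w, dtype=float) / np.sum(w)

    # Random initial memberships; each row sums to 1.
    U = rng.random((n, c))
    U /= U.sum(axis=1, keepdims=True)

    for _ in range(n_iter):
        Um = U ** m
        # Heuristic center update borrowed from standard FCM (an assumption,
        # not the exact minimizer for this divergence).
        V = (Um.T @ X) / Um.sum(axis=0)[:, None]
        # Standard FCM membership update with D in place of squared distance.
        D = jeffrey_divergence(X, V, w) + 1e-12
        inv = D ** (-1.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return U, V
```

Calling `U, V = weighted_fcm(X, c=3)` on a non-negative samples-by-genes matrix `X` returns soft memberships, and `U.argmax(axis=1)` gives hard cluster labels for comparison with k-means or plain FCM.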
Keywords/Search Tags:DNA microarray, Density Peak Clustering, Bat algorithm, Entropy Weighting K-Means Algorithm for Subspace Clustering, Tumor subtype clustering