Research On Fuzzy Clustering Algorithm Based On Cancer Gene Data

Posted on:2024-01-20

Degree:Master

Type:Thesis

Country:China

Candidate:S S Wang

Full Text:PDF

GTID:2544306932459944

Subject:Electronic information

Abstract/Summary:

With the development of high-throughput sequencing technology, a large volume of cancer gene expression data has emerged. These data cover many types of cancer. They are a great help to cancer research and treatment, but they also pose serious challenges for data analysis and processing. The challenges include high redundancy, polymorphism, high dimensionality, and high noise, so efficient data processing techniques are needed to extract useful information from gene data. Cluster analysis, a common data mining technique, has been applied to cancer gene expression data. Because these data are complex, many common clustering algorithms perform poorly on them, and high dimensionality blurs the boundaries that separate groups of samples. Fuzzy clustering, which builds on the idea of fuzzy sets, is therefore better suited to cluster analysis of cancer gene expression data. This thesis improves the fuzzy clustering algorithm to address the problems that arise when it is applied to cancer gene expression data. The main research contents are as follows:

(1) Fuzzy clustering depends strongly on the initial cluster centers and easily falls into local optima. To address this, a fuzzy clustering algorithm that combines the Cauchy distribution with the ant lion algorithm (CALOFCM) is proposed. First, the ant lion optimization algorithm is modified with a mutation based on the Cauchy distribution function, which weakens the pull of local extreme points on individuals and so increases the probability of escaping a local optimum. Second, the elite ant lions produced by the optimized ant lion algorithm are used as the initial cluster centers of the Fuzzy C-Means (FCM) algorithm. Finally, comparison experiments on UCI data sets and cancer gene expression data sets show that, compared with the k-means, DBSCAN, FCM, and ALOFCM algorithms, the proposed algorithm escapes local optima and achieves better clustering results.

(2) Cancer gene expression data are large in volume and contain a large amount of redundant information. To address this, a weighted cancer gene fuzzy clustering algorithm based on Fisher linear discriminant analysis (FLDAFCM) is proposed. First, Fisher linear discriminant analysis is introduced, and the Fisher linear discriminant rate is used to determine how much each attribute contributes to the sample data. Next, the weight formula is calculated and used to improve the fuzzy clustering algorithm. Finally, experimental verification is carried out on UCI data sets and cancer gene expression data sets, comparing FCM, DBSCAN, and the CALOFCM and FLDAFCM algorithms proposed in this thesis. The experimental results show that the weighted cancer gene fuzzy clustering algorithm combined with Fisher linear discriminant analysis clusters high-dimensional data better, and that cluster analysis of the data sets yields results of medical value.
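
The abstract gives no formulas, so the following is only a minimal sketch of the building blocks it names, not the thesis's implementation. It shows the standard FCM membership and center updates, started from externally supplied centers (where an optimizer such as the ant lion algorithm would hand over its elite solutions), with optional per-attribute weights in the distance. Everything specific is an assumption made for illustration: the function names, the `scale` and `feature_weights` parameters, the use of a plain between-class to within-class scatter ratio as a stand-in for the "Fisher linear discriminant rate", and the availability of provisional class labels for that ratio. The ant lion optimizer itself is not shown.

```python
import numpy as np


def cauchy_mutate(centers, rng, scale=0.1):
    """Perturb candidate centers with heavy-tailed Cauchy noise.

    `scale` is a hypothetical knob; the heavy tails occasionally produce
    large jumps, which is the usual motivation for Cauchy mutation.
    """
    return centers + scale * rng.standard_cauchy(size=centers.shape)


def fisher_scores(X, labels):
    """Per-attribute ratio of between-class to within-class scatter.

    Assumes provisional class labels exist; the abstract does not say
    how the thesis obtains them.
    """
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    overall_mean = X.mean(axis=0)
    between = np.zeros(X.shape[1])
    within = np.zeros(X.shape[1])
    for k in np.unique(labels):
        Xk = X[labels == k]
        between += len(Xk) * (Xk.mean(axis=0) - overall_mean) ** 2
        within += ((Xk - Xk.mean(axis=0)) ** 2).sum(axis=0)
    return between / np.maximum(within, 1e-12)


def fuzzy_c_means(X, init_centers, m=2.0, feature_weights=None,
                  max_iter=100, tol=1e-6):
    """Standard FCM iterations with an optional attribute-weighted distance.

    X has shape (n_samples, n_features); init_centers has shape (c, n_features).
    Returns (centers, U), where U[i, k] is the membership of sample i in
    cluster k as computed in the last iteration.
    """
    X = np.asarray(X, dtype=float)
    centers = np.asarray(init_centers, dtype=float).copy()
    if feature_weights is None:
        w = np.ones(X.shape[1])
    else:
        w = np.asarray(feature_weights, dtype=float)
    for _ in range(max_iter):
        # Weighted squared Euclidean distance from each sample to each center.
        diff = X[:, None, :] - centers[None, :, :]
        d2 = np.maximum((w * diff ** 2).sum(axis=2), 1e-12)
        # Membership update: u_ik is proportional to d_ik^(-2/(m-1)); rows sum to 1.
        inv = d2 ** (-1.0 / (m - 1.0))
        U = inv / inv.sum(axis=1, keepdims=True)
        # Center update: average of the samples weighted by u_ik^m.
        Um = U ** m
        new_centers = (Um.T @ X) / Um.sum(axis=0)[:, None]
        shift = np.linalg.norm(new_centers - centers)
        centers = new_centers
        if shift < tol:
            break
    return centers, U
```

In this sketch, one way to obtain weights is to normalize the scores so they sum to one (`w = s / s.sum()`) and pass them as `feature_weights`. Attributes with little class separation then contribute less to the distance, which is the general intent of attribute weighting on high-dimensional, redundant data.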

Keywords/Search Tags:

Gene expression data, Fuzzy clustering, Optimization algorithm, Cauchy distribution, Fisher linear discrimination

Related items

1	The Fuzzy Clustering Algorithm Research Based On Cancer Gene Data
2	Gene Expression Clustering Analysis Method
3	Applying Vertex Coloring Algorithm Based On Fuzzy Clustering To Identify Candidate Genes Associated With Alzheimer's Disease
4	Research On Weighted Clustering Algorithm Based On Tumor Gene Expression Data
5	Research On Brain MR Image Segmentation Algorithm Based On Fuzzy C-means Clustering
6	Research On Density Peaks Clustering Algorithm Based On Tumor Gene Expression Data
7	Research On Algorithm For Medical Image Segmentation Based On Fuzzy Cluster And Cuckoo Optimization
8	Research On Anomaly Detection For Medical Insurance Record Based On Improved Fuzzy Clustering Algorithm
9	Research On Cancer Subtype Clustering Algorithm Of Gene Expression Profile Data
10	Clustering Algorithm And Disease Association Research For Genetic Data