
Analysis And Application Of Single-Cell Data Based On Machine Learning Algorithm

Posted on: 2022-10-20
Degree: Master
Type: Thesis
Country: China
Candidate: Y Y Zhai
Full Text: PDF
GTID: 2517306491960299
Subject: Statistics
Abstract/Summary:
Single-cell RNA sequencing (scRNA-seq) technology enables researchers to analyze the heterogeneity of tissues or organs at the level of individual cells. Two important steps in analyzing scRNA-seq data are cell clustering and annotation, which support subsequent analysis of regulatory relationships between genes. However, because the sequencing data are sparse, high-dimensional, and noisy, single-cell clustering and annotation are challenging, and both accuracy and computational efficiency suffer. In addition, thanks to previous efforts, many annotated data sets are now available; these can assist the clustering and annotation steps and greatly reduce the cost in manpower and resources.

For unsupervised single-cell clustering, this thesis proposes the scziDesk model, an end-to-end framework that simultaneously denoises, reduces the dimension of, and clusters single-cell RNA data. We first use the zero-inflated negative binomial distribution to characterize single-cell gene expression data, and learn the low-dimensional manifold on which its parameters lie through a nonlinear denoising autoencoder. In the resulting latent space, we propose a self-training soft K-means clustering algorithm. The self-training step effectively aggregates similar cells, improving within-cluster compactness and between-cluster separability. Our method achieves better results than other single-cell RNA-seq clustering algorithms on both simulated and real data, and it scales well to large data sets.

For semi-supervised single-cell annotation, this thesis proposes the scAnCluster model, an end-to-end supervised clustering and annotation framework that integrates supervised, self-supervised, and unsupervised learning. The model follows the scziDesk approach of simultaneous denoising, dimension reduction, and clustering, and introduces a dynamically updated pseudo-labeling method to integrate the reference set, whose cell labels are known, with the target set to be annotated. Extensive experiments on simulated and real data verify that scAnCluster is an effective and robust single-cell supervised clustering and annotation algorithm. More importantly, it can identify potential novel cell types in the target set, which is of considerable practical significance.
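
As a rough illustration of two building blocks named in the abstract, the sketch below evaluates a zero-inflated negative binomial (ZINB) log-probability and computes soft K-means assignment weights for one latent point. It is a minimal sketch under stated assumptions, not the thesis's implementation: the parameter names `mu` (mean), `theta` (dispersion), and `pi` (zero-inflation probability), the temperature `alpha`, and the squared-Euclidean, softmax-style weighting are all illustrative choices that the abstract does not specify.

```python
import math


def nb_log_pmf(x, mu, theta):
    """Log-probability of count x under a negative binomial
    with mean mu > 0 and dispersion theta > 0 (assumed parameterization)."""
    return (
        math.lgamma(x + theta) - math.lgamma(theta) - math.lgamma(x + 1)
        + theta * math.log(theta / (theta + mu))
        + x * math.log(mu / (theta + mu))
    )


def zinb_log_pmf(x, mu, theta, pi):
    """Log-probability of count x under a ZINB distribution.

    With probability pi the count is a structural zero; otherwise it is
    drawn from the negative binomial. Requires 0 <= pi < 1.
    """
    nb = nb_log_pmf(x, mu, theta)
    if x == 0:
        # A zero can come from either the dropout component or the NB itself.
        return math.log(pi + (1.0 - pi) * math.exp(nb))
    return math.log(1.0 - pi) + nb


def soft_assignments(z, centers, alpha=1.0):
    """Soft K-means weights of latent point z for each cluster center.

    Weights are a softmax over negative squared Euclidean distances,
    scaled by the (hypothetical) temperature alpha.
    """
    d = [sum((a - b) ** 2 for a, b in zip(z, c)) for c in centers]
    d_min = min(d)  # subtract the minimum for numerical stability
    w = [math.exp(-alpha * (dk - d_min)) for dk in d]
    total = sum(w)
    return [wk / total for wk in w]


if __name__ == "__main__":
    # Zero inflation raises the probability of observing a zero count.
    print(math.exp(nb_log_pmf(0, 2.0, 1.5)), math.exp(zinb_log_pmf(0, 2.0, 1.5, 0.3)))
    # A point near the first center receives most of the weight.
    print(soft_assignments([0.1, 0.0], [[0.0, 0.0], [3.0, 4.0]]))
```

In a model of the kind the abstract describes, the autoencoder would output the ZINB parameters per gene and cell, and the soft weights would feed a clustering loss on the latent space; how the self-training step updates them is not detailed in the abstract and is left out here.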
Keywords/Search Tags:Cell Clustering, Cell Annotation, Denoising Autoencoder, Soft K-means Clustering, Pseudo-labeled Method