
Research On Dimensionality Reduction And Clustering Method Of ScRNA-seq Data Based On Deep Learning

Posted on: 2023-12-24  Degree: Master  Type: Thesis
Country: China  Candidate: Y Q Ren  Full Text: PDF
GTID: 2530307097479124  Subject: Computer Science and Technology
Abstract/Summary:
The development of single-cell RNA sequencing (scRNA-seq) technology has led to a deeper understanding of cellular heterogeneity. As scRNA-seq data become increasingly available, they provide a strong data foundation for identifying the cell populations within biological tissues and for studying cell heterogeneity at large scale. Dimensionality reduction and clustering of scRNA-seq data make it possible both to extract the key features of the data and to determine cell types, which in turn provides an important basis for revealing the working mechanisms of cellular immune function and for elucidating the formation of tumor cells. However, because scRNA-seq data are high-dimensional, carry high technical noise, and contain many zero values, constructing an accurate and efficient dimensionality reduction and clustering model is of great significance for the development of bioinformatics. With deep learning and single-cell technology developing in parallel, the use of neural network architectures to analyze large-scale single-cell data has become a research hotspot. This thesis studies deep-learning-based dimensionality reduction and clustering of scRNA-seq data, using selected representative scRNA-seq datasets as the research target. The main research contents are as follows:

(1) To address the high dimensionality and strong sparsity of scRNA-seq data, the SCAVAE model is proposed for dimensionality reduction of scRNA-seq data. The model combines a variational autoencoder with an adversarial autoencoder. It first uses the variational autoencoder to reduce the dimensionality of the scRNA-seq data, reconstructing the input with a loss based on the zero-inflated negative binomial distribution together with an MSE loss. Then, following the idea of the adversarial autoencoder, a discriminator is attached to the latent space obtained after dimensionality reduction, yielding a low-dimensional space that represents the most essential features of the data. Finally, k-means clustering is applied to the latent representation to evaluate the dimensionality reduction performance of SCAVAE. Experiments on three scRNA-seq datasets demonstrate that this method obtains a more accurate representation of the latent features of scRNA-seq data.

(2) To further address the high noise and low clustering accuracy of scRNA-seq data, the scADAEDC model is proposed, based on the combination of a deep denoising autoencoder and an adversarial autoencoder. The model uses this combination to denoise and reduce the dimensionality of scRNA-seq data, and it proposes a clustering method in the latent space that combines the Louvain algorithm with a deep embedding clustering algorithm optimized iteratively. This method not only mitigates the effect of noise and removes redundant information from scRNA-seq data, but also improves clustering performance. Experimental results on three scRNA-seq datasets from different sequencing platforms show that this method performs well.
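The zero-inflated negative binomial (ZINB) reconstruction loss mentioned in (1) can be illustrated with a minimal sketch. The abstract does not give the parameterization used in the thesis, so the mean/dispersion form below (mean `mu`, dispersion `theta`, dropout probability `pi`) is an assumption, and the function names `nb_log_pmf` and `zinb_nll` are hypothetical. It computes the negative log-likelihood of a single count and is not the author's implementation.

```python
import math


def nb_log_pmf(x, mu, theta):
    """Log-probability of count x under a negative binomial with
    mean mu > 0 and dispersion theta > 0 (assumed parameterization)."""
    return (
        math.lgamma(x + theta) - math.lgamma(theta) - math.lgamma(x + 1)
        + theta * math.log(theta / (theta + mu))
        + x * math.log(mu / (theta + mu))
    )


def zinb_nll(x, mu, theta, pi):
    """Negative log-likelihood of count x under a ZINB model.

    pi (0 <= pi < 1) is the probability of an extra "dropout" zero:
      P(0)     = pi + (1 - pi) * NB(0)
      P(x > 0) = (1 - pi) * NB(x)
    """
    log_nb = nb_log_pmf(x, mu, theta)
    if x == 0:
        return -math.log(pi + (1.0 - pi) * math.exp(log_nb))
    # Stay in log space for positive counts to avoid underflow.
    return -(math.log(1.0 - pi) + log_nb)


# Example: a zero count is cheaper to explain when dropout is allowed.
loss_without_dropout = zinb_nll(0, mu=2.0, theta=1.5, pi=0.0)
loss_with_dropout = zinb_nll(0, mu=2.0, theta=1.5, pi=0.3)
```

In a model of the kind described, the decoder would output `mu`, `theta`, and `pi` per gene and this quantity would be summed over all entries of the expression matrix; the extra zero-inflation term is what lets the loss account for the many zero values in scRNA-seq counts.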
Keywords/Search Tags:Deep learning, single-cell RNA sequencing data, autoencoder, dimensionality reduction, clustering