
Research On Single-cell RNA-seq Data Mining Based On Deep Learning

Posted on: 2022-07-08
Degree: Master
Type: Thesis
Country: China
Candidate: C Chen
GTID: 2480306548496954
Subject: Mathematics
Abstract/Summary:
With the development of single-cell RNA sequencing (scRNA-seq) technology, a huge amount of sequencing data has emerged. Mining gene expression information from scRNA-seq data with computational methods plays an important guiding role in the construction of gene regulatory networks and in research on embryonic development and brain neurology. The analysis of scRNA-seq data can also provide an important basis for drug development and clinical medicine. With the rapid development of deep learning, mining scRNA-seq data with deep learning has become a research hotspot in bioinformatics. This thesis mines information from scRNA-seq data based on deep learning, and the main research contents are as follows:

1. We propose a data mining method for scRNA-seq data based on a bi-autoencoder, called sc-biAE. First, a classical autoencoder is employed to impute the dropout events in the scRNA-seq data; reconstructing the data with the autoencoder reduces the analysis bias caused by the high sparsity ratio. Second, a denoising autoencoder removes noise from the data and performs dimensionality reduction, yielding a low-dimensional subspace that reflects the essential data pattern. Finally, we use K-means clustering to test the dimensionality reduction performance of sc-biAE. The results on four commonly used indices show that sc-biAE has better dimensionality reduction performance than ten other currently popular methods on public scRNA-seq datasets. To further test the dimensionality reduction performance of sc-biAE, we analyze the clustering results of the model under different dimensions, different training batches and different loss rates, and tune sc-biAE to achieve its optimal dimensionality reduction performance. The results show that sc-biAE, which combines two different autoencoder networks from deep learning, can not only reduce the impact of dropout events but also further remove the redundant information in scRNA-seq data, providing accurate information for clustering analysis. sc-biAE is also expected to help with the dimensionality reduction of other biomedical data.

2. We propose a clustering method for scRNA-seq data based on a Gaussian mixture model and an autoencoder, called scGMAI. First, it uses an autoencoder network to reconstruct the original scRNA-seq data and reduce the impact of dropout events. Then, Fast Independent Component Analysis (FastICA) is used to remove redundant information and obtain the latent space representing the essential characteristics of the scRNA-seq data. Finally, we construct a clustering model based on the Gaussian mixture method to accurately cluster scRNA-seq data and identify cell types. A comparison on four clustering indices shows that the clustering performance of scGMAI is better than that of state-of-the-art clustering methods on public scRNA-seq datasets. To further test the performance of scGMAI, we draw clustering visualizations, gene expression heatmaps and signal plots of differentially expressed genes. The results show that the clustering model based on deep learning and Gaussian mixture methods can not only accurately identify cell types from scRNA-seq data, but also provide accurate information for downstream analyses of scRNA-seq data such as differential expression. More importantly, the model can be applied to large-scale scRNA-seq datasets (for example, 100,000 cells).
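
The two pipelines described above can be illustrated with a minimal sketch. It is not the thesis implementation: it assumes scikit-learn's `MLPRegressor` (trained with input equal to target) as a stand-in for the autoencoders, a synthetic Poisson count matrix with random zeros in place of real scRNA-seq data, and arbitrary layer sizes, corruption rate and latent dimension. The adjusted Rand index is used here only as one plausible clustering index; the abstract does not name the four indices. All function names are hypothetical.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import FastICA
from sklearn.metrics import adjusted_rand_score
from sklearn.mixture import GaussianMixture
from sklearn.neural_network import MLPRegressor


def make_toy_counts(n_cells=300, n_genes=50, n_types=3, dropout=0.3, seed=0):
    """Synthetic log-expression matrix; random zeros stand in for dropout events."""
    rng = np.random.default_rng(seed)
    labels = rng.integers(0, n_types, size=n_cells)
    centers = rng.gamma(shape=2.0, scale=2.0, size=(n_types, n_genes))
    counts = rng.poisson(centers[labels])
    counts[rng.random(counts.shape) < dropout] = 0
    return np.log1p(counts).astype(float), labels


def reconstruct(X, hidden=32, seed=0):
    """Shared first step: an autoencoder (input == target) smooths over dropout zeros."""
    ae = MLPRegressor(hidden_layer_sizes=(hidden,), max_iter=300, random_state=seed)
    return ae.fit(X, X).predict(X)


def denoising_embedding(X, dim=8, corrupt=0.2, seed=0):
    """Denoising autoencoder: learn to map masked input back to clean input,
    then return the hidden-layer activations as the low-dimensional subspace."""
    rng = np.random.default_rng(seed)
    X_noisy = X * (rng.random(X.shape) >= corrupt)  # masking noise
    dae = MLPRegressor(hidden_layer_sizes=(dim,), activation="relu",
                       max_iter=300, random_state=seed)
    dae.fit(X_noisy, X)
    # Manual forward pass through the first (encoder) layer only.
    return np.maximum(0.0, X @ dae.coefs_[0] + dae.intercepts_[0])


def cluster_kmeans(Z, k, seed=0):
    """First pipeline, last step: K-means on the denoising-autoencoder embedding."""
    return KMeans(n_clusters=k, n_init=10, random_state=seed).fit_predict(Z)


def cluster_ica_gmm(X_rec, k, dim=8, seed=0):
    """Second pipeline: FastICA latent space followed by a Gaussian mixture model."""
    latent = FastICA(n_components=dim, max_iter=500,
                     random_state=seed).fit_transform(X_rec)
    gmm = GaussianMixture(n_components=k, n_init=3, random_state=seed)
    return gmm.fit_predict(latent)


X, truth = make_toy_counts()
X_rec = reconstruct(X)
labels_a = cluster_kmeans(denoising_embedding(X_rec), k=3)  # bi-autoencoder style
labels_b = cluster_ica_gmm(X_rec, k=3)                      # ICA + GMM style

print("ARI, autoencoders + K-means:", adjusted_rand_score(truth, labels_a))
print("ARI, autoencoder + FastICA + GMM:", adjusted_rand_score(truth, labels_b))
```

On real data the number of clusters, the latent dimension and the corruption rate would need tuning, which is what the parameter analysis described in the first contribution addresses.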
Keywords/Search Tags:deep learning, scRNA-seq, dimension reduction, clustering, classic autoencoder, denoising autoencoder