Unsupervised Discriminant Analysis For Single Cell Transcriptomes Data And Its Application

Posted on: 2023-03-08

Degree: Master

Type: Thesis

Country: China

Candidate: Q R Peng

GTID: 2530306842970199

Subject: Engineering

Abstract/Summary:

Single-cell RNA sequencing (scRNA-seq) technologies enable the measurement of gene expression at the level of individual cells, providing a data basis for studying complex diseases and life activities at single-cell resolution. However, data biases arising from technical limitations, such as high noise and sparsity, pose major computational challenges for the development of analytical methods. To overcome these biases, this study proposes scDA, a self-representation learning model with embedded feature extraction for cell type recognition and annotation, which leverages the interdependency between extracting molecular features and learning sample relationships. The method uses dimensionality reduction and sample self-representation to unify the two tasks of feature extraction and sample relationship learning in a single mathematical model. scDA thus accurately learns a cell-cell representation matrix and a corresponding metagene discriminant matrix, which can be used for research tasks such as single-cell clustering and annotation.

To validate the effectiveness of the proposed method, we performed two types of benchmark studies, on small-scale and large-scale datasets. On the small-scale datasets, scDA achieved significantly higher clustering accuracy than the other methods compared, and we analyzed and discussed the ability of the corresponding discriminant matrix to distinguish diverse cell types. scDA was then applied to large-scale datasets for cell type annotation. We verified that scDA can accurately label a large number of cells after training the model on a small number of cells, even without the prior guidance of cell annotations provided by the data authors, indicating that scDA is well suited to large-scale datasets.

Finally, we applied scDA to scRNA-seq datasets from different platforms or sources, for example, the human pancreatic scRNA-seq dataset with an obvious batch effect and the human bone marrow scRNA-seq data from different subjects. The results showed that scDA could accurately distinguish six cell types that differ in cellular abundance across multiple pancreatic datasets. On the bone marrow scRNA-seq dataset, the discriminant matrix learned by scDA helped illustrate the differentiation structure among the four cell lineages specified in the dataset, which was further confirmed by the high expression of known marker genes, showing that the discriminant metagenes have clear biological interpretations. Therefore, in medical research applications, the scDA method can overcome batch effects between sequencing protocols or sources, providing strong support for cell type recognition and visualization studies.

In summary, this study proposes the scDA method and an scDA-centered single-cell data analysis pipeline for cell clustering and cell type annotation. We evaluated its performance on both tasks using a series of small-scale and large-scale benchmark datasets. We also applied scDA to single-cell datasets from different platforms and sources, and demonstrated that the method can overcome the influence of confounding or batch factors in real-world research and provide accurate cell type recognition, visualization, and interpretability. These results further demonstrate that scDA has strong practical value.
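The abstract does not state scDA's objective function, so the sketch below is not scDA itself. It only illustrates the two ingredients named above, feature extraction and sample self-representation, using stand-ins chosen here as assumptions: a truncated SVD for the feature extraction step, a ridge-regularized least-squares self-representation with a closed-form solution, and a basic spectral clustering step on the resulting cell-cell matrix. The two steps are run one after the other, whereas the thesis describes learning them jointly. All function names, parameters, and the synthetic data are hypothetical.

```python
import numpy as np

def self_representation(X, n_components, lam=1.0):
    """X: genes x cells matrix, already normalized (e.g. log1p of counts).

    Returns P (genes x d loadings, a stand-in for a metagene matrix)
    and Z (cells x cells self-representation coefficients).
    """
    # Step 1 (assumed stand-in): linear feature extraction by truncated SVD.
    U, _, _ = np.linalg.svd(X, full_matrices=False)
    P = U[:, :n_components]
    Y = P.T @ X  # d x cells "metagene" expression
    # Step 2: ridge self-representation, min_Z ||Y - YZ||_F^2 + lam ||Z||_F^2,
    # with closed form Z = (Y^T Y + lam I)^{-1} Y^T Y.
    G = Y.T @ Y
    Z = np.linalg.solve(G + lam * np.eye(G.shape[0]), G)
    return P, Z

def spectral_labels(Z, n_clusters, n_iter=50):
    """Cluster cells from Z with normalized spectral clustering."""
    W = (np.abs(Z) + np.abs(Z).T) / 2.0  # symmetric, non-negative affinity
    np.fill_diagonal(W, 0.0)
    deg = W.sum(axis=1)
    A = W / np.sqrt(np.outer(deg, deg))  # D^{-1/2} W D^{-1/2}
    _, vecs = np.linalg.eigh(A)          # eigenvalues in ascending order
    E = vecs[:, -n_clusters:]            # leading eigenvectors
    E = E / np.linalg.norm(E, axis=1, keepdims=True)
    # Deterministic farthest-first initialization, then Lloyd's k-means.
    centers = [E[0]]
    for _ in range(1, n_clusters):
        d2 = np.min([((E - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(E[np.argmax(d2)])
    centers = np.array(centers)
    for _ in range(n_iter):
        dist = ((E[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        for j in range(n_clusters):
            if np.any(labels == j):
                centers[j] = E[labels == j].mean(axis=0)
    return labels

# Synthetic toy data (an assumption, not from the thesis): 3 cell types,
# 20 cells each, each type with its own block of 10 marker genes.
rng = np.random.default_rng(0)
n_genes, per_type, n_types = 50, 20, 3
rates = np.full((n_genes, per_type * n_types), 0.1)
for t in range(n_types):
    rates[10 * t:10 * (t + 1), per_type * t:per_type * (t + 1)] = 30.0
X = np.log1p(rng.poisson(rates))

P, Z = self_representation(X, n_components=3, lam=1.0)
labels = spectral_labels(Z, n_clusters=3)
```

In this toy setting `Z` is close to block-diagonal, with one block per cell type, which is why a spectral step on it can recover the groups. The columns of `P` play a role loosely analogous to the discriminant metagenes described above, although a plain SVD is unsupervised in a different sense and carries no guarantee of biological interpretability.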

Keywords/Search Tags:

single cell transcriptome, subspace clustering, discriminant analysis, cell representation, discriminative genes

Related items

1	Subspace Clustering Based On Trace Group Lasso And Its Application To Single-cell RNA-Seq
2	Cell Clustering And Differentially Expressed Gene Analysis Based On Large-scale Single-cell RNA Sequencing Dat
3	Single Cell RNA-seq Clustering Method Based On Self-renewal Of Cell Relationship Matrix
4	Research On Denoising And Clustering Methods For Single Cell Transcriptome Data
5	Transcriptome Analysis Of Embryo Sac Component Cells And Cell-type-specific Gene Screening In Arabidopsis Thaliana
6	Cell Type Based Cell-Cell Communication Pattern And Application
7	Dissecting Differentiation And Maturation Processes Of Human Terminal Erythroblast In Single-Cell Resolution
8	Single-cell Clustering Method Based On Consensus Strategy Evaluation
9	Single-cell Transcriptome Analysis Of Two-cell Stage Mouse Embryos
10	Efficient Clustering Algorithm For Large-Scale Single-Cell Transcriptome Data