| In the past, the expression patterns of biomolecules were studied using the average expression value across many cells in a sample. However, even among cells in the same sample, the expression of biomolecules differs greatly. In recent years, the development of single-cell sequencing technology has made it possible to measure biological data accurately in individual cells, which provides an opportunity to study cellular activities from a single-cell perspective. Single-cell sequencing technology and the resulting data are widely used in bioinformatics to study various biological molecules and life activities. In biological development and disease occurrence, the cells that play an important role are usually present at low abundance, so an important application of single-cell sequencing data is the identification of rare cell types. Based on single-cell sequencing data of melanoma, this thesis presents a methodological framework for identifying rare cell types. Using the dropout feature of single-cell gene expression data, dropout subtypes of cells were identified. Gene and long non-coding RNA (lncRNA) expression data were then combined to identify cell types, which were analyzed to obtain rare cell types related to melanoma pathogenesis. In this thesis, we analyzed single-cell gene expression data and lncRNA expression data from patients with metastatic melanoma and identified melanoma-related rare cell types. First, considering the high dropout rate characteristic of single-cell sequencing data, dropout information was extracted from the single-cell gene expression matrix to construct a dropout feature matrix, and the dropout distance between cells was calculated. A density-based clustering algorithm was then applied, and four dropout subtypes of cells were identified. According to the behavior of genes and lncRNAs in the different dropout subtypes, a differential analysis was performed from the perspectives of dropout and expression level to identify a series of differential genes and differential lncRNAs, and a differential co-expression network between the different molecules was constructed. Next, Markov clustering was applied to the differential co-expression network to identify differential modules, and the molecular expression levels within each module were used to calculate differential module features for the cells. Density-based clustering was then used to identify cell types, and a total of eight cell types were obtained. Finally, for each cell type, cell markers whose expression levels differed significantly from those in other cells were identified, and two rare cell types associated with melanoma were identified using prior knowledge and cancer genetic analysis. Enrichment analysis was performed for the two rare cell types to investigate the pathogenesis of melanoma, and their cell marker genes and cell marker lncRNAs can serve as potential target molecules for melanoma treatment. By identifying rare cell types associated with melanoma and their cell markers, this thesis analyzes the pathogenesis of melanoma and provides new clues for its diagnosis and prognosis. In addition, the methods and ideas for identifying and analyzing rare cell types presented in this thesis can be extended to other cancer-related rare cell types and provide a reference for other single-cell sequencing data processing methods. |
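
The first stage of the framework (dropout feature matrix, dropout distance between cells, density-based clustering) can be illustrated with a minimal sketch. The abstract does not specify the distance measure or the clustering algorithm, so the choices here are assumptions: a dropout is taken to be a zero count, the dropout distance is taken to be the Hamming distance between binary dropout profiles, and the clustering is a small DBSCAN-style routine. The function names, the toy matrix, and the `eps` and `min_pts` values are hypothetical and are not taken from the thesis.

```python
from collections import deque


def dropout_matrix(expr):
    """Binarize a cells x genes matrix: 1 where the count is zero (assumed dropout), else 0."""
    return [[1 if value == 0 else 0 for value in cell] for cell in expr]


def dropout_distance(a, b):
    """Fraction of genes whose dropout status differs between two cells (Hamming distance)."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


def dbscan(points, eps, min_pts, dist):
    """Minimal DBSCAN-style clustering; returns one label per point, -1 meaning noise."""
    n = len(points)
    # Each point counts as its own neighbor, as in the usual DBSCAN convention.
    neighbors = [
        [j for j in range(n) if dist(points[i], points[j]) <= eps]
        for i in range(n)
    ]
    labels = [None] * n
    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        if len(neighbors[i]) < min_pts:
            labels[i] = -1  # provisionally noise; may become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        queue = deque(neighbors[i])
        while queue:
            j = queue.popleft()
            if labels[j] == -1:
                labels[j] = cluster  # noise reached from a core point becomes a border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            if len(neighbors[j]) >= min_pts:
                queue.extend(neighbors[j])  # only core points expand the cluster
    return labels


# Toy expression matrix (7 cells x 6 genes), invented for illustration only.
expr = [
    [5, 0, 0, 3, 0, 2],
    [4, 0, 0, 1, 0, 7],
    [2, 0, 0, 6, 0, 1],
    [0, 3, 4, 0, 2, 0],
    [0, 1, 2, 0, 5, 0],
    [0, 6, 1, 0, 3, 0],
    [1, 1, 0, 0, 0, 0],
]
drop = dropout_matrix(expr)
labels = dbscan(drop, eps=0.2, min_pts=3, dist=dropout_distance)
print(labels)  # [0, 0, 0, 1, 1, 1, -1]
```

In this toy example the first three cells share one dropout pattern, the next three share the opposite pattern, and the last cell matches neither closely enough, so it is labeled as noise. On real single-cell data the distance measure and clustering parameters would need to be chosen and validated against the data rather than fixed as above.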