| Single-cell sequencing, unlike first-, second-, or third-generation sequencing, is a technique for amplifying and sequencing at the level of a single cell. The development of this technology has brought genomics research to an unprecedented level, and it has in turn driven widespread downstream analysis of single-cell data, including cell heterogeneity analysis, cell subset classification, and differentiation and development analysis, among many other research directions. However, single-cell sequencing data are limited by factors such as sequencing depth, so the data are very sparse and noisy. In particular, dropout noise can seriously distort the calculation of cell-to-cell distances, which in turn affects the results of downstream analysis. This paper therefore proposes a dropout imputation method for single-cell RNA sequencing data based on a graph neural network. The main research contents of this paper comprise the following three points: (1) Expanding the format of cell sequencing data. Single-cell sequencing data store the gene expression profile of each cell in the form of an expression matrix. However, because the cells in the expression matrix are independent of one another, the absence of cell-to-cell interaction makes it difficult to find relationships between cells. The cell expression matrix is therefore processed and transformed to extend the single-cell sequencing data into non-Euclidean space. A cell-to-cell connection graph is constructed according to the Euclidean distance between cells, which improves the representational capacity of each cell and provides rich feature data for subsequent model training. (2) Feature aggregation based on a graph neural network. With graph neural networks, the gene expression profiles of similar cells can be aggregated, compensating for the sparsity of single-cell sequencing data. For some dropout events that occur in
lowly expressed genes, cell nodes within the range of second-order neighbors can also be aggregated. The features of similar cells can thus be fully utilized while retaining the cell’s own expression content, which reduces the influence of feature sparsity on dropout imputation. (3) Introducing a multi-head attention mechanism. Because there are many similar cells, some of them possibly duplicates, a multi-head attention mechanism is introduced for the graph convolutional layer. The attention model assigns different weights to neighbor nodes and aggregates the nodes according to those weights. Extending this attention model yields a multi-head attention mechanism. The results show that the multi-head attention mechanism is more stable and achieves higher accuracy than the single-head attention mechanism. |
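
The three points above can be illustrated with a minimal NumPy sketch. It is not the model proposed in this paper: the k-nearest-neighbor rule, the scaled dot-product attention scores, the untrained random projections, and the choice to fill only zero entries are all assumptions made for illustration, and the function names (`knn_graph`, `second_order`, `multi_head_aggregate`) are hypothetical. No training is performed, so the output shows only the data flow, not imputation quality.

```python
import numpy as np

def knn_graph(X, k):
    """Boolean adjacency: each cell linked to its k nearest cells (Euclidean)."""
    n = X.shape[0]
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T   # squared distances
    np.fill_diagonal(d2, np.inf)                     # exclude self when ranking
    nbrs = np.argsort(d2, axis=1)[:, :k]
    A = np.zeros((n, n), dtype=bool)
    A[np.repeat(np.arange(n), k), nbrs.ravel()] = True
    A = A | A.T                                      # make the graph undirected
    np.fill_diagonal(A, True)                        # self-loops keep own profile
    return A

def second_order(A):
    """Cells reachable within two hops (includes first-order neighbors)."""
    Ai = A.astype(int)
    return (Ai @ Ai) > 0

def multi_head_aggregate(X, A, n_heads=4, dim=8, seed=0):
    """Average of several attention-weighted neighbor aggregations.

    Assumption: scores are scaled dot products of randomly projected
    profiles; a trained model would learn these projections instead.
    """
    rng = np.random.default_rng(seed)
    n_genes = X.shape[1]
    out = np.zeros(X.shape, dtype=float)
    for _ in range(n_heads):
        W = rng.normal(scale=1.0 / np.sqrt(n_genes), size=(n_genes, dim))
        H = X @ W
        scores = (H @ H.T) / np.sqrt(dim)
        scores = np.where(A, scores, -np.inf)        # mask non-neighbors
        scores -= scores.max(axis=1, keepdims=True)  # numerically stable softmax
        w = np.exp(scores)
        w /= w.sum(axis=1, keepdims=True)            # weights sum to 1 per cell
        out += w @ X                                 # weighted neighbor profiles
    return out / n_heads

# Toy expression matrix (cells x genes) with simulated dropout zeros.
rng = np.random.default_rng(0)
X = rng.poisson(2.0, size=(30, 12)).astype(float)
X[rng.random(X.shape) < 0.3] = 0.0

A1 = knn_graph(X, k=5)
A2 = second_order(A1)
agg = multi_head_aggregate(X, A2)
imputed = np.where(X == 0, agg, X)   # assumption: only zero entries are filled
print(imputed.shape)
```

In this sketch the self-loops and the final `np.where` are what keep each cell's observed expression values unchanged, while the second-order adjacency widens the pool of similar cells that can contribute to a zero entry.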