Gene Expression Prediction Based On Convolutional Neural Network

Posted on: 2020-09-06 | Degree: Master | Type: Thesis
Country: China | Candidate: Z Y Yu | Full Text: PDF
GTID: 2370330575477689 | Subject: Computer application technology
Abstract/Summary:
Gene expression research in human genomics is a prominent example of the transformation between data and knowledge, and it holds an important position in the field of bioinformatics. The method commonly used in biology is to obtain gene expression profiles through experimental means such as gene chips. The CMAP project constructed a large-scale gene expression library and found functional connections between certain small molecules. The expression levels of most of the 22,000 known human genes are highly correlated. On this basis, the NIH LINCS project selected 978 genes, called landmark genes, and referred to the remaining genes as target genes. The expression profiles of the landmark genes are believed to be sufficient to predict the expression profiles of the remaining target genes. This idea addresses the problem that obtaining large-scale gene expression profiles has been expensive.

As human society has gradually entered the era of artificial intelligence, the integration of disciplines has become an essential path for technological development. Researchers therefore proposed predicting the expression profiles of the target genes computationally from the expression profiles of the landmark genes. The NIH LINCS project was the first to look for a solution. Its initial attempt used linear regression, which has the disadvantage that it cannot capture nonlinear relationships. Chen et al. then applied deep learning, in the form of a deep neural network, and achieved a lower mean absolute error than linear regression on the expression profiles of most target genes. In this paper, that model is partially improved based on the idea of the deep neural network, and the improved model is called DNN-GEX. However, the output dimension of the model is large and, because its layers are fully connected, the model has a very large number of parameters and is complex.

Building on these previous experiments, this paper looks for a more efficient and accurate algorithm for large-scale gene expression profile prediction. First, we use convolutional neural networks to reduce the number of model parameters and the model complexity. We chose this algorithm because we suspect that genes may have local connections similar to those between neighboring pixels in an image. We call the resulting model C-GEX. In our experiments, the mean absolute error of C-GEX is higher than that of DNN-GEX, but C-GEX effectively shortens the training time. Second, we use an ensemble model, the light gradient boosting machine, and call the resulting model L-GEX. Overall, L-GEX performs slightly worse than DNN-GEX and better than linear regression and other linear models, but its drawback is that training takes too long. A characteristic of this model is that it predicts the expression profiles of some genes well but performs poorly on the rest. We therefore exploit this characteristic by combining L-GEX with the convolutional neural network, which has the advantage of a shorter training time. The fusion model is called LC-GEX.

Each of these models has its own advantages and disadvantages. The experimental results in this paper can guide researchers in choosing a model according to their needs: if accuracy should be as high as possible, choose the fusion model LC-GEX; if accuracy should be as high as possible while training time is kept as short as possible, choose the deep neural network model DNN-GEX; if training time should be as short as possible, choose the single convolutional neural network model C-GEX.
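
The abstract does not say how the two models are combined into the fusion model. One plausible reading, given that L-GEX predicts some genes well and others poorly, is a per-gene selection: for each target gene, keep the prediction from whichever model has the lower mean absolute error on held-out validation data. The sketch below illustrates only that assumption. The function names (`per_gene_mae`, `fuse_by_gene`) and the synthetic data are hypothetical and are not taken from the thesis.

```python
import numpy as np


def per_gene_mae(y_true, y_pred):
    """Mean absolute error per target gene; arrays are (n_samples, n_genes)."""
    return np.mean(np.abs(y_true - y_pred), axis=0)


def fuse_by_gene(y_val, val_pred_a, val_pred_b, test_pred_a, test_pred_b):
    """Assumed fusion rule: per gene, keep the model with lower validation MAE.

    Returns the fused test predictions and a boolean mask over genes that is
    True where model A was selected.
    """
    mae_a = per_gene_mae(y_val, val_pred_a)
    mae_b = per_gene_mae(y_val, val_pred_b)
    use_a = mae_a <= mae_b
    # The (n_genes,) mask broadcasts across the sample axis.
    fused = np.where(use_a, test_pred_a, test_pred_b)
    return fused, use_a


# Synthetic demonstration: "model A" is accurate on the first three genes,
# "model B" on the last three. These stand in for two trained predictors.
rng = np.random.default_rng(0)
n_val, n_test, n_genes = 50, 20, 6
y_val = rng.normal(size=(n_val, n_genes))
y_test = rng.normal(size=(n_test, n_genes))
noise_a = np.array([0.1, 0.1, 0.1, 1.0, 1.0, 1.0])
noise_b = noise_a[::-1]

val_a = y_val + rng.normal(size=y_val.shape) * noise_a
val_b = y_val + rng.normal(size=y_val.shape) * noise_b
test_a = y_test + rng.normal(size=y_test.shape) * noise_a
test_b = y_test + rng.normal(size=y_test.shape) * noise_b

fused_test, chose_a = fuse_by_gene(y_val, val_a, val_b, test_a, test_b)
print("model A chosen per gene:", chose_a)
print("overall test MAE, A / B / fused:",
      per_gene_mae(y_test, test_a).mean(),
      per_gene_mae(y_test, test_b).mean(),
      per_gene_mae(y_test, fused_test).mean())
```

Selecting on validation data rather than test data keeps the reported test error honest. Under this rule the slower model would only need to be trained for the genes where it actually wins, although whether the thesis takes that approach is not stated.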
Keywords/Search Tags:gene expression profiling, convolutional neural network, Light Gradient Boosting Machine, fusion model