
Development Of New Algorithms For Genetic Value Prediction Of Genomes Based On Artificial Intelligence (AI) Techniques

Posted on: 2022-06-09
Degree: Master
Type: Thesis
Country: China
Candidate: L L Gu
GTID: 2480306524458384
Subject: Biology
Abstract/Summary:
Genomic genetic value prediction combines genomic marker information with phenotypic information to estimate the genomic genetic value of individuals. It can be used for early selection in plant and animal breeding as well as for risk assessment of human diseases. The method is also known as genomic selection, genomic evaluation, or genomic prediction.

The main contents of this study are as follows: (1) ELPGV, a genomic genetic value prediction algorithm based on ensemble learning, was developed; (2) ResGS, a genetic value prediction algorithm based on a deep residual neural network, was developed; (3) the above work is summarized and discussed, and future directions are outlined.

The main results are as follows.

(1) A genomic genetic value prediction algorithm named ELPGV (Ensemble Learning of Prediction for Genetic Value) is proposed. It is a meta-algorithm that uses ensemble learning to combine the predictions of several base prediction models (such as GBLUP, BayesA, and BayesCπ) into a more accurate prediction. Multiple data sets were used to verify its predictive performance. In all of them, the prediction accuracy of ELPGV was significantly higher than that of the base models it combines: in the risk assessment of six diseases in the WTCCC data set, the P values ranged from 4.853E-118 to 9.640E-20, and in the Holstein cow data set they ranged from 9.943E-80 to 0.001. Moreover, ELPGV does not use genotype data, so it saves considerable computing resources; predicting the WTCCC data set takes only a few minutes on a computer with 2 GB of memory. Extensive experiments further show that the performance of ELPGV depends on how similar the combined base models are: the lower the correlation among the base models, the higher the accuracy of the ensemble prediction. Conversely, combining highly similar base models does not significantly improve the prediction accuracy of ELPGV. ELPGV has been developed into an R package, available at https://github.com/GuLinLin-JMU/ELPGV.

(2) DeepGS, an existing algorithm for genomic genetic value prediction based on deep learning, suffers from low computational efficiency and low accuracy. We developed a new deep learning algorithm for predicting the genetic value of genomes, named ResGS. Its characteristics are as follows: (a) a deep residual neural network is used to predict the genetic value, which can capture complex relationships within the genotype and improve prediction accuracy; (b) convolution and pooling are used to reduce the complexity of high-dimensional genotype data and speed up computation; (c) a batch normalization layer is introduced to accelerate the convergence of the model. We applied ResGS to a wheat data set of 599 lines and to a wheat data set of 2000 lines from Iran. ResGS outperformed the feedforward neural network (FNN), with a relative improvement of 101.59% to 130.83%. For most phenotypes, ResGS was 2.24% to 20.19% better than GBLUP. In computation time, ResGS was second only to GBLUP, and was approximately 18 to 22 times faster than DeepGS and 24 to 26 times faster than FNN. ResGS also effectively addresses the problem of model accuracy decreasing as the number of layers increases. ResGS therefore has broader prospects in practical application.

In summary, this research uses artificial intelligence (AI) techniques to develop prediction models for genomic genetic value. The proposed ensemble learning model, ELPGV, effectively alleviates the problem that a specific prediction model must be selected according to the genetic mechanism of each phenotype. In addition, to address the slow convergence and low accuracy of the deep learning prediction model DeepGS, we proposed ResGS, which effectively resolves the slow convergence and low computational efficiency of deep learning and better meets the requirements of practical prediction.
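
The ensemble idea described for ELPGV, combining base-model predictions without touching genotype data, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the simulated data, the function names (pearson, combine, fit_weights), and the random-search optimizer are hypothetical stand-ins, not the optimizer or the API of the ELPGV R package. The sketch searches for non-negative weights summing to one that maximize the correlation between the weighted prediction and the observed phenotypes.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def combine(preds, weights):
    """Weighted sum of base-model predictions, one value per individual."""
    n = len(preds[0])
    return [sum(w * p[i] for w, p in zip(weights, preds)) for i in range(n)]


def fit_weights(preds, phenotypes, steps=2000, seed=0):
    """Random search over weight vectors (hypothetical stand-in optimizer).

    Candidates start from each single base model and the equal-weight
    average, so the result is never worse than the best base model on
    the data used for fitting.
    """
    rng = random.Random(seed)
    k = len(preds)
    candidates = [[1.0 if i == j else 0.0 for i in range(k)] for j in range(k)]
    candidates.append([1.0 / k] * k)
    for _ in range(steps):
        raw = [rng.random() for _ in range(k)]
        total = sum(raw)
        candidates.append([v / total for v in raw])
    best_w, best_r = None, -2.0
    for w in candidates:
        r = pearson(combine(preds, w), phenotypes)
        if r > best_r:
            best_w, best_r = w, r
    return best_w, best_r


# Simulated example (assumed data, not from the thesis):
# true genetic values g, phenotypes y = g + environmental noise,
# and three base models whose predictions are g plus independent errors.
rng = random.Random(42)
n = 200
g = [rng.gauss(0, 1) for _ in range(n)]
y = [gi + rng.gauss(0, 1) for gi in g]
base = [[gi + rng.gauss(0, 0.8) for gi in g] for _ in range(3)]

weights, r_ens = fit_weights(base, y)
r_base = [pearson(p, y) for p in base]
print("base-model correlations:", [round(r, 3) for r in r_base])
print("ensemble weights:", [round(w, 3) for w in weights])
print("ensemble correlation:", round(r_ens, 3))
```

Note that the inputs are only the base predictions and the phenotypes, which mirrors the abstract's point that no genotype data is needed at the ensemble stage. The correlations here are computed on the same individuals used to fit the weights; a real evaluation would fit the weights on a reference set and report accuracy on held-out individuals.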
Keywords/Search Tags: Neural network, Deep learning, Ensemble learning, Algorithm design, Genomic genetic value