Research Report For The Problems Of Bioinformatics

Posted on: 2013-12-14; Degree: Master; Type: Thesis
Country: China; Candidate: S X Zhang; Full Text: PDF
GTID: 2230330371983495; Subject: Applied Mathematics
Abstract/Summary:
Bioinformatics is one of the most dynamic frontier subjects in the life sciences. Its objects of study are data on nucleic acids, proteins and other biological macromolecules; its main methods come from mathematics, information science and computer science; and its main tools are computer hardware, software and networks. Vast amounts of raw biological data are stored, managed, annotated and processed so that they become biological information with clear biological significance. By querying, searching, comparing and analysing this information, knowledge is obtained about the genetic code, gene regulation, the structure and function of nucleic acids and proteins, and the relations among them. On the basis of this large body of information and knowledge, a "biological periodic table" can be established and major issues in the life sciences explored, such as the origin of life, biological evolution, and the genesis, development, pathological change and decline of cells, organs and individuals, so as to work out the underlying laws and their relations in time and space.
With bioinformatics as its background, this paper reviews the research background, current status and research contents of the field, together with its major research methods: mathematical statistics, dynamic programming, machine learning and pattern recognition. Taking computational intelligence methods as its main thread, the paper discusses the main principles of bioinformatics. It concentrates on two intelligent algorithms, the artificial neural network and the genetic algorithm, and analyses the application of artificial neural networks to protein secondary structure prediction and of genetic algorithms to protein folding on a two-dimensional lattice.
To predict protein secondary structure with an artificial neural network, amino acid sequences of known structure, taken from a protein database such as PDB, are first used as the training set and encoded as input samples. The encoded samples are then fed to the network, which is trained with a chosen learning algorithm that adjusts the weight and threshold parameters until the learning objective function is minimized and the network is stable. The trained weights and thresholds are then used to predict the secondary structure of proteins whose structure is unknown. When sequences of known structure are used as a test set, the computed output can be compared with the observed structure to measure the prediction precision of the network. Prediction accuracy is the criterion by which a prediction is judged. At present the main way of calculating it is the Q3 method, which gives the percentage of residues whose three-state assignment (helix, strand or coil) is predicted correctly.
The genetic algorithm is applied to the protein folding problem. In operation, the population size, maximum number of generations, crossover probability, mutation probability and other parameters are determined first, and an initial population is then established. Starting from the initial population, the selection, crossover and mutation operators are applied in a cycle, and the algorithm finally outputs the minimum energy found and the random sequence that attains it. From this sequence, the predicted fold of the protein on the two-dimensional lattice is obtained. One improvement is a hybrid genetic algorithm that combines the genetic algorithm with simulated annealing. By adding a simulated annealing factor, the hybrid algorithm retains some poorer individuals, which improves the conformational diversity and fitness of the population and prevents the algorithm from becoming trapped in a local optimum.
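Returning to the Q3 measure used above to score secondary structure predictions: in its standard form it is the number of correctly predicted residues divided by the total number of residues, expressed as a percentage. The following is a minimal sketch of that calculation, not the thesis's own code; the function name `q3_accuracy`, the single-letter state labels H, E and C, and the example strings are assumptions made for illustration.

```python
def q3_accuracy(predicted: str, observed: str) -> float:
    """Percentage of residues whose three-state label is predicted correctly.

    Both arguments are equal-length strings over the states H (helix),
    E (strand) and C (coil). This labelling is an assumption of the sketch.
    """
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    if not observed:
        raise ValueError("sequences must not be empty")
    # Count positions where the predicted state equals the observed state.
    correct = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * correct / len(observed)


# Made-up example data: 16 residues, 2 of them mispredicted.
observed = "CCHHHHHHCCEEEECC"
predicted = "CCHHHHHCCCEEECCC"
print(f"Q3 = {q3_accuracy(predicted, observed):.1f}%")  # Q3 = 87.5%
```

Because Q3 pools all three states, a predictor that does well on the most frequent state can score highly while doing poorly on the others, so per-state accuracies are often reported alongside it.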
Besides the hybrid algorithm, an optimal preservation strategy can also be used. After crossover and mutation, the individuals with the highest and the lowest fitness in the current population are identified. If the best individual in the current population has higher fitness than the best individual found so far, it becomes the new best-so-far individual. Finally, the best-so-far individual replaces the worst individual in the current population. Further research on protein folding in recent years has produced hybrid algorithms that combine intelligent methods such as the genetic algorithm, simulated annealing and the Monte Carlo method, each with its own advantages. How to develop hybrid algorithms for specific problems in bioinformatics will therefore be one of the major questions for future research. The discussion of the two methods, artificial neural networks and genetic algorithms, helps toward a deeper understanding of intelligent algorithms in bioinformatics and their applications, and lays a foundation for later study.
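The optimal preservation strategy described above can be sketched as a single step applied after crossover and mutation in each generation. This is an illustrative sketch under stated assumptions, not the thesis's implementation: the function name `apply_elitism`, the generic `fitness` callable and the integer individuals in the example are all hypothetical. For lattice folding, fitness would typically be the negative of the conformation energy, which is likewise an assumption here.

```python
from typing import Callable, List, Optional, Sequence, Tuple, TypeVar

T = TypeVar("T")


def apply_elitism(
    population: Sequence[T],
    fitness: Callable[[T], float],
    best_so_far: Optional[T],
) -> Tuple[List[T], T]:
    """Apply one optimal-preservation step (higher fitness is better).

    Returns the updated population and the updated best-so-far individual.
    """
    if not population:
        raise ValueError("population must not be empty")
    scores = [fitness(ind) for ind in population]
    # Locate the fittest and the least fit individual of this generation.
    best_idx = max(range(len(population)), key=scores.__getitem__)
    worst_idx = min(range(len(population)), key=scores.__getitem__)
    # Promote the current best if it beats the best found so far.
    if best_so_far is None or scores[best_idx] > fitness(best_so_far):
        best_so_far = population[best_idx]
    # Overwrite the worst individual with the best-so-far individual.
    new_population = list(population)
    new_population[worst_idx] = best_so_far
    return new_population, best_so_far


# Toy example: individuals are integers and fitness is the value itself.
pop, best = apply_elitism([3, 9, 1, 5], lambda x: x, best_so_far=7)
print(pop, best)  # [3, 9, 9, 5] 9
```

The step guarantees that the best fitness seen so far never decreases from one generation to the next. If individuals are mutable objects, the copy placed in the population should be duplicated so that later mutation does not alter the preserved best.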
Keywords/Search Tags: Bioinformatics, protein structure prediction, neural network algorithm, genetic algorithm