Research On Algorithms Of Polyploid Haplotype Reconstruction

Posted on: 2017-05-18
Degree: Master
Type: Thesis
Country: China
Candidate: P Y Yang
GTID: 2180330482496151
Subject: Computer software and theory
Abstract/Summary:
Haplotypes are used not only to study phenotypic characteristics but also to investigate how individuals in a population differ in disease susceptibility and in response to environmental factors, so determining them has considerable practical value. Because determining individual haplotypes purely through biological experiments is very costly, haplotypes are now mainly rebuilt computationally from DNA sequencing fragments; this is the individual haplotype reconstruction problem. Most studies have focused on haplotype reconstruction for diploid organisms. Recently, the polyploid haplotype reconstruction problem has been drawing more attention. It is more complex than the diploid case and requires new methods.
This paper first introduces the background and significance of the polyploid haplotype reconstruction problem, along with the current state of research on it. Then, building on existing genetic algorithms for the problem, it improves the existing procedures and proposes new operators to make the algorithm more efficient and more accurate. The properties of binary numbers are used to create a new coding method that extends the existing algorithms to the k-chromosome case. The paper also identifies several limitations of the existing algorithms and some new characteristics of polyploid data: repeated coding, fragments being assembled onto a single chromosome when fragments are clustered by code, overflow when codes are generated from fragments, and local convergence caused by code fragments being staggered in different orders. These limitations and characteristics can be used to improve the algorithms.
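The abstract does not spell out the binary coding scheme, so the following is only a minimal sketch of one plausible reading: each fragment's chromosome assignment is stored as a fixed-width binary code, and a code "overflows" when its value is not a valid chromosome index (which can happen whenever k is not a power of two). The function names `bits_per_fragment`, `decode`, and `overflowed` are hypothetical and are not taken from the thesis.

```python
def bits_per_fragment(k):
    """Bits needed to label one of k chromosomes (assumed fixed-width code)."""
    return max(1, (k - 1).bit_length())


def decode(genome, k):
    """Split a flat list of 0/1 bits into one chromosome label per fragment."""
    width = bits_per_fragment(k)
    labels = []
    for start in range(0, len(genome), width):
        value = 0
        for bit in genome[start:start + width]:
            value = (value << 1) | bit  # most significant bit first
        labels.append(value)
    return labels


def overflowed(labels, k):
    """Indices of fragments whose code is not a valid chromosome index."""
    return [i for i, value in enumerate(labels) if value >= k]


# Triploid example (k = 3): two bits per fragment, so the code 11 is invalid.
genome = [0, 0, 1, 0, 1, 1, 0, 1]
labels = decode(genome, 3)        # [0, 2, 3, 1]
invalid = overflowed(labels, 3)   # [2]
```

Under this assumed encoding, overflow cannot occur for k = 2 or k = 4 but does occur for k = 3, 5, or 6, which is one way the polyploid case can differ from the diploid one.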
Drawing on the idea of randomization, this paper devises a new strategy for classifying fragments that avoids a biased distribution of fragments. For the overflow problem, it either increases the mutation probability of the overflowed codes or corrects those codes directly. For the local convergence problem, it proposes a new correcting operator that moves misplaced codes to their proper positions. Extensive simulation tests show that the improved algorithm greatly improves haplotype reconstruction precision compared with the existing algorithms.
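The two overflow remedies mentioned above could look roughly like the sketch below. It assumes that each fragment carries an integer chromosome label and that a label is overflowed when it is at least k. The helper names, the uniform resampling rule, and the rate values 0.01 and 0.5 are illustrative assumptions, not the operators or parameters of the thesis.

```python
import random


def repair_labels(labels, k, rng):
    """Direct correction: replace each overflowed label with a random valid one."""
    return [value if value < k else rng.randrange(k) for value in labels]


def mutation_rate(label, k, base=0.01, boosted=0.5):
    """Alternative: mutate overflowed labels more often than valid ones."""
    return boosted if label >= k else base


rng = random.Random(0)            # seeded only to make the sketch repeatable
labels = [0, 2, 3, 1, 7]          # with k = 3, labels 3 and 7 are overflowed
fixed = repair_labels(labels, 3, rng)
rates = [mutation_rate(value, 3) for value in labels]
```

Direct correction guarantees that every individual decodes to a valid assignment, while the boosted mutation rate leaves invalid codes in the population and relies on the genetic algorithm to remove them over later generations.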
Keywords/Search Tags: polyploid, haplotype assembly, genetic algorithm, algorithm design