
Estimation Of The Mutation Rate During Germ-line Development Of The Male Drosophila Melanogaster

Posted on: 2016-04-11  Degree: Doctor  Type: Dissertation
Country: China  Candidate: S M Ai  Full Text: PDF
GTID: 1220330482470710  Subject: Genetics
Abstract/Summary:
Estimating the mutation rate is important in genetics and evolutionary biology: mutation is the fundamental source of evolutionary change, and estimates of the mutation rate underpin in-depth study in evolution-related fields. It is also a complicated problem. Traditionally, the mutation rate is assumed to be constant and is defined as the rate of DNA sequence mutation per generation for an individual in a population, also referred to as the population mutation rate.

The mutation rate at a particular stage of individual development, e.g. germline development, has yet to be investigated extensively. Recently, our lab explored the estimation of the recessive lethal or nearly lethal mutation rate during germline development of the male Drosophila melanogaster. In that work, a maximum likelihood statistical framework based on cell coalescent theory was used to analyze mutation data collected from a large-scale mutation screening experiment, giving deep insight into the mutation rate during germline development of the male Drosophila melanogaster. A striking conclusion of that work is that the lethal or nearly lethal mutation rates vary significantly during germline development of the male Drosophila melanogaster.

Building on that work, this dissertation establishes a statistical framework and explores a numerical algorithm for improving the estimation of the recessive lethal or nearly lethal mutation rate during germline development of the male Drosophila melanogaster. The statistical framework developed for estimating the mutation rate is minimizing χ². Although it is well known that the χ² statistic is useful for testing goodness of fit, it is less well known that this statistic can also be used to estimate parameters.
In particular, for small samples, estimates obtained by minimizing χ² have been shown to be better than those from the maximum likelihood method.

One aim of this work is to improve the efficiency of mutation rate estimation. The earlier maximum likelihood estimation is time consuming because a grid search was used to find the optimum. The minimizing χ² method introduced here is significantly more efficient. After some simplification, the minimizing χ² problem becomes equivalent to a quadratic programming problem, so it can be solved easily with existing theory and algorithms. Here we use the Lemke complementary pivot algorithm to solve it. The results show that, in contrast to the previous grid search approach, our method obtains the optimal result within seconds or minutes on a PC, depending on the complexity of the data, which greatly improves the efficiency of mutation rate estimation. In addition, the grid search under the maximum likelihood framework fails as the dimensionality of the parameter increases, whereas the new method can still handle this situation.

The mutation rate estimates obtained by minimizing χ² are consistent with those obtained by maximum likelihood, because the two methods agree asymptotically when applied to large samples. Simulation results show that the Lemke algorithm-based minimizing χ² method is robust and is especially suitable for estimating rare mutation rates, e.g. that of lethal or nearly lethal mutations. Because germ cells and somatic cells develop in similar ways, the new method can be expected to apply to somatic cell data of interest.
Mutation rate estimation by minimizing χ² is also expected to be useful for analyzing data produced by next generation sequencing technologies, because such techniques can sequence individual cells, and the resulting data have a structure similar to that of the male Drosophila melanogaster data. In turn, the results of analyzing next generation sequencing data can help in designing experiments, for example by optimizing the sequence length and the sample size to improve sequencing accuracy. With the ever-increasing volume of molecular data, reliable and efficient analysis methods are desirable. Thanks to its high efficiency, the method presented here is expected to have broad application prospects in molecular biology.
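The reduction from minimizing χ² to a quadratic program solved by Lemke's complementary pivot algorithm can be illustrated with a minimal Python sketch. The abstract does not state which simplification the dissertation uses, so the sketch makes its own assumptions: it uses Neyman's modified χ², Σ (O_i − E_i)² / O_i, with expected counts E = Aθ that are linear in nonnegative rates θ. The matrix `A`, the counts `O`, and the function names are hypothetical and are not taken from the dissertation. Under these assumptions the problem is a quadratic program with θ ≥ 0, and its optimality conditions form a linear complementarity problem w = Mz + q, w ≥ 0, z ≥ 0, w·z = 0.

```python
def lemke(M, q, max_iter=100, tol=1e-12):
    """Solve the LCP  w = M z + q,  w >= 0,  z >= 0,  w.z = 0  by Lemke pivoting.

    Textbook tableau version without anti-cycling safeguards (illustration only).
    """
    n = len(q)
    if min(q) >= 0:
        return [0.0] * n  # z = 0 already satisfies the LCP
    z0 = 2 * n  # column index of the artificial variable
    # Tableau columns: w (0..n-1), z (n..2n-1), z0 (2n), right-hand side.
    T = [[float(i == j) for j in range(n)]
         + [-M[i][j] for j in range(n)]
         + [-1.0, q[i]] for i in range(n)]
    basis = list(range(n))  # start with all w variables basic

    def pivot(r, c):
        p = T[r][c]
        T[r] = [v / p for v in T[r]]
        for i in range(n):
            if i != r and T[i][c] != 0.0:
                f = T[i][c]
                T[i] = [a - f * b for a, b in zip(T[i], T[r])]

    r = min(range(n), key=lambda i: q[i])  # most negative q picks the first row
    entering = z0
    for _ in range(max_iter):
        leaving = basis[r]
        pivot(r, entering)
        basis[r] = entering
        if leaving == z0:  # artificial variable left the basis: solved
            z = [0.0] * n
            for i, b in enumerate(basis):
                if n <= b < z0:
                    z[b - n] = T[i][-1]
            return z
        # The complement of the variable that just left enters next.
        entering = leaving + n if leaving < n else leaving - n
        rows = [i for i in range(n) if T[i][entering] > tol]
        if not rows:
            raise ValueError("ray termination: no solution found")
        r = min(rows, key=lambda i: T[i][-1] / T[i][entering])  # ratio test
    raise ValueError("iteration limit reached")


def chi2_to_qp(A, O):
    """Return (M, q) with 0.5*t'Mt + q't equal to half the modified chi-square
    sum((O_i - (A t)_i)**2 / O_i), up to an additive constant."""
    m, k = len(O), len(A[0])
    M = [[sum(A[i][a] * A[i][b] / O[i] for i in range(m)) for b in range(k)]
         for a in range(k)]
    q = [-float(sum(A[i][a] for i in range(m))) for a in range(k)]
    return M, q


# Hypothetical toy data: 4 observed classes, 2 nonnegative rate parameters.
A = [[1.0, 0.0], [1.0, 1.0], [0.0, 1.0], [0.0, 1.0]]
O = [12.0, 15.0, 2.0, 4.0]
M, q = chi2_to_qp(A, O)
theta = lemke(M, q)
print("estimated rates:", theta)
```

The pivoting terminates after a number of steps that depends on the number of parameters, whereas a grid search evaluates every point of a grid whose size grows exponentially with that number. This is one way to read the efficiency claim above, under the stated assumptions.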
Keywords/Search Tags: male Drosophila melanogaster, Germline development, Estimation of mutation rate, Coalescent theory, Minimizing χ², Lemke algorithm