
Studies On Statistical Methods For Mapping Epistatic QTL Using Genetic Mating Designs

Posted on: 2011-05-27
Degree: Doctor
Type: Dissertation
Country: China
Candidate: X H He
Full Text: PDF
GTID: 1223330368985491
Subject: Genetics
Abstract/Summary:
Most economically important traits of plants and animals are quantitative traits. Epistasis, the interaction between genes, is an important component of the genetic architecture of quantitative traits and can lead to heterosis, which is very important in hybrid breeding. In addition, it is a driving force in evolution and plays a central role in founder-effect models of speciation. The detection of epistasis is therefore important for both genetic research and breeding practice. Classical generation mean and variance component analyses, built on the polygene hypothesis, use only phenotypic data of a quantitative trait to infer the collective effects of all QTL, and reveal nothing about the number, locations, effects and types of the individual interacting QTL. With the aid of molecular markers throughout the genome, epistatic QTL can be detected and the detailed genetic architecture of a quantitative trait can be inferred. Although statistical methods for detecting epistatic QTL are well established, most were developed for segregating populations derived from an initial cross between two inbred lines, e.g. BC, F2, F2:3, DH and RIL. These simple crosses are rarely used alone in breeding, so results from such studies cannot be applied directly to breeding practice. By contrast, many genetic mating designs proposed in classical quantitative genetics are closely tied to breeding activities. This motivated us to develop statistical methods for mapping epistatic QTL using mating designs. Three designs were explored in this dissertation: the triple test cross (TTC), the four-way cross (4WC) and random hybridization of F2 plants (RHF2).

Some statistical methods have already been suggested for these three designs, but two major problems remained unsolved. Firstly, epistasis was not considered or not completely dissected.
In the TTC design, only aa and dd digenic epistatic effects were estimated, while ad and da effects were not detected. In the 4WC design, all methods were based on a single-QTL model, and epistasis was not involved. The same was true of the RHF2 design advocated by Wen and Wu (2006) for mapping endosperm trait QTL (ETL). Secondly, some genetic effects were estimated with bias. In the TTC design, QTL main effects were confounded with QTL×genetic background epistasis (ai with [dai] and di with [aai] under the F2 metric model). The two dominant and epistatic effects of ETL estimated from populations produced by selfing F2:3 or BC were biased. Our studies aimed to settle these problems, and the results are presented as follows.

1. Mapping epistatic QTL using triple test cross design with F2 population

A random sample from the F2 population was backcrossed to the same three testers, the two parental lines (P1 and P2) and their F1, to produce three groups of families ($L_{1i}$, $L_{2i}$ and $L_{3i}$). Using a two-step procedure, all kinds of main and epistatic effects can be estimated without bias from the association between the data $Z_t$ ($Z_{1i} = L_{1i} + L_{2i}$, $Z_{2i} = L_{1i} - L_{2i}$, $Z_{3i} = L_{1i} + L_{2i} - 2L_{3i}$) and the marker genotypes of the F2 plants. In the first step, the $Z_1$ and $Z_2$ data were analysed with the empirical Bayes approach under a full genetic model that considered all putative QTL on the whole genome simultaneously. These analyses estimated the augmented additive effects [$a_k^* = a_k + \frac{1}{2}\sum_{l=1, l \neq k}^{q} (i_{a_k d_l} - i_{d_k a_l})$ or $a_k^* = a_k - \frac{1}{2}\sum_{l=1, l \neq k}^{q} i_{d_k a_l}$], the augmented dominant effects [$d_k^* = d_k - \frac{1}{2}\sum_{l=1, l \neq k}^{q} (i_{a_k a_l} - i_{d_k d_l})$ or $d_k^* = d_k - \frac{1}{2}\sum_{l=1, l \neq k}^{q} i_{a_k a_l}$] and the compounded epistatic effects ($i_{kl} = i_{a_k a_l} + i_{d_k d_l}$ or $i_{kl} = i_{a_k d_l} + i_{d_k a_l}$). The three pure epistatic effects ($i_{a_k d_l}$, $i_{d_k a_l}$ and $i_{d_k d_l}$) were obtained from the analysis of the $Z_3$ data with two-dimensional genome scans using the maximum likelihood method.
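The three Z contrasts defined above are simple linear combinations of the family means, and a small sketch may make the data preparation concrete. The function name `ttc_contrasts` and the example family means are hypothetical and not taken from the dissertation; the sketch covers only the construction of the Z data, not the empirical Bayes or maximum likelihood analyses.

```python
import numpy as np

def ttc_contrasts(L1, L2, L3):
    """Build the Z1, Z2 and Z3 data from TTC family means.

    L1, L2, L3 hold the mean phenotype of the families obtained by crossing
    each F2 plant i to P1, P2 and F1 respectively (one value per F2 plant).
    """
    L1, L2, L3 = (np.asarray(x, dtype=float) for x in (L1, L2, L3))
    if not (L1.shape == L2.shape == L3.shape):
        raise ValueError("L1, L2 and L3 must have one value per F2 plant")
    Z1 = L1 + L2             # Z1i = L1i + L2i
    Z2 = L1 - L2             # Z2i = L1i - L2i
    Z3 = L1 + L2 - 2.0 * L3  # Z3i = L1i + L2i - 2*L3i
    return Z1, Z2, Z3

# Hypothetical family means for four F2 plants (illustration only).
L1 = [10.0, 12.0, 11.0, 9.5]
L2 = [8.0, 9.0, 11.0, 10.5]
L3 = [9.0, 10.0, 11.5, 10.0]
Z1, Z2, Z3 = ttc_contrasts(L1, L2, L3)
```

Each of the three arrays would then be regressed on the marker genotypes of the corresponding F2 plants in its own analysis.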
In the second step, the three pure epistatic effects ($i_{a_k d_l}$, $i_{d_k a_l}$ and $i_{d_k d_l}$) estimated in the first step were used to dissect the two compounded epistatic effects ($i_{kl}$ in both forms) and so obtain all four types of pure epistatic effects ($i_{a_k a_l}$, $i_{a_k d_l}$, $i_{d_k a_l}$ and $i_{d_k d_l}$). These pure epistatic effects were then used to dissect the augmented main effects ($a_k^*$ and $d_k^*$) and obtain the pure main effects ($a_k$ and $d_k$). Two sets of Monte Carlo simulation experiments were carried out to verify the proposed method. The simulation results showed that: (1) The newly defined genetic parameters (the augmented QTL main effects and compounded epistatic effects) could be correctly identified with satisfactory statistical power, accuracy and precision, and the two-step procedure provided unbiased estimates of all QTL main and epistatic effects. (2) The F2-based TTC design was superior to the F2 and F2:3 designs in many aspects. (3) The signs of the pure epistatic effects could substantially influence the statistical power for detecting the compounded epistatic effects ($i_{kl}$ in both forms) with the $Z_1$ and $Z_2$ data, but had no effect on the detection of the pure epistatic effects ($i_{a_k d_l}$, $i_{d_k a_l}$ and $i_{d_k d_l}$) with the $Z_3$ data. (4) The estimation of pure main and epistatic effects required a large sample size and a large number of family replications. The proposed method can easily be extended to TTC designs based on other populations (such as RIL, BC and DH).

2. Mapping of epistatic QTL in 4WC

Four-way crosses involving four different inbred lines often appear in plant and animal commercial breeding programs. Direct mapping of QTL in these commercial populations is both economical and practical. However, the existing statistical methods for mapping QTL in a 4WC population were all built on the single-QTL genetic model. This simple genetic model fails to take into account QTL interactions, which play an important role in the genetic architecture of complex traits.
Therefore, we developed a statistical method to detect epistatic QTL in a 4WC population. Firstly, conditional probabilities of QTL genotypes, computed by the multi-point method, were used to sample the genotypes of all putative QTL in the entire genome. Then, the sampled genotypes were used to construct the design matrix for QTL effects. Finally, all QTL effects, including main and epistatic effects, were estimated simultaneously by the penalized maximum likelihood (PML) method. The effects of QTL heritability, sample size, missing and dominant markers, and QTL position on the suggested method were evaluated in a series of Monte Carlo simulation studies. The simulation results showed that our method could uncover both main and epistatic QTL with high power and give good estimates of QTL effects and positions. In addition, the genome-wide false-positive rate (FPR) was generally less than 0.3%. We further analyzed a real data set from a 4WC population derived from the (Simian3/Sumian12)//(Zhong4133/8891) crosses in cotton. In total, 13 main-effect QTL and 5 epistatic QTL were detected for 2.5% fiber span length. The heritability of a single QTL varied from 0.88% to 6.81%, and the total heritability of the detected main-effect and epistatic QTL was 47.74%. The four main-effect QTL originally identified by Qin et al. (2008) using MapQTL 5.0 were all detected by the new method. Furthermore, in every epistatic QTL pair, one QTL had main effects while the other did not. The results from the real data showed that the new method: (1) could detect epistatic QTL, and the detection did not depend on whether both loci had main effects; (2) was more powerful than interval mapping for the detection of main-effect QTL.

3. Mapping epistatic QTL underlying endosperm traits using all markers on the entire genome in RHF2 design

Triploid endosperm is of great economic importance because it is the main component of cereal seed and is directly related to the nutritional quality of human staple food and animal feedstuff. Mapping endosperm trait loci (ETL) can provide an efficient way to improve grain quality genetically. However, most triploid ETL mapping methods do not produce unbiased estimates of the two dominant effects of ETL. The random hybridization of F2 plants design was proposed to overcome this problem. Epistasis has an important role in the dissection of the genetic architecture of complex traits, but most methods for mapping ETL ignore it. Therefore, an attempt was made to map epistatic ETL (eETL) under the triploid genetic model of endosperm traits in the RHF2 design. The endosperm trait means of random hybrid lines, together with known marker genotype information from their corresponding parental F2 plants, were used to estimate, efficiently and without bias, the positions and all of the effects of eETL using PML. The properties of the proposed method were investigated under different levels of ETL heritability, sample size, number of seeds per plant and sampling strategy via Monte Carlo simulation experiments. The simulation results show that the proposed method provides accurate estimates of eETL parameters with a low false-positive rate and a relatively short running time. A successful case study on a simulated large genome of 1260.00 cM demonstrated that the method is suitable for real data analysis.
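The three-step 4WC procedure in part 2 (sample QTL genotypes from their conditional probabilities, build a design matrix, then estimate all main and epistatic effects jointly under a penalty) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the dissertation's implementation: the genotype coding in `CODES`, the function names and the simulated inputs are all hypothetical, and plain ridge regression is swapped in as a simple stand-in for the PML estimator, whose details the abstract does not give.

```python
import itertools
import numpy as np

# Assumed coding of the four 4WC genotypes at one locus (hypothetical):
# columns are maternal-allele, paternal-allele and dominance indicators.
CODES = np.array([[ 1.0,  1.0,  1.0],
                  [ 1.0, -1.0, -1.0],
                  [-1.0,  1.0, -1.0],
                  [-1.0, -1.0,  1.0]])

def sample_genotypes(probs, rng):
    """Draw one genotype index (0..3) per individual and locus.

    probs has shape (n, m, 4); each length-4 vector holds the conditional
    genotype probabilities at one putative QTL and sums to 1.
    """
    cdf = np.cumsum(probs, axis=-1)[..., :-1]  # drop last column to avoid index 4
    u = rng.random(probs.shape[:2] + (1,))
    return (u > cdf).sum(axis=-1)

def design_matrix(geno):
    """Main-effect columns (3 per locus), then 9 product columns per locus pair."""
    n, m = geno.shape
    main = CODES[geno]                                     # shape (n, m, 3)
    cols = [main.reshape(n, m * 3)]
    for k, l in itertools.combinations(range(m), 2):
        inter = main[:, k, :, None] * main[:, l, None, :]  # shape (n, 3, 3)
        cols.append(inter.reshape(n, 9))
    return np.hstack(cols)

def ridge_fit(X, y, lam=1.0):
    """Ridge estimate of all effects; a stand-in for PML, not PML itself."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    return np.linalg.solve(Xc.T @ Xc + lam * np.eye(X.shape[1]), Xc.T @ yc)

# Simulated inputs (illustration only): 200 individuals, 3 putative QTL.
rng = np.random.default_rng(0)
n, m = 200, 3
probs = rng.dirichlet(np.ones(4), size=(n, m))   # stands in for multi-point output
geno = sample_genotypes(probs, rng)
X = design_matrix(geno)                          # 3*3 main + 3*9 epistatic columns
y = 2.0 * X[:, 0] + rng.normal(size=n)           # one simulated main effect
beta = ridge_fit(X, y, lam=1.0)
```

A single shared penalty `lam` shrinks every effect equally, which is the main simplification here; it is used only to show how a model with more candidate effects than a single-QTL scan could handle can still be fitted in one pass.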
Keywords/Search Tags: triple test cross, four-way cross, epistasis, quantitative trait loci, endosperm trait, empirical Bayes method, penalized maximum likelihood method