
Studies On New Methods For Constructing Core Collection Of Plant Germplasm Resources

Posted on: 2007-12-30
Degree: Doctor
Type: Dissertation
Country: China
Candidate: J C Wang
GTID: 1103360182992456
Subject: Crop Science
Abstract/Summary:
Core collection is a new field in plant germplasm research. The main aim of core collection research is to find effective methods for conserving maximum genetic diversity with a minimum number of accessions. However, no method for constructing core collections has been widely accepted so far. A cotton germplasm population of 168 genotypes and a rice germplasm population of 90 genotypes were used as worked examples. Genotypic values of the accessions in the germplasm populations were predicted without bias using a mixed linear model approach. A series of methods, including Monte Carlo simulation, were applied to investigate systematically the methods of core collection construction. The main results are as follows.
1. A method for constructing core subsets by least distance stepwise sampling (LDSS) was proposed. The method is based on cluster analysis, and during construction of a core subset the sampling is performed in the subgroup with the least distance in the dendrogram. The results showed that, at the same sampling percentage, the LDSS method required far more rounds of clustering than the method of stepwise clustering based on random sampling (SCR). Cluster analysis showed that core subsets constructed by the LDSS method preserved the genetic variation and structure of the initial population well. Genetic diversity analysis showed that core subsets constructed by the LDSS method represented the initial population well, and better than those constructed by the SCR method. With the LDSS method, all cluster methods constructed exactly the same core subsets as long as the genetic distance and sampling percentage were fixed. Therefore, the choice of cluster method does not need to be considered when using the LDSS method to construct a core subset.
2. Effective evaluating parameters for core collection were selected.
Monte Carlo simulation combined with the mixed linear model was used to study evaluating parameters for core collection based on genotypic values and molecular marker information, which eliminated environmental interference and gave more reliable results. The results showed that the coincidence rate of range (CR) was the optimal evaluating parameter. The mean Simpson index (MD), mean Shannon-Weaver index of genetic diversity (MI) and mean polymorphism information content (MPIC) were important evaluating parameters. The variable rate of coefficient of variation (VR) could serve as an important reference parameter for evaluating the degree of variation of a core collection. The percentage of polymorphic loci (p) could serve as a determining parameter for the size of a core collection. The mean difference percentage (MD) was a determining parameter for judging the reliability of a core collection. The effective evaluating parameters selected in the present research can be applied to different plant germplasm populations as criteria for choosing the sampling percentage.
3. The suitable genetic distances for constructing core collection were selected. Six commonly used genetic distances (Euclidean distance, standardized Euclidean distance, Mahalanobis distance, city block distance, cosine distance and correlation distance) and four commonly used hierarchical cluster methods (single linkage method, complete linkage method, unweighted pair-group average method and Ward's method), combined with Monte Carlo simulation, were used to evaluate the characteristics and validity of different genetic distances in the construction of core subsets. Analysis of variance of the evaluating parameters showed that cosine distance and correlation distance were less valid than Euclidean distance, standardized Euclidean distance, Mahalanobis distance and city block distance.
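As a rough illustration of the distance measures named in result 3, the sketch below writes five of the six in plain Python (Mahalanobis distance is left out because it needs the inverse covariance matrix). The function names and the two-trait example values are hypothetical and are not taken from the dissertation; the formulas are the standard textbook definitions.

```python
import math
from statistics import mean, stdev


def euclidean(x, y):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def city_block(x, y):
    return sum(abs(a - b) for a, b in zip(x, y))


def standardized_euclidean(x, y, sds):
    # Each trait difference is divided by that trait's standard deviation,
    # so traits measured on large scales do not dominate the distance.
    return math.sqrt(sum(((a - b) / s) ** 2 for a, b, s in zip(x, y, sds)))


def cosine_distance(x, y):
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return 1.0 - dot / (norm_x * norm_y)


def correlation_distance(x, y):
    # Cosine distance after centring each accession's trait vector.
    mx, my = mean(x), mean(y)
    return cosine_distance([a - mx for a in x], [b - my for b in y])


# Hypothetical genotypic values: (plant height in cm, boll weight in g).
population = [(90.0, 5.0), (110.0, 5.5), (100.0, 6.5), (120.0, 4.0)]
# Per-trait standard deviations across the whole population.
sds = [stdev(trait) for trait in zip(*population)]

a, b = population[0], population[1]
print(euclidean(a, b), standardized_euclidean(a, b, sds), city_block(a, b))
```

Cosine and correlation distance are zero for two trait vectors that differ only by a positive scale factor, so they ignore differences in magnitude. Whether that explains their weaker performance in this study is not stated in the abstract.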
Standardized Euclidean distance was slightly more effective than Euclidean distance, Mahalanobis distance and city block distance. The Monte Carlo simulation showed that standardized Euclidean distance was more suitable than Euclidean distance, Mahalanobis distance and city block distance. In addition, principal component analysis and cluster analysis confirmed the validity of standardized Euclidean distance in constructing practical core collections. The present research also found that the covariance matrix of accessions might be ill-conditioned when Mahalanobis distance was used to calculate genetic distance, which biased the calculated genetic distance. This suggested that Mahalanobis distance might not be suitable for constructing small core collections.
4. The relationship between the number of traits and core collection construction was investigated using a cotton germplasm population with 20 quantitative traits. The results showed that, when the sampling percentage was small, the number of traits should be increased to improve representativeness of the initial population, as this gave more precise genetic distances among germplasm accessions. When the sampling percentage was large, the number of traits could be reduced appropriately. The present research also proposed a method for determining a reasonable sampling percentage and the corresponding number of traits.
5. A method for combining genotypic values and molecular marker information to construct core collections was proposed. Core collections may be more representative when genotypic values and molecular marker information are combined, because the combination draws on the advantages of both types of data in building a core collection. However, genotypic values are continuous data, whereas molecular marker information is discrete data, so it is difficult to combine the two types of data for constructing a core collection by clustering.
A rice germplasm group with 8 quantitative traits and information on 60 molecular markers was used to study methods for combining genotypic values and molecular marker information at the level of the core subset. Different construction methods and evaluating parameters were used to assess the combining method. The results showed that, for the SCR method, core collections constructed from mixed data of genotypic values and molecular marker information were more representative than those constructed from genotypic values alone. For the LDSS method, the validity of the genotypic value distance (Dg) and the molecular marker information distance (Dm), both proposed in the present research, was equal to or higher than that of the common genetic distances. Core subsets constructed by Dg were relatively poorly representative according to the evaluating parameters for molecular marker information, and those constructed by Dm were relatively poorly representative according to the evaluating parameters for genotypic values. Core subsets constructed by the mixed genetic distance (Dmix), which consists of Dg and Dm, were significantly more representative than those constructed by Dm according to the evaluating parameters for genotypic values, and did not differ significantly in representativeness from those constructed by Dm according to the evaluating parameters for molecular marker information. Furthermore, although core subsets constructed by Dmix were less representative than those constructed by Dg according to the evaluating parameters for genotypic values, they were significantly more representative than those constructed by Dg according to the evaluating parameters for molecular marker information. Therefore, core subsets constructed from mixed data were more reasonable than those constructed from genotypic values or molecular marker information alone.
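The abstract does not give the formulas for Dg, Dm or Dmix. As a hedged illustration only, the sketch below shows one common way to put continuous and discrete data on a shared 0 to 1 scale, in the style of Gower's coefficient: a range-scaled mean absolute difference for trait values, a simple mismatch proportion for marker scores, and a weighted average of the two. All function names, the weight `w` and the example values are assumptions and should not be read as the dissertation's definitions.

```python
def genotypic_distance(x, y, ranges):
    """Mean absolute trait difference, each trait scaled by its population range."""
    return sum(abs(a - b) / r for a, b, r in zip(x, y, ranges)) / len(x)


def marker_distance(m1, m2):
    """Proportion of marker loci at which two accessions have different scores."""
    return sum(a != b for a, b in zip(m1, m2)) / len(m1)


def mixed_distance(x, y, ranges, m1, m2, w=0.5):
    """Weighted average of the two distances; w is an assumed tuning weight."""
    return w * genotypic_distance(x, y, ranges) + (1.0 - w) * marker_distance(m1, m2)


# Hypothetical accessions: two trait values and four 0/1 marker scores each.
traits_a, traits_b = (10.0, 1.0), (20.0, 3.0)
trait_ranges = (20.0, 4.0)  # assumed max - min of each trait in the population
markers_a, markers_b = (1, 0, 1, 1), (1, 1, 1, 1)

d_g = genotypic_distance(traits_a, traits_b, trait_ranges)
d_m = marker_distance(markers_a, markers_b)
d_mix = mixed_distance(traits_a, traits_b, trait_ranges, markers_a, markers_b)
print(d_g, d_m, d_mix)
```

Because both components lie between 0 and 1 in this sketch, neither data type dominates the combined distance purely through its units.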
Moreover, this method is not limited to genotypic values and molecular marker information; it can combine any other continuous and discrete data for clustering to construct plant core collections, without regard to scale differences between traits.
6. All methods in this research were programmed by the researcher, including prediction of genotypic values of germplasm accessions, the SCR and LDSS construction methods, calculation of all evaluating parameters of core collection, Monte Carlo simulation for constructing germplasm populations and bootstrap, determination of a reasonable sampling percentage and the corresponding number of traits, and combination of genotypic values and molecular marker information. This software can be used for core collection research in different plants.
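The least distance stepwise sampling idea in result 1 can be sketched as follows. This is a simplified reading of the abstract, not the author's program: it uses Euclidean distance, treats the least-distance subgroup as the closest pair of remaining accessions, and always drops the second member of that pair, whereas the abstract does not say which accession is kept. Under this reading the closest pair is the first merge under any of the four linkage rules, which would be consistent with the finding that the cluster method does not matter for LDSS. The coincidence rate of range is computed with a commonly used definition (mean ratio of core range to population range, in percent), which is also an assumption here.

```python
import math


def euclidean(x, y):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def ldss_core(accessions, size):
    """Drop one member of the closest pair until `size` (>= 1) accessions remain.

    accessions: dict mapping accession name -> tuple of trait values.
    """
    kept = dict(accessions)
    while len(kept) > size:
        names = sorted(kept)
        # Closest pair among the remaining accessions (ties broken by name).
        _, _, drop = min(
            (euclidean(kept[a], kept[b]), a, b)
            for i, a in enumerate(names)
            for b in names[i + 1:]
        )
        del kept[drop]  # assumption: keep the first member, drop the second
    return sorted(kept)


def coincidence_rate_of_range(core, population):
    """Mean over traits of range(core) / range(population), in percent."""
    ratios = []
    for core_trait, pop_trait in zip(zip(*core), zip(*population)):
        ratios.append((max(core_trait) - min(core_trait)) /
                      (max(pop_trait) - min(pop_trait)))
    return 100.0 * sum(ratios) / len(ratios)


# Hypothetical population: six accessions, two traits, three tight pairs.
population = {
    "A": (0.0, 0.0), "B": (0.1, 0.0),
    "C": (5.0, 5.0), "D": (5.0, 5.2),
    "E": (10.0, 0.0), "F": (10.0, 0.3),
}

core_names = ldss_core(population, size=3)
cr = coincidence_rate_of_range(
    [population[name] for name in core_names], population.values()
)
print(core_names, round(cr, 1))
```

In this toy population each tight pair loses one member, so the three retained accessions still span almost the full range of both traits.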
Keywords/Search Tags: Mixed linear model, Stepwise cluster based on random sampling method, Least distance stepwise sampling method, Monte Carlo simulation, Genetic distance