Genetic diversity in US soybean is limited because only a few early plant introductions formed the original breeding pool. This study examined restriction fragment length polymorphism (RFLP) markers among samples of ancestral plant introductions, more recent plant introductions, and cultivars and elite lines from the northern US. The markers uniquely identified all lines examined. Cluster analysis grouped ancestors according to area of origin, while the other lines formed groups in agreement with their pedigrees. Genetic distances among lines determined from RFLP, random amplified polymorphic DNA (RAPD), and coefficient of parentage data were compared. Correlations between genetic distance and the genetic variance of several agronomic traits were examined in two population sets over two years. Distance measures were generally positively correlated with genetic variances, although the correlation with yield variance was negative in one population set in one year. A multiple regression model using mid-parent yield and marker genetic distance predicted the highest-yielding progeny. The relationship to mid-parent yield was always positive, but the highest-yielding progeny were negatively associated with genetic distance for one population set. The data herein suggest that using RFLP distance estimates for parent selection can increase the probability of producing transgressive segregates for yield.