| Cotton is a natural fiber crop known as “white gold”; it has long been a crop of commerce and is also an important oilseed crop worldwide. Cotton (Gossypium spp.) is widely distributed across tropical and subtropical regions of the globe. This study evaluated the genetic diversity and population structure of a cotton germplasm collection comprising 132 accessions, including the diploid cotton Gossypium klotzschianum Andersson and the allotetraploid cottons Gossypium barbadense L., Gossypium darwinii Watt, Gossypium tomentosum Nuttall ex Seemann, Gossypium ekmanianum Wittmack, and Gossypium stephensii S.G. Stephens, from the Galapagos Islands, the Hawaiian Islands, the Dominican Republic, and Wake Atoll. A total of 111 expressed sequence tag (EST) and genomic simple sequence repeat (gSSR) markers produced 382 polymorphic loci, with an average of 3.44 polymorphic alleles per SSR marker. Polymorphism information content values ranged from 0.08 to 0.82, with an average of 0.56. Analysis of a genetic distance matrix revealed values of 0.003 to 0.53, with an average of 0.33, in the wild cotton collection. Phylogenetic analysis supported the subgroups identified by STRUCTURE and corresponded well with the results of principal coordinate analysis, which had a cumulative variation of 45.65%. A total of 123 unique alleles were observed among all accessions, of which 31 were identified only in G. ekmanianum. Analysis of molecular variance revealed highly significant variation among the six groups identified by STRUCTURE analysis, which accounted for 49% of the total variation; the remaining 51% was due to diversity within the groups. The highest genetic differentiation among tetraploid populations was observed between accessions from the Hawaiian and Santa Cruz regions, with a pairwise FST of 0.752 (p < 0.001). An uncharacterized DUF819-containing gene named yjcL, linked to genomic markers, was found to be highly related to tryptophan-aspartic acid (W-D) repeats in a superfamily of genes. The RNA sequence
expression data showed that the yjcL-linked gene GhA09G2500 was upregulated under drought and salt stress conditions. The genetic diversity, gene characterization, and variation found in this novel germplasm collection will be a landmark addition to the genetic study of cotton germplasm. The study was further extended to characterize simple sequence repeat markers in cotton using cotton expressed sequence tags. A total of 111 polymorphic EST-SSR markers with trinucleotide motifs were used to evaluate 79 accessions of Gossypium L. (59 of G. darwinii and 20 of G. barbadense) collected from the Galapagos Islands. The allele number ranged from one to seven, with an average of 2.85 alleles per locus, while polymorphism information content values varied from 0.008 to 0.995, with an average of 0.520. The discrimination power was high for the majority of the SSRs, with an average value of 0.98. Among the 111 pairs of EST-SSRs and gSSRs, a total of 49 markers, comprising nine DPLs, one each of MonCGR, MUCS0064, and NAU1028, and 37 SWUs (D-genome), were found to be the best-matched hits, similar to the 155 genes identified by BLASTx in the reference genomes of G. barbadense, G. arboreum L., and G. raimondii Ulbrich. The related genes GOBARDD21902, GOBARDD15579, GOBARDD27526, and GOBARAA04676 showed highly significant expression at 10, 15, 18, 21, and 28 days post-anthesis during fiber development. The identified EST-SSR and gSSR markers can be used effectively for mapping functional genes in segregating cotton populations, QTL identification, and marker-assisted selection in cotton breeding programs. Genetic maps not only reveal genome organization and structure but also provide the opportunity to tag superior traits for crop improvement through marker-assisted selection (MAS). Genetic maps developed to dissect hereditary information and genetic variation at the DNA level through PCR-based markers will be more effective in marker-assisted selection because these markers are abundant across genomes and
environmentally neutral. Mapping the cotton genome will establish the foundation for advances in molecular genetics. A genetic map predicts the relative positions of molecular markers and genes along a chromosome based on recombination events. An F2 population derived from an interspecific cross of G. barbadense (cultivar XH-18) × G. darwinii 5-7 was used to construct a genetic map for molecular study. Polymorphic simple sequence repeat (SSR) markers were surveyed across the genome of tetraploid cotton. The map consisted of 613 marker loci distributed across all 26 chromosomes and covered 2371.4 cM of the cotton genome, with an average inter-marker distance of 9.35 cM. The number of markers anchored on each chromosome varied from 5 to 76, with an average of 23.57 per chromosome. More markers were mapped on the D sub-genome (83.03%) than on the A sub-genome (15.66%). The maximum chromosome length was 143.387 cM and the minimum was 58.430 cM, with an average length of 91.207 cM. The D sub-genome covered a greater genetic distance (1225.613 cM), with an average distance of 2.949 cM, than the A sub-genome, which covered 1145.771 cM with an average distance of 15.755 cM. Only one homeologous chromosome pair, Chr. 13 (A13) and Chr. 18 (D13), was observed in our genetic map. There were 257 distorted loci in the map, accounting for 41.92% of all loci, and more distorted loci were distributed on the D sub-genome (34.58%) than on the A sub-genome (7.34%). In our map, 51 segregation distortion regions (SDRs) were distributed across the genome, with more on the D sub-genome than on the A sub-genome. All the skewed alleles within one SDR segregated in the same direction. Further transcriptome analysis was conducted to identify the genes involved in different stages (0, 5, 10, 15, 20, and 25 DPA) of the development of fiber, the main product of cotton. A total of 12,748 differentially expressed genes (DEGs) were identified, which were expressed in no less than half of the 12 libraries. High numbers of DEGs were found in XH-18 (about 99,220), while G. darwinii had 75,307. Interestingly, a
high similarity in the expression patterns of DEGs in G. darwinii 5-7 and XH-18 at all DPAs was identified, which is consistent with the above-mentioned PCC results. The map constructed through these studies is the first genome-wide SSR interspecific genetic map between G. darwinii and G. barbadense. This map is a landmark step toward dissecting the genome structure of G. darwinii and targeting candidate genes involved in fiber development. It will also open the door to further in-depth genome research, such as fine mapping, map-based cloning, evolutionary studies, tagging genes of interest from wild relatives, MAS, and comparative mapping, not only in cotton but also in other species. |
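
The abstract reports polymorphism information content (PIC) values per marker without stating how they were computed. As a minimal sketch, assuming the commonly used Botstein et al. (1980) definition, PIC = 1 − Σ pᵢ² − Σ_{i<j} 2 pᵢ² pⱼ², the calculation for a single marker might look like the following. The function name `pic` and the allele frequencies in the usage lines are hypothetical and are not taken from the study.

```python
from itertools import combinations


def pic(freqs):
    """Polymorphism information content for one marker.

    Assumes the Botstein et al. (1980) definition; `freqs` holds the
    allele frequencies at the marker and must sum to 1.
    """
    if abs(sum(freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    # Sum of squared allele frequencies (expected homozygosity).
    homozygosity = sum(p * p for p in freqs)
    # Correction term over all unordered pairs of distinct alleles.
    cross = sum(2 * (p * p) * (q * q) for p, q in combinations(freqs, 2))
    return 1.0 - homozygosity - cross


if __name__ == "__main__":
    # Hypothetical frequencies for a three-allele SSR marker.
    example = [0.6, 0.3, 0.1]
    print(f"PIC = {pic(example):.3f}")
```

Under this definition a monomorphic marker has a PIC of 0, and the value rises toward 1 as the number of alleles grows and their frequencies become more even, which is why markers with few, unevenly distributed alleles sit at the low end of the reported range.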