| The dbEST database contains more than one million wheat (Triticum aestivum L.) ESTs. These ESTs come from different cDNA libraries and represent genes expressed in various tissues, stress treatments, and developmental stages, so they have been used in many areas of biological research, such as gene discovery, gene expression pattern analysis, and gene evolution. In this study, we constructed a digital northern platform for wheat genes and identified wheat-specific genes relative to rice by EST data mining. To construct the digital northern platform, a background database was created by classifying wheat ESTs according to the type of cDNA library. Expression patterns were tested with both the Chi-square test and Audic and Claverie's Bayesian method, based on the EST frequencies in each type of cDNA library. The performance of the platform was validated by testing wheat genes with known expression patterns. Wheat and rice both belong to the grass family, but their characteristic phenotypes and growth conditions differ significantly. Differences between their genes are the key causes of this diversification. In this study, 3,809 candidate wheat-specific genes relative to rice were identified by comparing wheat ESTs to rice genes. Functional annotation showed that many of the specific genes are related to protein destination/storage and defence processes. Many genes of unknown function contained an NB-ARC domain, a zinc finger domain, or a cyclin-like F-box domain. These results suggest that wheat-specific genes are related to wheat development and to the ability to adapt to environmental conditions. To test the accuracy of the data, 81 candidate specific genes were selected at random, and 54 of them were validated by cloning and sequencing. For 49 genes, no hybridization signal was detected in rice genomic DNA by standard Southern blot. The expression patterns of 17 genes coincided with the EST frequencies in each type of cDNA library.
Sorghum bicolor-specific genes relative to rice were identified by the same method. We compared these candidate specific genes with the Sorghum bicolor gene annotations and found that our bioinformatics pipeline is an effective method for identifying species-specific genes without genome sequences. |
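
The Audic and Claverie test mentioned above can be illustrated with a minimal sketch. It uses the published form of the statistic, p(y|x) = (N2/N1)^y (x+y)! / [x! y! (1 + N2/N1)^(x+y+1)], where x and y are the EST counts for a gene in two library classes of total sizes N1 and N2. The function names, the two-sided doubling rule, and the example counts are illustrative assumptions, not details taken from the platform described here.

```python
import math


def audic_claverie_pmf(y, x, n1, n2):
    """Probability of observing y ESTs in a library class of size n2,
    given x ESTs for the same gene in a library class of size n1."""
    ratio = n2 / n1
    # Work in log space so the factorials do not overflow for large counts.
    log_p = (
        y * math.log(ratio)
        + math.lgamma(x + y + 1)
        - math.lgamma(x + 1)
        - math.lgamma(y + 1)
        - (x + y + 1) * math.log(1 + ratio)
    )
    return math.exp(log_p)


def audic_claverie_pvalue(x, y, n1, n2):
    """Two-sided p-value (assumption: twice the smaller tail, capped at 1)."""
    lower = sum(audic_claverie_pmf(k, x, n1, n2) for k in range(y + 1))
    upper = max(0.0, 1.0 - lower + audic_claverie_pmf(y, x, n1, n2))
    return min(1.0, 2.0 * min(lower, upper))


if __name__ == "__main__":
    # Hypothetical counts: 1 EST among 20,000 root ESTs versus
    # 30 ESTs among 20,000 seed ESTs for the same gene.
    print(audic_claverie_pvalue(1, 30, 20000, 20000))
```

A small p-value indicates that the difference in EST frequency between the two library classes is unlikely to arise from sampling alone. In practice, the result would still need correction for the number of genes and library classes tested.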