| A monogenic disease is a genetic disorder caused by a defect in a single gene, and its occurrence is tied to mutation of that gene. Because such diseases follow Mendel's laws of inheritance, they are also known as Mendelian diseases. With the completion of the Human Genome Project and the rapid development of sequencing technologies, our ability to study the genetic and genomic changes underlying human disease has improved to an unprecedented degree. In the last few years in particular, the success of next-generation sequencing in identifying causal single nucleotide variations (SNVs) for Mendelian disorders has raised hopes of understanding the pathogenesis of these diseases. However, many hurdles, such as structured data management and high false-positive and false-negative rates in mutation calling, must be overcome before this promise becomes a widespread reality. To address these problems, we designed a pipeline for gene mutation identification and developed a toolkit for mutation prioritization. To assess the validity of the software we developed and the variants we called, we performed family-based sequencing on one healthy individual and three Freeman-Sheldon syndrome patients from the same family. In this study, we first introduce the principles of next-generation DNA sequencing, the genetic causes of Mendelian disorders, and the strategies used to analyze them. We then briefly describe the gene identification pipeline, which comprises read alignment, duplicate marking, local indel realignment, base quality score recalibration, and SNV and genotype calling. 
Prioritization of candidate genes begins by filtering out variants outside the coding regions, as well as synonymous coding variants, on the assumption that these have minimal effect on the protein. The candidate set is reduced further by excluding known variants, typically those listed in dbSNP, the 1000 Genomes Project, or in-house databases; family information and prediction software are then used to identify the disease locus. The toolkit we developed is a systematic prioritization pipeline that draws on variant quality, gene candidacy based on the number of novel nonsynonymous mutations in a gene, gene functional annotation, and the predicted functional impact of coding variants. A new strategy that uses sequencing data from the individual or from family members also aids the search for causal mutations in Mendelian disorders. The toolkit can therefore effectively prioritize the small subset of functionally important variants among the tens of thousands of variants in the whole human genome. Overall, our approach to disease gene identification can help biologists overcome the limitations of available software, prioritize SNV calls systematically, and reduce the search space for further analysis and experimental verification. |
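
The filtering steps described above can be illustrated with a minimal Python sketch. It is not the toolkit's actual implementation: the `Variant` record, the `prioritize` and `rank_genes` functions, the gene names, and the sample IDs are all hypothetical. The segregation rule used here (a variant must be carried by every affected family member and by no unaffected member) is one assumed way of applying family information, not a rule stated in the abstract.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    """Hypothetical, simplified variant record used only for illustration."""
    gene: str
    in_coding_region: bool
    synonymous: bool
    known: bool            # listed in dbSNP, 1000 Genomes, or an in-house database
    carriers: frozenset    # IDs of the sequenced samples that carry the variant


def prioritize(variants, affected, unaffected):
    """Keep novel, nonsynonymous coding variants that segregate with disease."""
    affected, unaffected = set(affected), set(unaffected)
    kept = []
    for v in variants:
        # Step 1: drop non-coding and synonymous variants.
        if not v.in_coding_region or v.synonymous:
            continue
        # Step 2: drop variants already present in public or in-house databases.
        if v.known:
            continue
        # Step 3 (assumed rule): carried by all affected, by no unaffected member.
        if not affected <= v.carriers or v.carriers & unaffected:
            continue
        kept.append(v)
    return kept


def rank_genes(variants):
    """Rank genes by their number of surviving novel nonsynonymous variants."""
    return Counter(v.gene for v in variants).most_common()


# Toy data: three patients (P1-P3) and one healthy relative (H1).
patients, healthy = {"P1", "P2", "P3"}, {"H1"}
toy_variants = [
    Variant("GENE_A", True, False, False, frozenset(patients)),
    Variant("GENE_A", True, True, False, frozenset(patients)),         # synonymous
    Variant("GENE_B", True, False, True, frozenset(patients)),         # known
    Variant("GENE_C", False, False, False, frozenset(patients)),       # non-coding
    Variant("GENE_D", True, False, False, frozenset(patients | healthy)),
    Variant("GENE_E", True, False, False, frozenset({"P1"})),          # one patient only
]
candidates = prioritize(toy_variants, patients, healthy)
ranking = rank_genes(candidates)
```

In this toy example only the first `GENE_A` variant survives all three filters. A real pipeline would also weigh variant quality, functional annotation, and predicted functional impact, as the abstract describes.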