An essential problem in genetic analysis is inferring haplotype pairs from unphased genotype data. This paper proposes a novel algorithm, CSPool, that estimates haplotype frequencies and reconstructs the diplotype of each individual from individual genotypes. CSPool is better than or comparable to previous algorithms in the accuracy and robustness of its haplotype frequency estimates across various sample sizes, and it solves the problem much faster. CSPool is also the first to suggest compressed sensing for estimating haplotype frequencies, which allows the sparsity prior to be applied directly to the model and yields efficient and accurate estimates. A P-L trick incorporated into CSPool, also proposed in this paper, enables CSPool to handle problems involving long SNP sequences. Finally, a stand-alone GUI version and a web-based version of CSPool have been created and are ready to use. We foresee a bright future for CSPool as next-generation sequencing technology becomes prevalent.
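To make the phasing ambiguity concrete, the following minimal sketch enumerates every haplotype pair compatible with one unphased genotype. It is not CSPool itself. The function name `compatible_haplotype_pairs` and the 0/1/2 genotype coding (number of copies of one allele at each SNP) are assumptions made here for illustration only.

```python
from itertools import product


def compatible_haplotype_pairs(genotype):
    """Enumerate unordered haplotype pairs consistent with an unphased genotype.

    Assumed coding (illustrative): each entry is 0, 1, or 2, the number of
    copies of the coded allele at that SNP. A value of 1 marks a
    heterozygous site, whose phase is unknown.
    """
    het_sites = [i for i, g in enumerate(genotype) if g == 1]
    # Homozygous sites are fixed: both haplotypes carry 0 (g == 0) or 1 (g == 2).
    base = [1 if g == 2 else 0 for g in genotype]

    pairs = set()
    # Try every assignment of the coded allele to the first haplotype
    # at the heterozygous sites; the second haplotype gets the complement.
    for bits in product((0, 1), repeat=len(het_sites)):
        h1, h2 = list(base), list(base)
        for site, bit in zip(het_sites, bits):
            h1[site] = bit
            h2[site] = 1 - bit
        # Sort within the pair so (h1, h2) and (h2, h1) count once.
        pairs.add(tuple(sorted((tuple(h1), tuple(h2)))))
    return sorted(pairs)


if __name__ == "__main__":
    for pair in compatible_haplotype_pairs((1, 2, 1)):
        print(pair)
```

A genotype with k heterozygous sites (k at least 1) has 2^(k-1) compatible unordered pairs, so the candidate set grows exponentially with the number of SNPs. Only a few of these haplotypes are typically expected to occur in a population, which is the kind of sparsity that a compressed sensing formulation can exploit.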