| Multiple genome alignment is one of the most fundamental problems in modern bioinformatics. To allow a direct comparison of the genome sequences of sufficiently similar organisms, there is an urgent need for software tools that can align more than two genome sequences. However, most current research focuses on pairwise genome alignment, and the few available applications that can align multiple genomes efficiently do so with low identification efficiency. In this paper, we present an efficient algorithm with improved identification efficiency for aligning multiple closely related whole genomes. It combines suffix arrays, conserved regions, a graph-theoretic formulation, and existing tools for gap (short-sequence) alignment. Our algorithm first finds a longest increasing subsequence (LIS) set of aligned conserved regions among the whole genomes, then aligns the gaps between consecutive conserved regions with ClustalW. We present and analyze experimental results for the algorithm, using six sets of DNA sequences (human, mouse, mycoplasma, etc.) in multiple sequence alignment experiments. The experiments show that our algorithm improves identification efficiency and running time over other methods, such as MGA and EMAGEN, while achieving comparable accuracy. The results also show that the algorithm is feasible and efficient for aligning multiple sequences. |
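
The LIS step can be illustrated with a minimal sketch. It is deliberately simplified to two genomes and is not the graph-theoretic multi-genome formulation used in the paper: each conserved region is assumed to be given as a hypothetical `(pos_a, pos_b)` pair of start positions, and the function name `longest_increasing_chain` is likewise an assumption made for illustration. The sketch keeps the longest set of conserved regions whose positions increase in both genomes; the stretches between consecutive kept regions would then be the gaps passed to a short-sequence aligner such as ClustalW.

```python
from bisect import bisect_left


def longest_increasing_chain(anchors):
    """Return a longest chain of anchors that is strictly increasing in both genomes.

    `anchors` is a list of (pos_a, pos_b) start positions of conserved regions
    (a hypothetical input format, assumed for this two-genome sketch).
    """
    # Sort by position in genome A; break ties by descending position in
    # genome B so that two anchors sharing pos_a can never be chained together.
    ordered = sorted(anchors, key=lambda p: (p[0], -p[1]))

    tail_idx = []   # tail_idx[k]: index of the last anchor of the best chain of length k + 1
    tail_val = []   # tail_val[k]: pos_b of that anchor (kept sorted for bisect)
    prev = [None] * len(ordered)  # back-pointers used to rebuild the chain

    for i, (_, b) in enumerate(ordered):
        k = bisect_left(tail_val, b)  # bisect_left enforces a strict increase in pos_b
        if k == len(tail_val):
            tail_idx.append(i)
            tail_val.append(b)
        else:
            tail_idx[k] = i
            tail_val[k] = b
        prev[i] = tail_idx[k - 1] if k > 0 else None

    # Walk the back-pointers from the end of the longest chain.
    chain = []
    i = tail_idx[-1] if tail_idx else None
    while i is not None:
        chain.append(ordered[i])
        i = prev[i]
    return chain[::-1]


if __name__ == "__main__":
    example = [(10, 100), (20, 50), (30, 120), (40, 130), (50, 60)]
    print(longest_increasing_chain(example))
```

The binary search keeps this step at O(n log n) in the number of conserved regions, which is one reason an LIS-style chaining is attractive before the more expensive gap alignment.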