| With the continuous development of the Internet, we are in an era of explosive information growth. Enterprises, as the heaviest users of the Internet, are seeing their information grow very quickly. Within this large volume of data, the fastest growth is in semi-structured data, typified by e-mail and instant messages, and in unstructured data, typified by electronic files of various types. How to manage these data efficiently has become an important topic for corporate information technology departments. A data archiving system helps enterprises move large amounts of rarely accessed historical data from expensive primary storage to inexpensive equipment without losing the ability to access the data in real time. It is an effective way for companies to reduce their operating costs. In addition, a good archiving system provides secure data protection and efficient data retrieval services, enabling enterprises to respond to evidentiary and other legal requirements. Traditional archiving systems often use a distributed architecture to handle the large amounts of enterprise data; they are complex to deploy and cannot provide high reliability. With the development of cloud computing technology, we have come to realize that, compared with a traditional archiving system, a cloud-based system is safe and reliable, simple to deploy, and makes high use of resources, so cloud-based archiving can significantly improve efficiency. This thesis first analyzes the current situation and problems of enterprise data archiving systems, as well as cloud computing. The author then proposes a cloud-based archiving system model. We analyze the architecture of the distributed file system and the design of the archiving system, and deploy the system to carry out experiments. Finally, the author analyzes the scalability of the whole system and points out some of its deficiencies. |