| With the rapid growth of civil aviation traffic, the huge volume of data generates a large amount of duplicate data during disaster recovery and backup, which puts considerable pressure on airports and airlines. Whether in terms of cost, system level, or performance, this massive amount of duplicate data causes many problems for the storage and disaster recovery backup of civil aviation data, such as large data volumes and long backup times. The civil aviation disaster recovery backup system therefore urgently needs suitable data deduplication methods. In response to these problems, this article first proposes a fixed-length blocking method based on the characteristics of civil aviation data. Because the data structure of each specific type of civil aviation data is constant, the method designs a block strategy index table that provides a blocking strategy for data of the same type. This saves the time spent searching for block boundaries and improves the efficiency of data deduplication during disaster recovery backup. At the same time, a new blocking strategy is established for each new data type to facilitate the matching of subsequent data streams and improve hit accuracy. Second, with the growing adoption of Persistent Memory (PM), this article improves on the traditional deduplication method based on mechanical hard disks and proposes a comparison-based data deduplication method built on persistent memory. Because civil aviation data records are short and numerous, this method adopts location-based content comparison for deduplication. It first collects the fingerprints of the file data blocks and extracts fingerprint samples, then uses the persistent memory to locate the file according to the fingerprint sample ID, determines whether the matching content needs a second, detailed analysis, and finally performs deduplication or backup. The final experiment demonstrates the superiority of the optimized method over the traditional data deduplication method. In disaster recovery backup of the civil aviation database, the data that can be removed accounts for about 98.08% of the duplicate data, and compared with the traditional method, the deduplication time is shortened by 1/2 to 2/3, which improves the efficiency of deduplication, reduces the pressure on the storage system, and minimizes the pressure on network transmission bandwidth. |
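
The two ideas summarized above, a per-type block strategy index table and fingerprint lookup followed by content comparison, can be illustrated with a minimal sketch. It is not the article's implementation: the class `DedupStore`, the default block size of 64 bytes, the type name `flight_record`, and the use of SHA-256 as the fingerprint are all assumptions made for illustration, and an ordinary in-memory dictionary stands in for the persistent-memory index.

```python
import hashlib
from dataclasses import dataclass, field

# Assumed fallback block length (bytes) for data types seen for the first time.
DEFAULT_BLOCK_SIZE = 64


@dataclass
class DedupStore:
    # Hypothetical block strategy index table: data type -> fixed block length.
    strategies: dict = field(default_factory=dict)
    # Fingerprint index: fingerprint -> block bytes.
    # A plain dict stands in here for an index kept in persistent memory.
    blocks: dict = field(default_factory=dict)

    def block_size_for(self, data_type: str) -> int:
        """Look up the blocking strategy; register one for a new data type."""
        if data_type not in self.strategies:
            self.strategies[data_type] = DEFAULT_BLOCK_SIZE
        return self.strategies[data_type]

    def backup(self, data_type: str, payload: bytes):
        """Split payload into fixed-length blocks and store only new ones.

        Returns (recipe, stored): the ordered list of fingerprints needed to
        rebuild the payload, and the number of blocks actually written.
        """
        size = self.block_size_for(data_type)
        recipe, stored = [], 0
        for offset in range(0, len(payload), size):
            block = payload[offset:offset + size]
            fingerprint = hashlib.sha256(block).hexdigest()
            existing = self.blocks.get(fingerprint)
            if existing is None:
                # Unseen fingerprint: this block must be backed up.
                self.blocks[fingerprint] = block
                stored += 1
            elif existing != block:
                # Second, detailed check: compare content on a fingerprint hit.
                raise ValueError("fingerprint collision detected")
            recipe.append(fingerprint)
        return recipe, stored

    def restore(self, recipe) -> bytes:
        """Rebuild a payload from its list of fingerprints."""
        return b"".join(self.blocks[fp] for fp in recipe)


store = DedupStore(strategies={"flight_record": 8})
payload = b"AAAAAAAA" + b"BBBBBBBB" + b"AAAAAAAA"
recipe, stored = store.backup("flight_record", payload)
```

In this example the payload splits into three 8-byte blocks, of which only two are distinct, so `stored` is 2 and a second backup of the same payload writes nothing new. Because the block length comes from the strategy table, no boundary search is needed for a known data type.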