Improving Online Restore Performance Of Backup Storage Via Historical File Access Pattern

Posted on: 2024-09-22
Degree: Master
Type: Thesis
Country: China
Candidate: X P Tang
Full Text: PDF
GTID: 2568307079960079
Subject: Computer Science and Technology
Abstract/Summary:
With the rapid development of the Internet, the amount of digital information is growing exponentially. Backup and restore technologies emerged to protect against machine downtime, disk damage, and malicious operations, any of which can make lost user data difficult to recover. Among restoration methods, online restoration lets users operate on already-restored files while the restore is still in progress, which effectively reduces downtime and has led to its wide adoption. However, through simulation analysis of real collected data sets, this thesis finds that online restoration still has problems. Because the order in which users access files differs from the order in which files are restored, a file that a user wants to access may not have been restored yet, so the access cannot be served. This leads to low file availability and long delays over the whole restoration process.

By analyzing the two data sets, this thesis further characterizes user access behavior, including access frequency and correlation between files. It then uses these characteristics to address the user-perceived performance degradation: the restoration sequence is scheduled according to the user's historical access sequence. Specifically, this thesis proposes three online backup restoration approaches: (i) a frequency-based approach, which exploits the strong skew in how often users access different backup files and gives priority to restoring files with a high historical access frequency; (ii) a graph-based approach, which exploits the fact that users often access files in groups, builds a file access association graph, and preferentially restores frequently accessed files together with their correlated files; and (iii) a Trie-based approach, which adjusts the order of file restoration based on users' real-time and historical access patterns. That is, the historical access information of the backup files is stored in a multi-level context Trie, and the restoration sequence is adjusted promptly using the files the user accesses during the restoration period, so as to meet the user's real-time access needs.

This thesis also implements a complete online backup and restoration system to evaluate the performance of the three proposed approaches. Trace-driven experiments on two data sets show that the system significantly improves file availability (reaching 98% and above) and reduces user latency (by 4× to 700×). The additional computing overhead incurred by improving file availability and reducing user latency was also measured: the entire system adds only a low performance overhead (1.0% to 14.5%).
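
The three scheduling ideas can be illustrated with a minimal Python sketch. It is not the thesis's implementation: all names (`frequency_order`, `graph_order`, `ContextTrie`, `reprioritize`) are hypothetical, "correlated" is assumed to mean "accessed within a small window of each other", and the Trie depth of 3 is an arbitrary choice for illustration.

```python
from collections import Counter, defaultdict


def frequency_order(history, backup_files):
    """Approach (i): restore the most frequently accessed files first."""
    counts = Counter(history)
    # sorted() is stable, so files with equal counts keep their backup order.
    return sorted(backup_files, key=lambda f: -counts[f])


def graph_order(history, backup_files, window=2):
    """Approach (ii): follow each hot file with its correlated files.

    Assumption: two files are correlated if accessed within `window` steps.
    """
    graph = defaultdict(Counter)
    for i, f in enumerate(history):
        for g in history[i + 1 : i + window]:
            if g != f:
                graph[f][g] += 1
                graph[g][f] += 1
    order, seen, valid = [], set(), set(backup_files)
    for f in frequency_order(history, backup_files):
        for g in [f] + [n for n, _ in graph[f].most_common()]:
            if g in valid and g not in seen:
                seen.add(g)
                order.append(g)
    return order


class ContextTrie:
    """Approach (iii): multi-level context trie over the access history."""

    def __init__(self, max_depth=3):
        self.max_depth = max_depth
        self.root = {}  # file -> [count, children dict]

    def add_history(self, history):
        # Insert every access subsequence of length <= max_depth.
        for i in range(len(history)):
            node = self.root
            for f in history[i : i + self.max_depth]:
                entry = node.setdefault(f, [0, {}])
                entry[0] += 1
                node = entry[1]

    def predict(self, recent):
        """Likely next files, trying the longest matching context first."""
        for k in range(min(len(recent), self.max_depth - 1), 0, -1):
            node = self.root
            for f in recent[-k:]:
                node = node.get(f, [0, {}])[1]
            if node:
                return sorted(node, key=lambda f: -node[f][0])
        return []


def reprioritize(queue, trie, recent):
    """Move predicted, not-yet-restored files to the front of the queue."""
    pending = set(queue)
    front = [f for f in trie.predict(recent) if f in pending]
    return front + [f for f in queue if f not in front]


history = ["a", "b", "a", "b", "c", "a", "b", "d"]  # toy access trace
backup = ["d", "c", "b", "a"]                       # default restore order
trie = ContextTrie()
trie.add_history(history)
print(frequency_order(history, backup))             # ['b', 'a', 'd', 'c']
print(graph_order(history, backup))                 # ['b', 'a', 'c', 'd']
print(reprioritize(["b", "d", "c", "a"], trie, ["b", "c"]))  # ['a', 'b', 'd', 'c']
```

In this toy trace the user has just accessed `b` then `c` during restoration; the Trie has seen `a` follow that context, so `a` jumps to the head of the restore queue even though the static order placed it last.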
Keywords/Search Tags: Online restore, Access pattern, Correlation graph, Trie