The Research And Implementation Of High Efficiency Recognition Technology For File Fragment Type

Posted on: 2018-05-23  Degree: Master  Type: Thesis
Country: China  Candidate: C Jiang
GTID: 2346330515966716  Subject: Computer Science and Technology
Abstract/Summary:
File fragment type recognition, which originated in digital forensics, is the process of identifying the original file type of an incomplete data block. As research has deepened, the technique is no longer limited to digital forensics and has spread to network security, reverse engineering, and network protocol analysis. In these areas, how to quickly and accurately identify the file type of a file fragment, or of the data carried in a network packet, is a critical issue. Existing methods mainly focus on improving recognition accuracy, but they often run into performance problems on the massive volumes of data found in real-world scenarios. This thesis studies type recognition for massive collections of file fragments and proposes corresponding solutions.

First, the thesis proposes a file fragment type recognition method based on a hierarchical model. It targets the problem that classification accuracy degrades as the number of file types to be distinguished grows. The file types are first clustered, which reduces the number of categories in the first classification stage; within each resulting cluster, a second stage then identifies the specific file type to which the fragment belongs. Experiments on a file fragment dataset containing 44 file types show an average recognition accuracy of 63.5% and an average recall of 69.8%.

Second, the thesis presents a MapReduce-based file fragment type recognition method that addresses the speed and scalability of recognition. Following the MapReduce programming model, the original large-scale file fragment dataset is divided into smaller datasets of equal size according to the number of Map tasks, and an SVM is trained iteratively on each Map side. The support vectors are obtained after two rounds of iterative training, and the SVM model for classifying file fragment types is then established. The experimental results show an average recognition accuracy of 71.6%, and the training time drops sharply as the number of worker machines increases.

In summary, the thesis proposes two methods motivated by practical file fragment recognition problems, using hierarchical classification and a distributed computing model to improve both the accuracy and the speed of file fragment recognition.
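
The two-stage idea behind the hierarchical model can be illustrated with a minimal Python sketch. This is not the thesis's implementation: the byte-frequency feature, the nearest-centroid classifier, the class name `HierarchicalClassifier`, and the hand-supplied `groups` mapping (standing in for the clustering step) are all assumptions chosen to keep the example small and dependency-free.

```python
import math
from collections import Counter
from typing import Dict, List, Tuple


def byte_histogram(fragment: bytes) -> List[float]:
    """Normalized 256-bin byte-frequency vector (assumed feature, not from the thesis)."""
    counts = Counter(fragment)
    total = len(fragment) or 1
    return [counts.get(b, 0) / total for b in range(256)]


def centroid(vectors: List[List[float]]) -> List[float]:
    """Component-wise mean of equally long vectors."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def nearest(vector: List[float], centroids: Dict[str, List[float]]) -> str:
    """Label of the centroid closest to `vector` in Euclidean distance."""
    return min(centroids, key=lambda label: math.dist(vector, centroids[label]))


class HierarchicalClassifier:
    """Stage 1 picks a cluster of file types; stage 2 picks a type inside it."""

    def __init__(self, groups: Dict[str, List[str]]):
        # `groups` is a hypothetical, hand-written clustering of file types.
        self.groups = groups
        self.type_to_group = {t: g for g, types in groups.items() for t in types}
        self.group_centroids: Dict[str, List[float]] = {}
        self.type_centroids: Dict[str, Dict[str, List[float]]] = {}

    def fit(self, samples: List[Tuple[bytes, str]]) -> "HierarchicalClassifier":
        by_type: Dict[str, List[List[float]]] = {}
        by_group: Dict[str, List[List[float]]] = {}
        for fragment, file_type in samples:
            vector = byte_histogram(fragment)
            by_type.setdefault(file_type, []).append(vector)
            by_group.setdefault(self.type_to_group[file_type], []).append(vector)
        # One centroid per cluster for the coarse stage.
        self.group_centroids = {g: centroid(v) for g, v in by_group.items()}
        # One centroid per file type, kept separately for each cluster.
        self.type_centroids = {
            g: {t: centroid(by_type[t]) for t in types if t in by_type}
            for g, types in self.groups.items()
        }
        return self

    def predict(self, fragment: bytes) -> Tuple[str, str]:
        vector = byte_histogram(fragment)
        group = nearest(vector, self.group_centroids)               # coarse stage
        file_type = nearest(vector, self.type_centroids[group])     # fine stage
        return group, file_type
```

Each second-stage classifier only has to separate the types inside one cluster, which is the motivation the abstract gives for the hierarchy. In practice the clusters would come from an actual clustering step and the per-stage classifiers could be any model, such as the SVM mentioned for the MapReduce method.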
Keywords/Search Tags: File fragmentation, Type recognition, Hierarchical model, Machine learning, MapReduce