| Performance evaluation is an important part of modern processor design. In previous studies, architectural simulation and analytical modeling have been the two main evaluation methods. Architectural simulation is accurate, but the complexity of modern designs makes it slow. Analytical modeling is faster but not accurate enough for a full evaluation. Sampling simulation, one of the main techniques for accelerating architectural simulation, uses phase analysis of the program to simulate only a subset of intervals and approximate the full-simulation results in less time while keeping precision high. This thesis analyzes two key steps of sampling simulation: clustering and the selection of application characteristics. Starting from performance parameters (CPI and DCache miss rate) that reflect the global and local characteristics of the program, different clustering methods, including K-means, support vector, and long short-term memory (LSTM), are evaluated using the reuse distance vector (RDV). Based on this evaluation, K-means is selected as the clustering method in this thesis. The clustering results based on two application characteristics, the basic block vector (BBV) and the RDV, are then analyzed, and the intersections of the BBV and RDV classification results are selected as a new sampling range. An appropriate sampling point is then selected from the sampling range of each phase; only one sampling point per phase is simulated. The weight of each cluster is the ratio of the number of intervals it owns in the new sampling range to the total number of intervals, and the simulation results are weighted and summed to obtain the overall performance parameters. Finally, the simulation results of each sampling method are compared and analyzed in terms of relative error and error distribution trend. The SPEC CPU2006 benchmark suite is used to verify the accuracy of sampling simulation. Each test program in the benchmark typically contains about one hundred billion instructions, which makes full simulation too time-consuming; therefore, 10 billion instructions are chosen for simulation. The program traces and the full-simulation values are obtained on the Gem5 experimental platform; the former are used to analyze program features, and the latter to evaluate the accuracy of sampling simulation. Compared with the full-simulation results, the average relative error of the new method is 15% for DCache miss rate and 5% for CPI. Because BBV reflects changes over the whole execution of the program, BBV-based sampling gives a small CPI error but a large DCache miss rate error. RDV instead reflects memory access characteristics, so its evaluation results show the opposite pattern. The method proposed in this thesis mitigates the shortcomings of both. Compared with simulation based on BBV alone, the average relative error of DCache miss rate is reduced by 13%; compared with simulation based on RDV alone, the CPI error is reduced by 62%. For the four microarchitecture configurations tested in this thesis, the proposed method speeds up evaluation by about 15 times compared with full simulation. |
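
The weighting step described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis implementation. It assumes that the "intersection" of the BBV and RDV classifications means grouping intervals that share the same pair of cluster labels, and that the first interval of each group serves as its sampling point; the abstract does not specify either detail. The function names, the label lists, and the per-interval CPI values are hypothetical.

```python
from collections import defaultdict


def intersect_phases(bbv_labels, rdv_labels):
    """Group interval indices by their (BBV cluster, RDV cluster) label pair.

    Assumption: this pairing is one possible reading of "intersection of the
    BBV and RDV classification results"; the thesis may define it differently.
    """
    if len(bbv_labels) != len(rdv_labels):
        raise ValueError("label lists must cover the same intervals")
    phases = defaultdict(list)
    for interval, pair in enumerate(zip(bbv_labels, rdv_labels)):
        phases[pair].append(interval)
    return dict(phases)


def weighted_estimate(phases, simulate, total_intervals):
    """Simulate one interval per phase and sum the results by weight.

    weight = intervals owned by the phase / number of all intervals.
    """
    estimate = 0.0
    for intervals in phases.values():
        sample = intervals[0]  # placeholder choice of sampling point (assumption)
        weight = len(intervals) / total_intervals
        estimate += weight * simulate(sample)
    return estimate


# Hypothetical cluster labels and per-interval CPI values for 8 intervals.
bbv_labels = [0, 0, 0, 1, 1, 1, 0, 0]
rdv_labels = [0, 0, 1, 1, 1, 1, 0, 1]
cpi_per_interval = [1.0, 1.0, 2.0, 3.0, 3.0, 3.0, 1.0, 2.0]

phases = intersect_phases(bbv_labels, rdv_labels)
estimated_cpi = weighted_estimate(
    phases, lambda i: cpi_per_interval[i], len(bbv_labels)
)

# Relative error against the "full simulation" average over all intervals.
full_cpi = sum(cpi_per_interval) / len(cpi_per_interval)
relative_error = abs(estimated_cpi - full_cpi) / full_cpi
```

In this toy data every interval within a group has the same CPI, so the weighted estimate matches the full average by construction. With real traces the intervals within a phase differ, and the relative error measures how well the chosen sampling points represent their phases.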