| Under the guidance of the national "Internet +" smart city strategy, a nationwide wave of building government data service systems has begun. However, government big data service systems are prone to performance bottlenecks under the pressure of massive data. If the causes cannot be located and corrected in time, the systems risk crashing, which may have serious consequences. An efficient performance test platform is therefore urgently needed. This thesis studies the key technologies of performance testing for big data systems and builds a performance test platform for government big data service systems. The results directly provide technical support for the test verification and tuning of these systems. The main contributions of this thesis are as follows: (1) To locate the causes of performance bottlenecks in government big data service systems, a bottleneck resource location method based on the information gain ratio is proposed from the perspective of hardware resources. Applying the information gain ratio criterion of the C4.5 algorithm to the bottleneck resource location problem quantifies the impact of each hardware resource on system performance, which makes the location results more accurate. Two kinds of resource-intensive loads are then used to simulate scenarios in which the big data system exhibits two different performance bottlenecks, and the method is evaluated on each scenario separately. The experimental results show that the method can effectively locate the bottleneck resources of big data systems. (2) To resolve performance bottlenecks caused by different resources, a parameter tuning scheme based on sensitive parameter screening and a pruning strategy is proposed. First, a sensitivity index is proposed to quantify the degree of correlation between parameters and resources, and a sensitivity-based parameter screening method built on this index selects the more sensitive parameters for each bottleneck resource as tuning objects. Second, to make the search for the optimal parameter configuration more efficient, a pruning strategy based on the performance index under the default parameter configuration is proposed, and a parameter tuning method based on this pruning strategy is designed. (3) Building on the above research, a performance testing framework for government big data is designed that integrates performance testing, bottleneck location, and parameter tuning. Based on this framework, a performance test platform for government big data service systems is implemented. The platform consists of a load test module, a resource monitoring module, and a parameter tuning module. The automatic parameter tuning tool implemented in the parameter tuning module can be deployed locally, which greatly reduces the coupling between the parameter tuning module and the big data systems. The tool is used to optimize IO-intensive and memory-intensive loads, reducing execution time by 38.45% and 27.1%, respectively. In addition, the tuning tool is used to optimize a short-term traffic flow forecasting application deployed on big data systems. The experimental results show that the tuning strategy can be applied to real government application scenarios, and they verify the validity and versatility of the parameter tuning scheme. |
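
The bottleneck location idea in contribution (1) can be illustrated with a minimal sketch of the C4.5 information gain ratio, computed once per hardware resource. This is not the thesis's implementation: the resource names, the discretization of utilization into `low`/`high`, the `slow`/`ok` performance labels, and the sample values are all hypothetical and serve only to show how a ranking by gain ratio might single out a bottleneck resource.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    total = len(values)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(values).values())


def gain_ratio(feature, labels):
    """C4.5 gain ratio: information gain divided by split information."""
    total = len(labels)
    # Group the performance labels by the value of the feature.
    groups = {}
    for value, label in zip(feature, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - conditional
    split_info = entropy(feature)
    # A feature with a single value carries no split information.
    return info_gain / split_info if split_info > 0 else 0.0


def rank_resources(samples, labels):
    """Return (resource, gain ratio) pairs, highest gain ratio first."""
    scores = {name: gain_ratio(values, labels)
              for name, values in samples.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical monitoring samples: discretized utilization per resource,
# one entry per observation window (assumed data, not from the thesis).
samples = {
    "cpu":     ["low", "high", "low", "high", "low", "high", "low", "high"],
    "disk_io": ["high", "high", "high", "high", "low", "low", "low", "low"],
    "memory":  ["low", "low", "high", "high", "low", "low", "high", "low"],
}
# Hypothetical performance label for each observation window.
labels = ["slow", "slow", "slow", "slow", "ok", "ok", "ok", "ok"]

ranking = rank_resources(samples, labels)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

In this toy data the `disk_io` level separates slow windows from normal ones perfectly, so it ranks first, while `cpu` carries no information about the label. Real monitoring data would need a discretization step for continuous utilization values, which this sketch leaves out.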