| With the popularity of the cloud-native concept and its rapidly expanding ecosystem, applications that adopt containerization and cloud deployment can readily benefit from running in the cloud. Kubernetes is a container orchestration engine that orchestrates the containers of cloud-deployed applications within a cluster. It provides elastic scaling so that applications can adapt to changing request loads. However, the default Kubernetes elastic scaling approach bases its scaling decisions on statically configured parameters and a reactive strategy, which is neither flexible nor proactive enough for dynamic Web request loads. With the ultimate goal of improving resource utilization and application service quality, this thesis takes Kubernetes as the research object and investigates elastic scaling strategies for Web applications. The work consists of the following three parts: (1) Building a Web request dataset: We collect a large amount of real time-series data on Web requests for a microservice Web application running on Kubernetes and, after data cleaning, aggregation, rolling-window processing, and other pre-processing operations, produce a Web request dataset that can be used for time-series prediction. A distinguishing feature of this dataset is that it records the request volume of the microservices and the request volume of one of the interfaces at the same time, which addresses the lack of fine-grained request data in existing Web request datasets. (2) Proposing an elastic scaling approach for Kubernetes containers for Web applications: We propose a Kubernetes container elastic scaling method based on Web request volume prediction, so that scaling behavior adjusts dynamically as the request load changes. To this end, this thesis first constructs the LSTM@Web request volume prediction model, based on long short-term memory (LSTM) networks, which takes historical request volume data as input to 
predict the request volume at the next time step; its prediction accuracy is verified in comparison experiments against ARIMA. Next, this thesis designs a Web performance analysis method based on the M/G/1 queuing model, which analyzes the future performance of the Web application using the request volume predicted by the LSTM@Web model. Finally, this thesis combines these two results into an algorithm that takes historical request data as input and outputs the ideal number of Pod replicas for the Web application on Kubernetes at the future moment. (3) Designing and implementing a Kubernetes container elastic scaling system for Web applications: This thesis designs and implements a Kubernetes container elastic scaling system for Web applications that enhances the original scaling mechanism and achieves better resource scheduling. The system implements a prediction module and a scaling module based on the proposed elastic scaling method, and adds a request collection module and a monitoring module for request data collection and system performance observation. As a result, it can read arriving request information in real time and horizontally scale the replica set of the target workload in advance. This thesis then conducts simulations using real request data from the Web request dataset and compares the system with the native Kubernetes HPA in terms of the cloud resources used by the application and the number of requests served within the specified response time. The experimental results show that the elastic scaling system implemented in this thesis provides higher-quality service than the default baseline: it reduces request loss by 25% compared with the original elastic scaling system, confirming that the quality of 
service is improved. |
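
The step in part (2) that turns a predicted request volume into a replica count can be illustrated with a minimal sketch. This is not the algorithm from the thesis. It assumes that the predicted load is split evenly across replicas, that each replica behaves as an independent M/G/1 queue, and that the scaling target is a mean response time. It uses the standard Pollaczek-Khinchine formula, and the function and parameter names are hypothetical.

```python
import math


def mg1_mean_response_time(arrival_rate, mean_service, second_moment_service):
    """Mean response time of one M/G/1 queue (Pollaczek-Khinchine formula).

    arrival_rate:          requests per second reaching this replica
    mean_service:          E[S], mean service time in seconds
    second_moment_service: E[S^2], second moment of the service time
    """
    rho = arrival_rate * mean_service  # server utilization
    if rho >= 1.0:
        return math.inf  # unstable queue: the backlog grows without bound
    mean_wait = arrival_rate * second_moment_service / (2.0 * (1.0 - rho))
    return mean_service + mean_wait


def required_replicas(predicted_rate, mean_service, second_moment_service,
                      target_response_time, max_replicas=100):
    """Smallest replica count whose per-replica M/G/1 mean response time
    meets the target, assuming (hypothetically) an even split of the load."""
    if target_response_time < mean_service:
        raise ValueError("target is below the bare service time")
    for n in range(1, max_replicas + 1):
        per_replica_rate = predicted_rate / n
        response = mg1_mean_response_time(
            per_replica_rate, mean_service, second_moment_service)
        if response <= target_response_time:
            return n
    return max_replicas  # cap reached; the target cannot be met within the limit
```

For instance, with a predicted 50 requests per second, a mean service time of 0.05 s, a second moment of 0.005 s² (the value for exponentially distributed service times with that mean), and a 0.12 s target, `required_replicas(50.0, 0.05, 0.005, 0.12)` returns 5. In the system described above, the predicted rate would come from the request volume prediction model rather than being supplied by hand.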