With the continuing emergence of open-source technologies and the development of cloud computing, workflow management systems can be delivered as SaaS (Software as a Service) to create more value for governments and enterprises. Deploying a workflow management system in the cloud makes resource provisioning relatively easy and, to a certain extent, satisfies the system's requirements for high concurrency and high availability. However, as the number of users grows and process business becomes more complex, the system often receives large bursts of process instance requests as a dynamic load, and it faces two problems. On the one hand, because the existing system continuously replaces entries in a fixed-capacity cache, the cache hit rate drops as the request load increases, and the resulting repeated parsing of process definitions lengthens the request response time. On the other hand, as the load changes, the resource requirements of the workers change with it, but a traditional workflow management system deployed on a cloud platform cannot adjust its resources in time to guarantee the quality of service.

To address these problems, this paper conducts research based on a distributed workflow management system. The system consists of a master and workers: the master contains a scheduling module, a monitoring module, and other components, while the workers are the execution nodes that run tasks and provide business functions such as process modeling, management, and statistics. The paper studies the above problems from the two aspects of request distribution scheduling and resource scheduling, and proposes a cache-based task request scheduling strategy and a hybrid resource scheduling strategy that supports automatic scaling. On the one hand, a database cache is added on top of the existing cache of the workflow management system, and a request distribution algorithm is proposed to maximize cache utilization. The algorithm groups workers according to whether they hold the relevant process objects in cache and, as far as possible, distributes process instance requests of the same process to workers that have already cached those process objects, so that each request can fully reuse the cached process objects; the higher cache hit rate for process objects reduces the workers' request response time. On the other hand, a monitoring module is established to collect resource characteristic data from the worker nodes, and a load prediction model is designed on top of it. Based on the predicted characteristic data, hybrid resource scheduling with automatic scaling is carried out by increasing or decreasing the number of worker replicas, so that resources are neither wasted nor in short supply while resource requirements are met, which ensures the quality of service of the system.

The experimental results confirm this. On the one hand, compared with two scheduling strategies commonly used in practice, the proposed task request scheduling strategy improves the cache hit rate, reduces the service response time, and improves the quality of service. On the other hand, the resource scheduling strategy that supports automatic scaling can scale the worker nodes out and in ahead of sudden load changes, which improves both the quality of service and the resource utilization of the cluster.
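As a rough illustration of the cache-aware request distribution idea described above, the following Python sketch routes a process instance request to a worker that already caches the corresponding process definition, and falls back to the least-loaded worker otherwise. The names (Worker, route_request, the pending-request counter) are hypothetical and not taken from the paper; the actual grouping rules, load metric, and cache layers may differ.

```python
from dataclasses import dataclass, field

@dataclass
class Worker:
    """A worker node with a bounded cache of parsed process definitions."""
    name: str
    cached_processes: set = field(default_factory=set)  # process definition keys held in cache
    pending_requests: int = 0                            # simple load indicator

def route_request(process_key: str, workers: list) -> Worker:
    """Prefer a worker that already caches the process definition (cache hit);
    otherwise fall back to the least-loaded worker, which will parse and cache it."""
    hits = [w for w in workers if process_key in w.cached_processes]
    candidates = hits if hits else workers
    target = min(candidates, key=lambda w: w.pending_requests)
    if not hits:
        # The chosen worker parses the definition once and keeps it in cache.
        target.cached_processes.add(process_key)
    target.pending_requests += 1
    return target

# Example: repeated requests for the same process stick to the worker that cached it.
workers = [Worker("worker-1"), Worker("worker-2"), Worker("worker-3")]
for _ in range(3):
    print(route_request("leave_approval:v2", workers).name)
```

A complete strategy would also need cache eviction and an upper load bound so that a single worker does not become a hotspot; the abstract does not specify these details, so the sketch only shows the hit-first routing decision.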
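Similarly, the prediction-driven scaling of worker replicas can be sketched as follows. The predictor here is a simple moving average standing in for the paper's load prediction model, and the thresholds and replica bounds are illustrative assumptions rather than values from the paper.

```python
def predict_load(recent_cpu_samples: list[float]) -> float:
    """Placeholder predictor: a moving average of recent CPU utilization.
    The paper's load prediction model would replace this."""
    return sum(recent_cpu_samples) / len(recent_cpu_samples)

def decide_replicas(current_replicas: int,
                    recent_cpu_samples: list[float],
                    scale_out_threshold: float = 0.75,
                    scale_in_threshold: float = 0.30,
                    min_replicas: int = 1,
                    max_replicas: int = 10) -> int:
    """Increase or decrease the number of worker replicas based on the predicted
    load, so that capacity is adjusted before the burst actually arrives."""
    predicted = predict_load(recent_cpu_samples)
    if predicted > scale_out_threshold:
        return min(current_replicas + 1, max_replicas)
    if predicted < scale_in_threshold:
        return max(current_replicas - 1, min_replicas)
    return current_replicas

# Example: rising utilization triggers a proactive scale-out from 3 to 4 replicas.
print(decide_replicas(3, [0.70, 0.78, 0.85]))  # -> 4
```

If the workers run on a container platform such as Kubernetes (the abstract only says that the number of worker replicas is increased or decreased), the returned value would be applied by updating the worker deployment's replica count.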