| Scientific workflows are an important solution for large-scale scientific computing tasks. As scientific computing grows more complex, scientific workflows are growing in both scale and complexity, and their demand for computing resources is rising sharply. As an emerging computing service model, cloud computing differs from traditional high-performance computing systems in how computing resources are provisioned and billed, and these differences bring new challenges to the study of quality of service and reliability for scientific workflows deployed in the cloud. In particular, Infrastructure-as-a-Service (IaaS) clouds can provide users with massive computing resources in the form of virtual machines (VMs) over the network and can meet the resource demands of scientific workflows, and they have therefore received widespread attention. This paper focuses on scientific workflows in IaaS cloud service environments and addresses the optimization of two key scheduling objectives: makespan and energy consumption. The main work and innovations of the paper are as follows: (1) We consider the impact of the heterogeneity and dynamic scalability of cloud computing systems on system performance and energy consumption, and we define the energy consumption and multi-objective optimization problem in a cloud environment. To broaden the search range over the solution space, a multi-objective optimization algorithm based on genetic algorithms is used to solve the workflow scheduling problem, and a comprehensive real-time decision preference is then generated from the cloud computing system’s current dynamic resource constraints and energy consumption requirements. This decision preference yields a compromise solution as the final scheduling solution. In conventional workflow scheduling, the initial sequence is constructed randomly, which leads to inferior initial solutions. To address this issue, an initialization 
scheduling sequence scheme is provided. Rather than choosing VM instances at random, the initialization approach assigns a VM instance to each task while taking into account the data volume of each task. Experimental results show that the scheduling sequence produced by this method attains a low makespan from the outset and eventually reaches an appropriate tradeoff between makespan and energy consumption. (2) Because the tasks in a workflow are subject to mutual constraints, they must be executed in an order that respects those constraints. In this paper, a task execution sequence represents the scheduling order of the tasks, and different task execution sequences have a large impact on the scheduling results. Because every task execution sequence must satisfy the constraints between tasks, different sequences contain common subsequences, and the subsequences shared by the better task execution sequences may benefit the optimization objectives. These subsequences therefore need to be identified and retained. To exploit this characteristic of the workflow scheduling problem, a strategy incorporating longest common subsequence (LCS) search is proposed. LCS is used to find the elite part of the task scheduling sequences, and this information is used to dynamically adjust the crossover and mutation of chromosomes to improve solving efficiency. Experimental results indicate that the proposed LCS-based search strategy finds a better Pareto front. (3) To address the high complexity of large-scale workflow scheduling problems, we use LCS to divide the population into several subpopulations, forming a multi-population framework, and retain the corresponding LCS gene blocks. The LCS gene blocks guide the crossover and mutation methods of the different subpopulations, so that each subpopulation co-evolves in a different 
way, balancing population diversity and improving algorithm convergence. The experimental results show that the proposed multi-objective genetic algorithm based on LCS selection effectively optimizes makespan and energy consumption for workflow scheduling. |
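
As a minimal sketch of the LCS search described in contribution (2), the Python function below computes a longest common subsequence of two task execution sequences with standard dynamic programming. The task names, the two example orderings, and the function name `longest_common_subsequence` are illustrative assumptions and are not taken from the paper. The paper's chromosome encoding, and the way it feeds the result into crossover and mutation, are not specified in this abstract and are not modeled here.

```python
from typing import List, Sequence


def longest_common_subsequence(a: Sequence[str], b: Sequence[str]) -> List[str]:
    """Return one longest common subsequence of two task execution sequences.

    Classic O(len(a) * len(b)) dynamic programming; ties between equally
    long subsequences are broken arbitrarily but deterministically.
    """
    m, n = len(a), len(b)
    # dp[i][j] holds the LCS length of the suffixes a[i:] and b[j:].
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m - 1, -1, -1):
        for j in range(n - 1, -1, -1):
            if a[i] == b[j]:
                dp[i][j] = dp[i + 1][j + 1] + 1
            else:
                dp[i][j] = max(dp[i + 1][j], dp[i][j + 1])

    # Walk the table from the front to reconstruct one optimal subsequence.
    result: List[str] = []
    i = j = 0
    while i < m and j < n:
        if a[i] == b[j]:
            result.append(a[i])
            i += 1
            j += 1
        elif dp[i + 1][j] >= dp[i][j + 1]:
            i += 1
        else:
            j += 1
    return result


if __name__ == "__main__":
    # Two hypothetical execution orders of the same five-task workflow.
    order_a = ["t1", "t2", "t3", "t4", "t5"]
    order_b = ["t1", "t3", "t2", "t5", "t4"]
    shared = longest_common_subsequence(order_a, order_b)
    print("shared block:", shared, "length:", len(shared))
```

In a genetic algorithm of the kind outlined above, the returned block could serve as a candidate gene block to preserve during crossover and mutation. That use is an assumption about the design, not a description of the paper's implementation.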