| With the rapid development of Internet of Things (IoT) technology and 5G communication networks, an increasing number of computationally intensive intelligent IoT applications have emerged, such as smart manufacturing, smart transportation, smart healthcare, and virtual reality. The efficient and reliable execution of IoT applications is crucial for improving application performance, avoiding failures that prevent applications from completing smoothly, enhancing users’ quality of service, improving overall IoT system efficiency, and promoting the development of IoT applications. Efficient task offloading mechanisms and accurate task failure prediction methods are two key factors in ensuring that tasks execute efficiently and reliably. During the task offloading phase, IoT terminal devices can offload tasks to edge servers or remote cloud servers with more abundant resources, thereby easing the conflict between their limited resources and complex application demands. Nevertheless, the heterogeneity of resources in the cloud-edge environment, the complex inter-task dependencies, and the uncertainty regarding the scale, structure, and generation time of tasks make the design of offloading strategies extremely challenging. In the task execution phase, tasks that have been offloaded to the cloud are prone to failure because of the complexity and dynamism of the cloud environment. It is therefore necessary to build accurate failure prediction models so that failures can be anticipated and appropriate measures taken to reduce their impact. This thesis focuses on the problems encountered in the task offloading and execution phases and proposes an online task offloading mechanism based on policy gradients and an offloaded-task failure prediction model based on the Transformer. The specific research work is as follows: (1) In order to design a task offloading mechanism that takes into account the heterogeneity of 
resources, the complex dependencies between tasks, and the uncertainty of task information, this thesis models the online task offloading problem in a cloud-edge environment as a Markov decision process and proposes a policy gradient learning-based online multi-workflow offloading scheme (PG-OMWO). The method proceeds as follows. First, tasks with complex dependencies are modeled as workflows represented as directed acyclic graphs, and graph convolutional neural networks are used to extract features that capture the node attributes and internal dependencies of the workflows in the environment. The environment state is constructed from these features. Then, the agent makes decisions according to the environment state and interacts with the environment, continuously improving its decisions based on the feedback it receives. PG-OMWO can handle workflows that arrive at unpredictable times and analyzes the current status of workflows and servers in real time to determine when and where each task is offloaded, with the goal of minimizing the average completion time of multiple randomly arriving workflows. Finally, extensive simulation experiments demonstrate that, compared with other representative baseline algorithms, PG-OMWO consistently learns better task offloading strategies in different cloud-edge environments and achieves the lowest average completion time. (2) To address the difficulty of accurately predicting whether a task will fail during execution after it has been offloaded to the cloud, this thesis proposes a 1DCNN-Transformer-based failure prediction model for offloaded tasks. The method proceeds as follows. First, information such as CPU and memory usage during task execution is composed into a time series, and the task failure prediction problem is modeled as a binary 
classification problem based on time series. Then, a one-dimensional convolutional neural network (1DCNN) is combined with a Transformer. The 1DCNN extracts local dependencies in the time series, while the multi-head attention mechanism of the Transformer adaptively assigns a weight to each time point, extracting global dependencies across different representation subspaces. Combining the two models yields complementary feature representations of the time series and thus improves the quality of feature extraction. Finally, extensive experiments were conducted on the cloud data center cluster dataset released by Google, and the results show that, compared with previous related work, 1DCNN-Transformer extracts latent dependencies in time series more effectively and achieves better accuracy, precision, recall, F1 score, and ROC curve performance. |
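
The way local convolution and attention complement each other in the failure prediction model can be illustrated with a minimal sketch in plain Python. It is not the thesis implementation: the function names, the toy CPU-usage series, the two fixed kernels, and the output weights are all hypothetical, and the attention step uses a single head with identity projections, whereas a trained Transformer learns its projections and uses several heads.

```python
import math


def conv1d(series, kernel):
    """Valid-mode 1D convolution (cross-correlation form): each output
    summarizes a short local window of the series."""
    k = len(kernel)
    return [sum(series[i + j] * kernel[j] for j in range(k))
            for i in range(len(series) - k + 1)]


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(steps):
    """Single-head scaled dot-product self-attention.

    Assumption: queries, keys, and values are the inputs themselves
    (identity projections), chosen only to keep the sketch short.
    """
    dim = len(steps[0])
    outputs, weights = [], []
    for query in steps:
        scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
                  for key in steps]
        row = softmax(scores)  # one weight per time point
        weights.append(row)
        outputs.append([sum(w * value[c] for w, value in zip(row, steps))
                        for c in range(dim)])
    return outputs, weights


def predict_failure(usage, kernels, out_weights, bias):
    """Convolution -> attention -> mean pooling -> sigmoid.

    Returns a failure probability and the attention weights.
    """
    # Local features: one channel per kernel (kernels share one length).
    channels = [conv1d(usage, kernel) for kernel in kernels]
    # Rearrange to a list of time steps, each a vector of channel values.
    steps = [list(values) for values in zip(*channels)]
    # Global features: every time step attends to every other time step.
    attended, weights = self_attention(steps)
    pooled = [sum(step[c] for step in attended) / len(attended)
              for c in range(len(kernels))]
    logit = sum(w * x for w, x in zip(out_weights, pooled)) + bias
    return 1.0 / (1.0 + math.exp(-logit)), weights


# Hypothetical CPU-usage trace of one task, sampled at seven time points.
cpu_usage = [0.20, 0.22, 0.25, 0.60, 0.85, 0.90, 0.95]
# Hypothetical kernels: a moving average and a simple trend detector.
kernels = [[1 / 3, 1 / 3, 1 / 3], [-1.0, 0.0, 1.0]]
# Hypothetical, untrained output weights, for illustration only.
probability, attention = predict_failure(
    cpu_usage, kernels, out_weights=[1.0, 2.0], bias=-1.0)
print(f"failure probability: {probability:.3f}")
```

In this sketch the convolution sees only three consecutive samples at a time, while each row of `attention` spreads weight over all time steps, which is the local versus global split described in the abstract. In a trained model the kernels, projections, and output weights would be learned from labeled task traces rather than fixed by hand.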