
Research On Resource Allocation Technology For AI Model Deployment In Edge Computing Optical Networks

Posted on: 2024-08-11 | Degree: Master | Type: Thesis
Country: China | Candidate: S H Zheng | Full Text: PDF
GTID: 2568306944959149 | Subject: Information and Communication Engineering
Abstract/Summary:
With the development of Artificial Intelligence (AI) technology, AI-based applications and services have seen explosive growth. The traditional solution for deploying large-scale AI applications is to upload massive amounts of data to cloud data centers for processing and return the results to users. This processing demands large amounts of upstream bandwidth, storage, and computing resources, and for delay-sensitive services it is difficult to meet delay requirements, so service quality suffers. Edge computing allows parts of an AI model to be executed at the edge of the network, near the user, so AI models can be deployed through edge-cloud collaboration. However, an AI model consists of many layers and admits multiple partition deployment strategies, each yielding a different quality of service. Existing studies on AI model deployment and resource allocation in edge computing optical networks share a shortcoming: the deployment strategy is fixed, making it difficult to adapt the deployment mode to different service requirements and to the current state of network resources, which results in low resource utilization. How to deploy AI models more flexibly remains an open problem. Addressing these issues, the main contributions of this thesis are the following two points:

(1) An AI model partition deployment algorithm based on Deep Reinforcement Learning (DRL). To enable flexible deployment of AI models according to the state of network resources in edge computing optical networks, this thesis first models the problem and proposes a DRL-based algorithm to solve it, along with three heuristic algorithms for comparison. Based on the state of the network's resources, the DRL-based algorithm selects a suitable partition deployment scheme, then allocates computing resources according to the delay requirements, and finally completes the inference task. Simulation results show that the proposed algorithm improves the deployment success rate of AI inference services under the same network resources. The model is then extended with a larger action dimension, and two reward functions are proposed to optimize the traffic blocking probability and the inference delay, respectively.

(2) A right-sizing deployment algorithm for AI models in multi-task scenarios. Different AI inference tasks have different delay and precision requirements. To achieve right-sizing deployment of AI models according to service requirements, this thesis proposes a DRL-based right-sizing deployment strategy. First, according to the current network resource status and service requirements, an appropriate branch model and partition deployment strategy are selected; then network resource allocation and routing are carried out; finally, the AI inference task is completed. Simulation results show that, in dynamic network scenarios, the proposed algorithm achieves a balance between processing delay and inference precision while satisfying service requirements.
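The abstract does not give the thesis's DRL formulation or network model. As a rough illustration of the two decisions it describes, the sketch below exhaustively searches the partition points of a layered model (comparable in spirit to the heuristic baselines the thesis compares against, not to its DRL agent) and then picks the fastest branch model meeting a precision requirement. All layer FLOP counts, intermediate tensor sizes, device speeds, link rates, and branch profiles are invented for illustration.

```python
def best_partition(flops, out_bytes, input_bytes,
                   edge_gflops, cloud_gflops, link_gbps):
    """Exhaustively evaluate every partition point of a layered model.

    cut = number of leading layers executed at the edge; layers [cut:] run
    in the cloud. Returns (best_cut, estimated_latency_seconds).
    """
    n = len(flops)
    best = None
    for cut in range(n + 1):
        edge_t = sum(flops[:cut]) / (edge_gflops * 1e9)
        cloud_t = sum(flops[cut:]) / (cloud_gflops * 1e9)
        # Data crossing the edge-cloud link: the raw input if everything
        # is offloaded (cut == 0), else the output of the last edge layer.
        tx_bytes = input_bytes if cut == 0 else out_bytes[cut - 1]
        tx_t = tx_bytes * 8 / (link_gbps * 1e9)
        total = edge_t + tx_t + cloud_t
        if best is None or total < best[1]:
            best = (cut, total)
    return best


def select_branch(branches, min_accuracy):
    """Right-sizing step: fastest candidate branch meeting the accuracy
    requirement, or None if no branch is feasible. Branch profiles are
    hypothetical dicts, e.g. {"accuracy": 0.9, "latency": 0.2}."""
    feasible = [b for b in branches if b["accuracy"] >= min_accuracy]
    return min(feasible, key=lambda b: b["latency"]) if feasible else None


if __name__ == "__main__":
    # Three-layer model: a large raw input makes full offloading costly,
    # so cutting after layer 1 (small intermediate tensor) wins.
    cut, latency = best_partition(
        flops=[2e9, 3e9, 1e9],          # per-layer compute
        out_bytes=[4e6, 1e6, 4e3],      # per-layer output sizes
        input_bytes=6e7,                # raw input size
        edge_gflops=10, cloud_gflops=100, link_gbps=1)
    print(cut, latency)                 # → 1 0.272
```

This exhaustive search is tractable because a model has only n+1 partition points; the thesis's DRL agent instead learns the choice jointly with resource allocation under changing network state, which a static search like this cannot capture.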
Keywords/Search Tags:edge computing optical network, AI model deployment, resource allocation, deep reinforcement learning