
Research On End-to-end Congestion Control Mechanism For LEO Satellite Networks

Posted on: 2024-02-25
Degree: Master
Type: Thesis
Country: China
Candidate: C Liu
Full Text: PDF
GTID: 2568306941489114
Subject: Electronic Science and Technology

Abstract/Summary:
With the rapid growth of satellite-based network applications and services, the relative scarcity of bandwidth in existing satellite networks has become a major practical problem limiting their development. In recent years, deep reinforcement learning has been studied in depth, and its decision-making capability has shown great advantages. Traditional congestion control algorithms rely on fixed rules to decide whether congestion has occurred, which can misjudge congestion and waste network resources. In contrast, a deep reinforcement learning agent interacts with the environment and dynamically improves its own action policy, which is a great advantage for congestion control. In the highly dynamic, high-latency environment of a LEO satellite network in particular, a trained agent can control the network's congestion state more flexibly and adaptively than traditional congestion control mechanisms.

Based on these ideas, this thesis proposes TCP-SemiRL, an algorithm that combines environment detection, dynamic intervention, and deep reinforcement learning. It uses online training to generate action policies suited to the current network and applies them to dynamically control the congestion window. The agent searches for the optimal congestion window for the current environment so as to maximize the utilization of network resources while keeping sending and receiving in dynamic balance, avoiding congestion at its source and achieving optimal performance. Compared with traditional congestion control algorithms and the deep-reinforcement-learning-based TCP-RL, TCP-SemiRL achieves a significant increase in throughput while keeping the congestion window and RTT smooth, which meets the performance expectations of the algorithm.

The main work of this thesis is as follows:

(1) To cope with the agent's inability to recognize environmental changes, and to avoid a mismatch between the agent's generated action policy and the environment, this thesis proposes an environment monitoring module based on an environmental entropy mechanism. It detects the environment while measuring how well the action policy fits it, solving the problem of unclear criteria for judging both the policy's adaptation and changes in the environment. The environmental entropy is designed around TCP transmission parameters and is defined in terms of both RTT and data transfer status. A larger entropy value indicates that the current action policy does not fit the environment and that more training intervention is needed; an excessively large entropy indicates that the environment itself has changed and training must be restarted to avoid the performance degradation caused by a wrong action policy.

(2) To address the difficulty of assessing the agent's performance, this thesis proposes a window backtracking mechanism that monitors the agent in real time, directly tracking the congestion window and performance so that it is immediately visible whether the current policy is appropriate. During training, the window backtracking mechanism serves as part of the basis for evaluating the agent's performance and, together with the environmental entropy, decides whether to further intervene in the congestion window. This effectively checks whether the sequence of congestion windows in a sampling period is compatible with the current environment, and solves the instability caused by wrong action policies during training.

(3) To quickly form an action policy adapted to the environment, this thesis proposes an additional Q-value mechanism and a reward mechanism. By adding the influence of a predicted Q-value to the Q-value obtained from the agent's training, the scheme shapes the training process using both the environment and the agent's actual performance; the reward mechanism directly computes the reward obtained after executing the current action policy, achieving maximally efficient transmission while maintaining stability. Together these solve the problem of slow and incorrect generation of action policies adapted to the environment.
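As a rough illustration only — the abstract does not give the actual formulas, so the entropy definition (Shannon entropy over binned RTT samples plus a failed-delivery term), the intervention thresholds, and the Q-value blending weight below are all assumptions, not the thesis's implementation — the environmental entropy check and the additional Q-value mechanism could be sketched like this:

```python
import math
from collections import Counter

def environmental_entropy(rtt_ms, delivered_flags, bin_ms=20, fail_weight=1.0):
    """Shannon entropy over binned RTT samples plus a failed-delivery term.
    Higher values suggest the current action policy fits the environment
    poorly. The binning and weighting here are illustrative assumptions."""
    bins = Counter(int(r // bin_ms) for r in rtt_ms)
    n = len(rtt_ms)
    h = -sum((c / n) * math.log2(c / n) for c in bins.values())
    fail_rate = 1.0 - sum(delivered_flags) / max(1, len(delivered_flags))
    return h + fail_weight * fail_rate

def decide(entropy, intervene_at=1.0, restart_at=2.5):
    """Map the entropy to one of three outcomes: keep the current policy,
    intervene in training, or restart training because the environment
    itself appears to have changed (thresholds are assumed values)."""
    if entropy >= restart_at:
        return "restart_training"
    if entropy >= intervene_at:
        return "intervene"
    return "keep_policy"

def blended_q(trained_q, predicted_q, weight=0.3):
    """Mix a predicted Q-value into the trained Q-value so the training
    process is steered by the observed environment and the agent's actual
    performance (the blending weight is an assumed hyperparameter)."""
    return (1.0 - weight) * trained_q + weight * predicted_q
```

A stable RTT trace with no losses yields a low entropy and `keep_policy`, while widely scattered RTT samples push the entropy past the restart threshold, matching the abstract's description that an excessively large entropy signals an environment change.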
Keywords/Search Tags: satellite network, congestion control protocol, NS3, throughput, congestion window, deep reinforcement learning, dynamic balance