An Approach Of Anomaly Detection And Anomalous Microservice Ranking In Cloud-Native Environments

Posted on:2023-12-22

Degree:Master

Type:Thesis

Country:China

Candidate:Z K Zhang

Full Text:PDF

GTID:2558306767962719

Subject:Software engineering

Abstract/Summary:

In recent years, a growing number of developers have started to build applications on cloud-native architecture, and the "cloud-enabled" mechanism has become a key direction in the digital transformation of enterprises. As systems scale out, they generate multiple time series data and run in a dynamic operating environment, so traditional experience-based manual monitoring can no longer meet the requirements of IT operation and maintenance. AIOps has emerged in this context, aiming to achieve efficient, low-cost IT operation using artificial intelligence. AIOps involves two key scenarios: anomaly detection and root cause analysis. Anomaly detection identifies abnormal system behavior by analyzing the intrinsic characteristics of monitoring metrics. Root cause analysis locates the root causes of system anomalies based on fault propagation graphs. Scholars have recently proposed many anomaly detection and root cause localization methods based on system monitoring metrics. These works achieve good results but still have limitations. In anomaly detection, the impact of correlations among multiple metrics on detection results has not been explored in depth. In root cause analysis, existing methods require manual parameter tuning for each system. To address these issues, the main work of this thesis includes:

(1) An anomaly detection method is proposed that combines an attention-based prediction model with iForest. First, a feature attention mechanism and a temporal attention mechanism are combined to extract potentially key information from the time series data. On this basis, a sequence-to-sequence prediction model is built, and prediction residuals are obtained by comparing the predicted and true values of the time series data. Finally, the prediction residuals are used as the input of the iForest algorithm, which dynamically adjusts the anomaly threshold based on the characteristics of the dataset to perform anomaly detection. Experiments on two datasets show that the proposed algorithm performs better than classical anomaly detection methods.

(2) For root cause analysis, an automatic ranking method for anomalous microservices is proposed based on a random walk algorithm. First, system-level and application-level metrics are collected to construct a service dependency graph for the cloud-native system. Next, historical response time metrics are clustered to obtain the initial anomaly weight of each microservice node. The anomaly weights in the service dependency graph are then updated automatically according to the anomaly propagation relationship between each microservice node and its neighboring nodes. Finally, the personalized PageRank random walk algorithm is used to rank the anomalous microservices. Experiments in cloud-native environments show that the proposed root cause analysis method can efficiently locate the root cause of anomalies while remaining robust in scalable cloud-native systems.
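The residual-scoring idea in (1) can be illustrated with a minimal sketch. It is not the thesis implementation: the attention-based sequence-to-sequence model is replaced by a hypothetical `predicted` series, the isolation forest is a small pure-Python, one-dimensional version written for illustration, and the dynamic threshold adjustment described in the abstract is not reproduced. All names and data below are assumptions.

```python
import math
import random


def average_path_length(n):
    """Expected path length of an unsuccessful binary-search-tree lookup over n points."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    return 2.0 * (math.log(n - 1) + 0.5772156649) - 2.0 * (n - 1) / n


def build_tree(points, depth, max_depth, rng):
    """Isolate 1-D points with recursive random splits; a leaf stores how many points remain."""
    if depth >= max_depth or len(points) <= 1 or min(points) == max(points):
        return {"size": len(points)}
    split = rng.uniform(min(points), max(points))
    left = [p for p in points if p < split]
    right = [p for p in points if p >= split]
    return {
        "split": split,
        "left": build_tree(left, depth + 1, max_depth, rng),
        "right": build_tree(right, depth + 1, max_depth, rng),
    }


def path_length(tree, x):
    """Depth at which x lands, plus the expected extra depth for unsplit leaf points."""
    depth, node = 0, tree
    while "split" in node:
        node = node["left"] if x < node["split"] else node["right"]
        depth += 1
    return depth + average_path_length(node["size"])


def isolation_scores(values, n_trees=100, sample_size=256, seed=0):
    """Return one anomaly score per value; scores closer to 1 mean easier to isolate."""
    rng = random.Random(seed)
    psi = min(sample_size, len(values))
    max_depth = math.ceil(math.log2(psi))
    forest = [build_tree(rng.sample(values, psi), 0, max_depth, rng) for _ in range(n_trees)]
    norm = average_path_length(psi)
    return [
        2.0 ** (-sum(path_length(tree, v) for tree in forest) / (n_trees * norm))
        for v in values
    ]


# Hypothetical data: `predicted` stands in for the output of the prediction model.
noise = random.Random(1)
true_values = [10.0 + math.sin(t / 5.0) for t in range(60)]
predicted = [v + noise.uniform(-0.1, 0.1) for v in true_values]
true_values[30] += 8.0  # injected spike that the stand-in model did not predict

# Prediction residuals are the input to the isolation forest.
residuals = [abs(t - p) for t, p in zip(true_values, predicted)]
scores = isolation_scores(residuals)
most_anomalous = max(range(len(scores)), key=scores.__getitem__)
```

In this toy series the injected spike produces a residual far from all others, so it is isolated after very few random splits and receives the highest score. A real deployment would more likely use a tested library implementation of isolation forest and multivariate residuals.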
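The final ranking step in (2) can be sketched as a personalized PageRank over a service dependency graph. This is a sketch under stated assumptions, not the method as implemented in the thesis: the four services, the caller-to-callee edge direction, and the anomaly weights are all hypothetical, and the clustering and automatic weight-update steps are assumed to have already produced those weights.

```python
def personalized_pagerank(graph, preference, damping=0.85, iterations=100):
    """Power iteration for PageRank with a non-uniform teleport (preference) vector.

    graph maps each node to the list of nodes it points to;
    preference maps nodes to non-negative weights (here: anomaly weights).
    """
    nodes = list(graph)
    total = sum(preference.get(n, 0.0) for n in nodes)
    teleport = {n: preference.get(n, 0.0) / total for n in nodes}
    rank = dict(teleport)
    for _ in range(iterations):
        nxt = {n: (1.0 - damping) * teleport[n] for n in nodes}
        for n in nodes:
            targets = graph[n]
            if targets:
                # Spread this node's rank evenly over the services it calls.
                share = damping * rank[n] / len(targets)
                for m in targets:
                    nxt[m] += share
            else:
                # Dangling node: return its rank to the teleport distribution.
                for m in nodes:
                    nxt[m] += damping * rank[n] * teleport[m]
        rank = nxt
    return rank


# Hypothetical dependency graph: an edge points from a caller to the service it calls.
dependency_graph = {
    "frontend": ["cart", "catalog"],
    "cart": ["db"],
    "catalog": ["db"],
    "db": [],
}

# Hypothetical anomaly weights, assumed to come from the earlier clustering and update steps.
anomaly_weights = {"frontend": 0.2, "cart": 0.3, "catalog": 0.1, "db": 0.4}

rank = personalized_pagerank(dependency_graph, anomaly_weights)
ranking = sorted(rank, key=rank.get, reverse=True)
```

Because the walk follows call edges toward dependencies and teleports in proportion to the anomaly weights, rank accumulates on services that are both anomalous themselves and depended on by other anomalous services. In this toy graph that places `db` first and `cart` second.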

Keywords/Search Tags:

Microservice, Cloud Native, Anomaly Detection, Root Cause Analysis, Attention Mechanism

Related items

1	Microservice Performance Anomaly Detection And Root Cause Localization Based On Multi-source Data
2	Design And Implementation Of The Intelligent Operation And Maintenance System For Kubernetes Container Microservices
3	Research And Implementation Of Microservice Anomaly Detection And Localization System Based On Causal Analysis
4	Research On Algorithms Of Anomaly Detection And Root Cause Analysis In AIOps
5	Performance Optimization For Cloud Native Microservice Applications
6	Anomaly Detection And Root Cause Analysis Based On Multivariate Metrics In Cloud System
7	Design And Implementation Of Fault Delimitation Algorithm Based On Microservice Call Chain
8	The Design And Research Of A Bank’s Electronic Wallet Service Platform Based On Cloud Native
9	Anomaly Detection And Diagnosis For Metrics In Cloud Services
10	Evolutionable Unsupervised Anomaly Detection And Root Cause Analysis Method For Multiple Time Series Data