| In artificial intelligence, the development of deep learning has opened up new approaches to system log anomaly detection. The main advantage of deep learning is that it can mine a system's behavior patterns from host logs and use them to train a model iteratively. Many deep learning techniques are now widely used for host system anomaly detection. However, because host logs vary widely in type and format, problems such as long overall model running time and low recognition accuracy have arisen, and these pose great challenges for applying deep learning to host log anomaly detection. To address these problems in existing host system log detection, this thesis studies host system log anomaly detection from two perspectives: supervised and unsupervised learning. First, this thesis proposes a host log anomaly detection method based on a deep autoencoder (AE). In this method, the log data is parsed and sampled, and the number of hidden layers in the autoencoder is increased, which enables the detection of higher-dimensional log features. At the same time, this thesis proposes a head-based enhanced masked language model and uses a targeted pre-training task to efficiently extract the distribution characteristics of system log statements. This overcomes problems of traditional log analysis algorithms such as discarded data, simplistic feature extraction, and ignored temporal information. Finally, this thesis discusses application strategies for sequence generation and classification, introduces two strategies in the generator and discriminator networks, and designs a generative adversarial framework based on long-distance dependencies. This method improves distribution matching for single-category data under semi-supervised conditions and effectively addresses the unsatisfactory detection results caused by class imbalance. In the experimental part, this thesis evaluates the proposed method on two real-world datasets and verifies that it improves anomaly detection performance in a semi-supervised learning setting. The results show that the log anomaly detection method based on semantic recognition and a generative adversarial network achieves an F1 score of 90.6%, which is 27.8% higher than the current mainstream scheme. |