
Machine Traffic Recognition Based On Machine Learning

Posted on: 2021-03-10
Degree: Master
Type: Thesis
Country: China
Candidate: X Xie
Full Text: PDF
GTID: 2428330626456005
Subject: Signal and Information Processing
Abstract/Summary:
With the explosive growth of the Internet, more and more non-human traffic is generated by automated programs; this is machine traffic. Some of it comes from search engine spiders, which help improve search efficiency and search quality. Some comes from crawlers that query and capture website data in batches in order to obtain the site's resources at low cost. Some comes from browser programs that imitate human browsing behavior and profit commercially through fraud. The worst machine traffic comes from computer virus programs that use the Internet to attack individuals, enterprises and government units, causing great harm to society.

Because computers and the Internet play an important role in daily life, machine traffic has a significant impact on individuals, enterprises and governments. Identifying machine traffic accurately lets us use the Internet more sensibly, guard against its harms, and understand the Internet more clearly. In recent years, machine traffic has accounted for about half of total Internet traffic. Some machine traffic is beneficial to people and some is harmful. Each of these two types accounts for about half of total machine traffic, so the impact of neither can be ignored. Separating good machine traffic from bad helps us make use of the good traffic and take preventive measures against the bad, so as to protect website data and website security. The identification and classification of machine traffic is therefore an urgent and significant research topic.

In this thesis, we use massive web logs to study the application of different machine learning algorithms to machine traffic identification. The specific research work is as follows.

A preprocessing method for web logs is proposed. Of the many fields in a web log, those relevant to machine traffic identification are selected. A data preprocessing module associates each processed sample with multiple original log records, which addresses the difficulty of identifying traffic from a single log record.

After comparing the advantages and disadvantages of different machine learning algorithms, an improved Elastic Pooling-Convolutional Neural Network (EP-CNN) based on the traditional Convolutional Neural Network (CNN) is proposed. A deep-learning classification model that separates machine traffic from human traffic is designed and implemented, followed by a classification model that separates good machine traffic from bad machine traffic. The best parameter settings for these models are obtained through a large number of experiments.

The experimental results show that recognition algorithms based on deep learning generally perform better than those based on traditional machine learning and achieve higher performance indicators. Among them, EP-CNN has the best overall performance, with high accuracy and low training cost. Finally, the research is summarized and directions for follow-up work are discussed.
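The preprocessing idea described above, turning several raw log records into one sample, can be illustrated with a minimal Python sketch. It is not the thesis's actual module: the record fields, the grouping key (client IP plus user agent) and the four summary features are all assumptions chosen for illustration, since the abstract does not say which fields or features were used.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean


@dataclass
class LogRecord:
    # Hypothetical fields; the abstract does not list the fields it keeps.
    client_ip: str
    user_agent: str
    timestamp: float  # seconds since the epoch
    path: str
    status: int


def aggregate_by_client(records):
    """Group raw records by (IP, user agent) and summarise each group.

    Each returned sample is built from several original log records,
    so a classifier never has to judge a single log line in isolation.
    """
    groups = defaultdict(list)
    for rec in records:
        groups[(rec.client_ip, rec.user_agent)].append(rec)

    samples = {}
    for key, recs in groups.items():
        recs.sort(key=lambda r: r.timestamp)
        # Time gaps between consecutive requests from the same client.
        gaps = [b.timestamp - a.timestamp for a, b in zip(recs, recs[1:])]
        samples[key] = {
            "request_count": len(recs),
            "distinct_paths": len({r.path for r in recs}),
            "error_ratio": sum(r.status >= 400 for r in recs) / len(recs),
            "mean_gap_seconds": mean(gaps) if gaps else 0.0,
        }
    return samples
```

Per-client summaries like these could then be fed to whichever classifier is being evaluated. A real pipeline would probably also split each client's records into time-bounded sessions before aggregating.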
Keywords/Search Tags:Machine traffic, CNN, Web logs, EP-CNN, Machine learning