| The TLS protocol is the main way to encrypt communication traffic on the Internet today. In recent years, as users' requirements for communication security and privacy protection have grown, the TLS protocol has been revised repeatedly, and its level of support and deployment has risen steadily. Obtaining the characteristics of TLS communication on the Internet and analyzing them along multiple dimensions is of great significance for understanding the security of user communication. However, current work on TLS communication fingerprint acquisition and analysis has several problems: it is tightly coupled to specific business scenarios, lacks general-purpose capabilities, supports only a limited range of scenarios, and concentrates on designing business rules and using business data while neglecting performance. Against this background, this paper designs and implements a large-scale TLS communication fingerprint acquisition and analysis system that quickly obtains, efficiently stores, analyzes, and visualizes TLS communication fingerprints in the target network, so that it can provide technical support for the supervision work of operators and network regulators. The work of this paper consists of the following parts. Firstly, the requirements analysis and outline design of the system are carried out. Beyond an overall analysis, the requirements are examined from both functional and non-functional perspectives. Based on the results of this analysis, the overall architecture of the system is designed and its three functional modules are determined: a fast acquisition module for TLS communication fingerprints, an efficient storage module for TLS communication fingerprints, and an analysis and result visualization module for TLS communication fingerprints. Secondly, the three modules are designed and implemented in detail. Building on a study of the TLS protocol and of high-concurrency technologies (such as epoll), the fast acquisition module is designed and implemented, combining active measurement and passive analysis to increase both the number and the dimensions of the fingerprints collected. Based on an understanding of the dimensions of TLS communication fingerprint features, and using storage technologies such as ClickHouse and Redis, the efficient storage module is designed and implemented, which solves the storage and query problems of fingerprint features in massive-data scenarios. Based on the query and analysis of TLS communication fingerprint features, and using the VCharts component, the analysis and result visualization module is designed and implemented, presenting the fine-grained query results and multi-dimensional analysis results of massive TLS communication fingerprint features completely and in a variety of forms. Finally, the system is tested comprehensively against both its functional and its performance requirements. In active measurement, the system probes 1.97G network communication addresses and completes TLS communication fingerprint acquisition for about 102.35 network communication addresses per second on average. In passive analysis, the system processes 1.39TB of network communication traffic, reading and parsing about 46210.72 network communication addresses per second. At the same time, the system writes about 193.4MB of data to the storage system per second; the query response time is about 0.659 seconds for partitioned data and about 1.535 seconds for non-partitioned data, which meets users’ functional and performance requirements. |
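The abstract does not say which fields the passive-analysis path extracts, so the following is only a minimal sketch of how a few fingerprint features might be read from a captured TLS ClientHello record. The names `ClientHelloFeatures` and `parse_client_hello` are hypothetical and are not taken from the system described above; the record and handshake layout follows the TLS specification.

```python
import struct
from typing import List, NamedTuple


class ClientHelloFeatures(NamedTuple):
    """Hypothetical container for a few fingerprint dimensions."""
    record_version: int       # version field of the TLS record header
    client_version: int       # legacy_version field inside the ClientHello
    cipher_suites: List[int]  # offered cipher suites, in the order sent


def parse_client_hello(data: bytes) -> ClientHelloFeatures:
    """Extract basic features from one TLS record holding a ClientHello.

    Illustrative sketch only: it assumes the whole ClientHello fits in a
    single record and it ignores extensions.
    """
    # Record header: content type (1 byte), version (2), length (2).
    if len(data) < 5:
        raise ValueError("truncated record header")
    content_type, record_version, record_len = struct.unpack("!BHH", data[:5])
    if content_type != 22:  # 22 = handshake
        raise ValueError("not a handshake record")
    body = data[5:5 + record_len]
    if len(body) < record_len:
        raise ValueError("truncated record body")

    # Handshake header: message type (1 byte), length (3 bytes).
    if len(body) < 4 or body[0] != 1:  # 1 = client_hello
        raise ValueError("not a ClientHello")
    hello_len = int.from_bytes(body[1:4], "big")
    hello = body[4:4 + hello_len]
    # Minimum: version (2) + random (32) + session id length (1).
    if len(hello) < hello_len or len(hello) < 35:
        raise ValueError("truncated ClientHello")

    client_version = int.from_bytes(hello[0:2], "big")
    pos = 2 + 32                 # skip the version and the 32-byte random
    session_id_len = hello[pos]
    pos += 1 + session_id_len    # skip the session id
    if pos + 2 > len(hello):
        raise ValueError("truncated cipher suite length")
    suites_len = int.from_bytes(hello[pos:pos + 2], "big")
    pos += 2
    if suites_len % 2 != 0 or pos + suites_len > len(hello):
        raise ValueError("malformed cipher suite list")
    cipher_suites = [
        int.from_bytes(hello[i:i + 2], "big")
        for i in range(pos, pos + suites_len, 2)
    ]
    return ClientHelloFeatures(record_version, client_version, cipher_suites)
```

A production parser would also need to handle records fragmented across TCP segments and would walk the extension list (for example SNI and supported groups) to obtain further fingerprint dimensions.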