Research And Implementation Of AFC System's Real-time Data Distributed Processing

Posted on: 2020-06-07  Degree: Master  Type: Thesis
Country: China  Candidate: Y Wei
GTID: 2392330572473557  Subject: Computer technology
Abstract/Summary:
With the continuous expansion of urban rail transit and the rising level of informatization, ticket sales and ticket checking are becoming increasingly automated. Growing transaction volumes and user numbers have led to a sharp increase in the amount of data generated in real time by the Automatic Fare Collection (AFC) system. Because a system built on a relational database storage framework cannot meet the needs of massive data storage and real-time data processing, the operation and management approach of the traditional operation system urgently needs to be upgraded and optimized. In recent years, with the rapid development of cloud computing and Internet technologies, the application of big data technology to industry has received extensive attention. This thesis studies in depth the principles, key technologies and applications of current mainstream big data real-time stream processing and distributed storage. On this basis, it proposes a solution for real-time data access, processing and storage in the AFC system.

Faced with rapidly growing data, the traditional urban rail transit operation system encounters bottlenecks in both storage and computation. In this context, the thesis examines existing open source distributed storage and real-time computing frameworks. Based on the characteristics of AFC real-time data and the requirements of practical applications, it identifies the problems that must be solved in order to process AFC real-time data with distributed big data technology.

Because the original operation system must continue to provide services without disruption in the actual development environment, the thesis designs a data access and parsing scheme based on Canal. The scheme synchronizes data from the original storage system to the distributed big data environment in real time and supplies streaming data for subsequent real-time processing.

To address the low throughput and poor real-time performance of the original operation system, the Kafka message middleware and the Spark Streaming processing flow are studied in depth. Given the real-time data characteristics and business requirements of the AFC system, offline tasks are used to supply the base data for real-time computation, and data preprocessing and join operations are performed in advance. The factors affecting the performance of Spark Streaming jobs are analyzed and optimized in terms of operator performance, parallelism and load balancing, which improves the efficiency of data analysis and processing.

To address the capacity and scalability limits of traditional relational databases, the storage architecture and principles of the HBase and Redis databases are studied in depth, and the storage table structure, Rowkey and pre-partitioning for AFC system data are designed. To further improve HBase storage and query performance, a Redis-based caching scheme for HBase queries is designed, and the cluster parameter configuration is optimized.
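
The abstract does not give the actual Rowkey layout, the number of pre-partitions, or the cache policy, so the following is only a minimal sketch of the general idea under stated assumptions. It assumes a hash-salted Rowkey of the form salt|station|time|transaction, one pre-split region per salt bucket, and a cache-aside lookup. The names make_rowkey, split_points and CachedReader are hypothetical, and plain Python dictionaries stand in for HBase and Redis so that the example runs without a cluster.

```python
import hashlib

NUM_BUCKETS = 8  # assumed number of pre-split regions; not taken from the thesis


def salt_for(station_id, buckets=NUM_BUCKETS):
    """Derive a stable salt bucket from the station id so writes spread across regions."""
    digest = hashlib.md5(station_id.encode("utf-8")).digest()
    return digest[0] % buckets


def make_rowkey(station_id, txn_time, txn_id, buckets=NUM_BUCKETS):
    """Build a hypothetical rowkey: two-digit salt, station id, timestamp, transaction id."""
    return "{:02d}|{}|{}|{}".format(
        salt_for(station_id, buckets), station_id, txn_time, txn_id
    )


def split_points(buckets=NUM_BUCKETS):
    """Return the region boundaries that would pre-split a table into one region per salt."""
    return ["{:02d}".format(i) for i in range(1, buckets)]


class CachedReader:
    """Cache-aside reader: check the cache first, fall back to the store on a miss."""

    def __init__(self, store):
        self.store = store  # dict standing in for an HBase table
        self.cache = {}     # dict standing in for Redis
        self.hits = 0
        self.misses = 0

    def get(self, rowkey):
        if rowkey in self.cache:
            self.hits += 1
            return self.cache[rowkey]
        self.misses += 1
        value = self.store.get(rowkey)
        if value is not None:
            # Populate the cache only for rows that exist in the store.
            self.cache[rowkey] = value
        return value


if __name__ == "__main__":
    key = make_rowkey("S0101", "20200607083000", "T000001")
    reader = CachedReader({key: {"amount": 300}})
    reader.get(key)  # first read goes to the store
    reader.get(key)  # second read is served from the cache
    print(key, split_points(4), reader.hits, reader.misses)
```

Salting trades away efficient range scans over a single station's full history in exchange for more even write distribution; whether the thesis makes the same trade-off is not stated in the abstract.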
Keywords/Search Tags: AFC, streaming data, Spark Streaming, real-time computing, HBase