| As logistics information management evolves, the management and analysis of stream data, such as transportation vehicle monitoring, security personnel positioning, and warehouse temperature and humidity monitoring, have been integrated into the integrated logistics management system. The surge in stream data has exposed serious performance deficiencies in traditional data storage and analysis systems, so the system urgently needs a comprehensive upgrade based on big data storage and processing technology. This paper focuses on the architecture design of the overall big data storage system, the performance optimization of the big data storage and analysis components, and the research and implementation of key technologies. First, the principles and structure of the big data components used in the system design are introduced in detail. Taking into account the data sources and data structures of logistics big data, the advantages and problems of each component in different usage scenarios are analyzed, and the technical components are selected by studying data acquisition, data cleaning and decoding, and data storage, retrieval, and analysis. Then, based on the business needs of the actual scenario, a comprehensive logistics big data system solution is designed. The system consists of three modules: a data acquisition, cleaning, and decoding module; a data storage and analysis module; and a data display module. HBase is used as the distributed data warehouse, and Spark as the tool for data analysis and mining. To address two weaknesses, namely that HBase queries on non-row-key columns require full table scans and that Spark's Join operation becomes a performance bottleneck when large tables are joined, an improved method is proposed that uses ElasticSearch to build secondary indexes and a partitioned Bloom filter to optimize joins, improving the system's real-time retrieval and data processing capabilities. Finally, the key technical components of the system are implemented in a test environment, and experiments are designed to evaluate the improved method; the results demonstrate its feasibility, correctness, and effectiveness in the logistics big data scenario. |
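
The abstract does not give the details of the partitioned Bloom filter optimization, so the following is only a minimal sketch of the general idea, written in plain Python rather than Spark. It assumes one Bloom filter per hash partition of the join key, built from one table and used to discard rows of the other table before the join. The names `BloomFilter`, `partition_of`, and `bloom_filtered_join`, the filter size, and the sample records are all hypothetical and are not taken from the paper.

```python
import hashlib
from collections import defaultdict


class BloomFilter:
    """Fixed-size Bloom filter using double hashing over a SHA-256 digest."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, key):
        digest = hashlib.sha256(str(key).encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # force an odd, non-zero stride
        return [(h1 + i * h2) % self.num_bits for i in range(self.num_hashes)]

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, key):
        # True may be a false positive; False is always correct.
        return all(self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(key))


def partition_of(key, num_partitions):
    """Stable hash partitioner (hypothetical stand-in for a cluster partitioner)."""
    digest = hashlib.md5(str(key).encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % num_partitions


def bloom_filtered_join(left, right, num_partitions=4):
    """Inner-join two lists of (key, value) pairs, pre-filtering `left` per partition."""
    filters = [BloomFilter() for _ in range(num_partitions)]
    right_by_key = defaultdict(list)
    for key, value in right:
        filters[partition_of(key, num_partitions)].add(key)
        right_by_key[key].append(value)

    # Rows whose key is definitely absent from `right` are dropped before the join;
    # any false positives that slip through are eliminated by the join itself.
    survivors = [(k, v) for k, v in left if k in filters[partition_of(k, num_partitions)]]

    joined = [(k, lv, rv) for k, lv in survivors for rv in right_by_key.get(k, [])]
    return joined, len(survivors)


# Illustrative data only: vehicle positions joined against dispatched orders.
gps = [("truck-1", "08:00"), ("truck-2", "08:05"), ("truck-9", "08:10")]
orders = [("truck-1", "order-A"), ("truck-2", "order-B")]
rows, kept = bloom_filtered_join(gps, orders)
```

In a distributed setting the benefit would come from shipping the small filters instead of the large table, so that non-matching rows are discarded before any shuffle. How much is saved depends on the join selectivity and on the false-positive rate chosen for the filters.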