Design And Implementation Of Oil Big Data Platform Based On Hadoop

Posted on: 2020-08-27
Degree: Master
Type: Thesis
Country: China
Candidate: Y H Li
GTID: 2381330590983185
Subject: Computer technology
Abstract/Summary:
A province's petroleum stations operate a large number of data collection devices that produce data in many different formats. Storing this data in a unified manner requires a big data storage center. The center must have enough capacity to hold massive volumes of oil data now and in the future, and it must sustain the data input traffic of the entire province. It must also support efficient search over massive data. In addition, it needs system monitoring functions that let managers observe real-time I/O, real-time network load, real-time resource usage, and so on. For security, it needs measures to prevent data leakage. For system management, it needs supporting functions such as node addition, storage expansion, and node migration.

To address these requirements, this thesis builds a petroleum big data platform on the Hadoop distributed platform to store, manage, and query oil big data. An HDFS-based distributed storage system handles the storage of massive data and provides multiple replicas and dynamic capacity expansion. For data input, a Kafka-based distributed message queue ensures data security and efficiency. For data query, the HBase database provides efficient queries over massive data. The monitoring, management, and security functions of the storage center are provided by a CM-based management platform.

Finally, the implemented platform is tested. The system achieves an input rate of 40,000 records per second, far exceeding the actual peak traffic. It stores 8 billion records of 1 KB each on only 4 servers, so big data storage is not a problem. In the query test, experiments are run at data volumes of 30 and 8 billion records; the results show that queries are efficient and that latency stays within an acceptable range. In addition, the monitoring, management, and security functions are tested and all operate normally.
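
The reported figures can be put in perspective with a back-of-the-envelope calculation. The sketch below is only an illustration: it assumes decimal units (1 KB = 1000 bytes), a replication factor of 3 (the HDFS default; the abstract mentions multiple copies but does not state the factor), and it ignores HBase storage overhead and compression. The function names are hypothetical and do not come from the thesis.

```python
# Rough sizing for the figures quoted in the abstract.
# Assumptions (not from the thesis): 1 KB = 1000 bytes, replication = 3,
# no compression, no HBase/HDFS metadata overhead.

KB = 1000  # bytes per kilobyte (decimal assumption)


def raw_bytes(records: int, record_size: int = KB) -> int:
    """Logical data volume before replication."""
    return records * record_size


def stored_bytes(records: int, record_size: int = KB, replication: int = 3) -> int:
    """Physical volume on disk once every block is replicated."""
    return raw_bytes(records, record_size) * replication


def per_node_bytes(records: int, nodes: int, record_size: int = KB,
                   replication: int = 3) -> float:
    """Average physical volume per server, assuming an even spread."""
    return stored_bytes(records, record_size, replication) / nodes


def ingest_bytes_per_second(records_per_second: int, record_size: int = KB) -> int:
    """Sustained input bandwidth implied by a record rate."""
    return records_per_second * record_size


def load_time_seconds(records: int, records_per_second: int) -> float:
    """Time needed to ingest a data set at a constant record rate."""
    return records / records_per_second


if __name__ == "__main__":
    records = 8_000_000_000   # 8 billion records (from the abstract)
    rate = 40_000             # records per second (from the abstract)
    nodes = 4                 # servers (from the abstract)

    print(f"raw volume:      {raw_bytes(records) / 1e12:.1f} TB")
    print(f"with 3 replicas: {stored_bytes(records) / 1e12:.1f} TB")
    print(f"per server:      {per_node_bytes(records, nodes) / 1e12:.1f} TB")
    print(f"ingest rate:     {ingest_bytes_per_second(rate) / 1e6:.1f} MB/s")
    print(f"full load time:  {load_time_seconds(records, rate) / 3600:.1f} h")
```

Under these assumptions, 8 billion 1 KB records amount to roughly 8 TB of raw data, and 40,000 records per second corresponds to about 40 MB/s of input traffic. The real on-disk footprint depends on the actual replication factor, compression, and HBase overhead, none of which the abstract specifies.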
Keywords: Hadoop, Distributed Systems, Oil, Big Data