Research On Performance Optimization Technology Of Namenode Based On HDFS

Posted on:2016-05-10

Degree:Master

Type:Thesis

Country:China

Candidate:M N Li

Full Text:PDF

GTID:2298330467489643

Subject:Computer technology

Abstract/Summary:

This is the era of big data. Faced with large amounts of growing and increasingly diverse data, traditional data storage technology can no longer meet the demands of big data storage. The advent of the Hadoop Distributed File System (HDFS) has solved the problem of big data storage. However, because HDFS uses a one-master, multi-slave architecture, its NameNode is a single point of failure. In addition, storing massive numbers of small files seriously degrades the storage performance of the NameNode and creates a memory bottleneck on it. Research on NameNode performance optimization therefore has considerable exploratory value and practical significance for solving big data processing and storage problems.

This thesis analyzes NameNode performance optimization in depth. To address the NameNode single point of failure, it adopts the MN-BH distributed file system structure and further optimizes the original cloud storage platform: if the NameNode server goes down, a standby NameNode server can be started promptly to keep the Hadoop cluster in normal service. To improve NameNode storage performance and relieve the single-point memory bottleneck, it proposes a small-file storage optimization algorithm based on HSFM. In the processing layer, uploaded files are processed so that a large number of small files are merged into one large file, which is then stored persistently on the DataNodes. This resolves the single-point memory bottleneck caused by small files. The algorithm effectively reduces the memory burden on the NameNode server and greatly improves the read and write performance of the NameNode.

Following this analysis, the thesis presents the detailed design and implementation. Finally, the optimized Hadoop distributed file system is tested. A failure is simulated on the master server and service is switched to the standby NameNode server; no files in HDFS are lost, the whole Hadoop server cluster continues to work accurately and reliably, and the test achieves the expected effect. To evaluate the optimized NameNode performance, three sets of experiments are designed: a NameNode memory footprint test, a small-file storage performance test, and a small-file read performance test. The experimental results show that the optimized design greatly reduces the NameNode memory footprint and that read and write speeds are three times faster than before. Analysis of the experimental data shows that the results achieve the desired effect.
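
The small-file merging idea in the abstract can be illustrated with a minimal sketch. The abstract does not say how HSFM lays out or indexes merged files, so the scheme below (small files concatenated into one container, with a name-to-offset index) is only an assumed, generic illustration and not the thesis's algorithm. The names `MergedFile` and `merge_small_files` are hypothetical, and no Hadoop API is used.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass
class MergedFile:
    """One large container holding many small files back to back (assumed layout)."""
    data: bytearray = field(default_factory=bytearray)
    # Hypothetical index: file name -> (offset, length) inside `data`.
    index: Dict[str, Tuple[int, int]] = field(default_factory=dict)

    def append(self, name: str, content: bytes) -> None:
        """Append one small file and record where it starts and how long it is."""
        if name in self.index:
            raise KeyError(f"duplicate file name: {name}")
        self.index[name] = (len(self.data), len(content))
        self.data.extend(content)

    def read(self, name: str) -> bytes:
        """Return one small file by slicing the container at its recorded offset."""
        offset, length = self.index[name]
        return bytes(self.data[offset:offset + length])


def merge_small_files(files: Dict[str, bytes]) -> MergedFile:
    """Merge many small files into a single container before it is stored."""
    merged = MergedFile()
    for name, content in files.items():
        merged.append(name, content)
    return merged


# Example: 1000 small files become one large object plus an index.
small_files = {f"log-{i}.txt": f"record {i}\n".encode() for i in range(1000)}
merged = merge_small_files(small_files)
```

Under this assumption, the file system would track one large file instead of 1000 small ones, and a lookup such as `merged.read("log-7.txt")` resolves a small file through the index rather than through a separate metadata entry per file.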

Keywords/Search Tags:

HDFS, NameNode, Small files, Distributed file system

Related items

1	The Research Of HDFS Optimization Towards Lots Of Small Files Accessing And Storage
2	Research And Optimization Of Storage Mechanism In Hadoop Distributed File System
3	The Research Of Increase The IO Speed Of Small Files In HDFS
4	Research And Implementation Of Mass Small File Storage System Based On HDFS
5	The Design And Implementation Of Massive Small Files Storage System Based On HDFS
6	A Strategy To Deal With Massive Small Files In Hadoop Distributed File Systems
7	Research And Implementation Of Small File Optimization Storage Management System Based On HDFS
8	Research And Implementation Of Small File Storage Model Based On HDFS
9	Optimization And Implementation Of Small File Storage In HDFS Under Hadoop Platform
10	Processing Of Small Files Based On HDFS And Optimization And Improvement Of The Performance For Mapreduce Computing Model