
Design And Implementation Of Big Data Cloud Storage Platform Based On Hadoop And SSM

Posted on: 2019-12-21
Degree: Master
Type: Thesis
Country: China
Candidate: S L Yuan
Full Text: PDF
GTID: 2428330566483400
Subject: Control Science and Engineering
Abstract/Summary:
Cloud computing has gradually become a focus of attention worldwide in recent years. When the core of computation and processing in a cloud computing system is the storage of massive data, cloud computing evolves into cloud storage, which, with the rapid development of cloud computing, has become one of the most active research fields. As a new kind of service, cloud storage keeps user data on cloud servers: users only need to log in to the storage service over the network to view and add their own files anytime, anywhere, without having to worry about data loss.

Hadoop is an open-source distributed computing platform developed by Apache. It performs excellently in distributed computing and data storage and has attracted strong attention from well-known IT companies at home and abroad; large companies and research institutions have invested heavily in it, so Hadoop is used ever more widely in cloud computing and cloud storage. Hadoop includes the HDFS distributed file system. HDFS has strong data storage capability and is especially suitable as a cloud storage cluster, but it also has design defects and performance deficiencies that must be remedied before it can be deployed on a large scale.

This paper studies a cloud storage model based on HDFS and improves an HDFS-based big data cloud storage platform in terms of data storage, security, and concurrency. On this basis, a highly available big data cloud storage platform is built with HDFS and the currently popular SSM (Spring + Spring MVC + MyBatis) server-side development framework.

The platform is divided into four parts: the client, the transport layer, the request processing system, and the cloud storage cluster. The client is the tool through which the user operates the platform directly; the transport layer provides a secure, encrypted channel for transmitting files; the request processing system is the back end that receives user requests and operates on HDFS; and the cloud storage cluster is the physical medium that actually stores the files, provides mass data storage, and interfaces with the request processing system.

The main work and features of this paper on the big data cloud storage platform are as follows.

First, the cloud storage cluster is built with Hadoop, and backup metadata nodes are added to form a federated structure. HDFS stores its metadata on the namenode, and a standard deployment has only a single namenode, so the performance, storage capacity, and reliability of the whole file system are limited by that one node; if the namenode goes down, the HDFS distributed file system stops working altogether. We therefore improve the namenode of HDFS by adding a backup_namenode node to raise the reliability of HDFS.

Second, a file-system filter-driver encryption mechanism is added on the client. Files stored on HDFS are merely split by a fixed algorithm into blocks of a specified size; in other words, HDFS stores file contents in plaintext, so if HDFS were compromised by attackers and user data leaked, the consequences would be severe. We therefore add an encryption layer on top of the original HDFS so that files are encrypted before they are stored, improving the security of HDFS; a conceptual sketch follows.
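The thesis realizes this layer with a file-system filter driver; purely as an illustration of the underlying idea (encrypt on the client so that HDFS only ever holds ciphertext), the following is a minimal Java sketch using the standard javax.crypto API. The class and method names and the AES/CBC choice are assumptions made for the example, not the platform's actual scheme.

```java
import javax.crypto.Cipher;
import javax.crypto.CipherOutputStream;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.InputStream;
import java.io.OutputStream;

public class ClientSideEncryption {
    // Hypothetical helper: encrypt a local file before upload, so only
    // ciphertext ever reaches the HDFS cluster. key must be 16/24/32 bytes,
    // iv must be 16 bytes.
    public static void encryptForUpload(String plainPath, String cipherPath,
                                        byte[] key, byte[] iv) throws Exception {
        Cipher cipher = Cipher.getInstance("AES/CBC/PKCS5Padding");
        cipher.init(Cipher.ENCRYPT_MODE,
                    new SecretKeySpec(key, "AES"),
                    new IvParameterSpec(iv));
        try (InputStream in = new FileInputStream(plainPath);
             OutputStream out = new CipherOutputStream(
                     new FileOutputStream(cipherPath), cipher)) {
            byte[] buf = new byte[8192];
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n); // bytes are encrypted as they pass through
            }
        }
    }
}
```

Whether the encryption happens in a kernel filter driver (as in the thesis) or in user code as above, the design point is the same: the HDFS datanodes never see plaintext, so a breach of the cluster alone does not expose user files.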
Third, non-blocking IO transmission based on the Netty framework is used on both the client and the request processing system. Unlike traditional cloud storage systems, the platform transmits files in the non-blocking IO mode supported by Netty, which outperforms blocking IO and saves system threads. Concurrency is one of the central problems a cloud storage system must consider: when a large number of users issue requests at the same time, poor concurrency severely restricts the performance and market prospects of the system. This paper therefore uses non-blocking IO to raise the concurrency of the big data cloud storage platform; a minimal bootstrap is sketched at the end of this abstract.

Fourth, the transport layer transfers files over the HTTPS secure network transmission protocol. HTTPS is among the most widely used and most secure transport protocols in the IT industry, so, building on the encryption of the second point, we further use HTTPS to strengthen the security of the big data cloud storage platform.

Fifth, the request processing system is built with the SSM + Netty + Shiro frameworks. SSM allows the system to be constructed quickly and eliminates much tedious boilerplate, which meets the requirement of processing client requests. The Shiro framework is used for user authorization: the platform supports user-level privileges and provides different users with different levels of file security. Combined with the non-blocking IO of the third point, the request processing system also meets the high-concurrency requirement.

In the last part of this paper, extensive experiments are carried out to compare the original HDFS cloud storage system with the improved scheme. The results show that the improved scheme is more effective and brings out the performance of HDFS more fully. The cloud storage cluster built on the improved Hadoop is used to develop a Web application, and the cloud storage platform is simulated in B/S (browser/server) mode to realize the related functions of cloud storage.
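To make the non-blocking IO argument of the third point concrete, here is a minimal Netty 4.x server bootstrap; the port number and the echo handler are placeholders for the platform's real upload/download protocol, which the abstract does not specify.

```java
import io.netty.bootstrap.ServerBootstrap;
import io.netty.channel.*;
import io.netty.channel.nio.NioEventLoopGroup;
import io.netty.channel.socket.SocketChannel;
import io.netty.channel.socket.nio.NioServerSocketChannel;

public class NonBlockingFileServer {
    public static void main(String[] args) throws InterruptedException {
        // A small pool of event-loop threads multiplexes all connections,
        // instead of dedicating one blocked thread to each client.
        EventLoopGroup boss = new NioEventLoopGroup(1);   // accepts connections
        EventLoopGroup worker = new NioEventLoopGroup();  // handles read/write events
        try {
            ServerBootstrap b = new ServerBootstrap()
                    .group(boss, worker)
                    .channel(NioServerSocketChannel.class)
                    .childHandler(new ChannelInitializer<SocketChannel>() {
                        @Override
                        protected void initChannel(SocketChannel ch) {
                            // Placeholder handler: the real platform would decode
                            // file requests here and forward them to HDFS.
                            ch.pipeline().addLast(new ChannelInboundHandlerAdapter() {
                                @Override
                                public void channelRead(ChannelHandlerContext ctx, Object msg) {
                                    ctx.writeAndFlush(msg); // echo; no thread blocks on IO
                                }
                            });
                        }
                    });
            ChannelFuture f = b.bind(9000).sync();
            f.channel().closeFuture().sync();
        } finally {
            boss.shutdownGracefully();
            worker.shutdownGracefully();
        }
    }
}
```

Under blocking IO, each concurrent transfer pins one thread for its whole duration; with the event-loop model above, many mostly idle connections share a handful of threads, which is the concurrency gain the abstract claims over traditional designs.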
Keywords/Search Tags: cloud storage, Hadoop, distributed file system, SSM, non-blocking IO, Shiro