
Design And Implementation Of Baidu Duoku Mobile Game Data Platform Based On HDFS

Posted on: 2019-02-09
Degree: Master
Type: Thesis
Country: China
Candidate: S L Xu
Full Text: PDF
GTID: 2428330545472159
Subject: Software engineering
Abstract/Summary:
Duoku Game has been a department of Baidu. Because the previous business systems could no longer satisfy growing demand, the leadership of Duoku Game decided to reorganize the company's business and rebuild a clearer, more succinct job scheduling platform and data display platform. This thesis mainly implements the ETL pipeline and the functions that display current and historical business data from the various channels of Baidu's game department.

With the expansion of Baidu's gaming business, the existing clusters and technologies are no longer enough to handle the growing, terabyte-scale volume of data, especially after the business became independent, and the company's business lines also need to be more independent of one another. As the number of games across channels increases, the business develops, external cooperation grows, and the data becomes both larger and more finely partitioned, the requirements on data processing speed and data access keep rising. We therefore plan to build a new data platform with new technology, and we hope the system can meet the needs of the operators of games in every channel as well as a variety of data analysis requirements.

For data processing, storage, and display, we use Hadoop and Spark, which are among the most popular technologies. The final reports are stored in MySQL. We use PHP and the CI framework to build the front end and back end so that the platform can be developed quickly. To represent this part, this thesis selects two modules: Today Overview and Real-Time Overview. In the module for Real-Time Searching and Visual Operation of HDFS and Hive, we use the Thrift framework so that the front-end and back-end implementation does not depend directly on SparkSQL. All of the services this module needs are defined in the Thrift server, so the back end only needs to call those services. For this module, we use Tomcat to build the front end and back end.

Because computing resources are finite, the ETL flow must be optimized to avoid blocking the clusters. We want the amount of computation to be distributed evenly across all working units between 1 o'clock and 9 o'clock, so we use a genetic algorithm to schedule it. The best flow is recalculated each day whenever the ETL flow differs from the previous day's.

The platform is now in use. It provides basic data for all games and analysis across many dimensions to operators and leaders, and it has been well received. The real-time searching module greatly improves the efficiency of data development, and the ETL optimization module improves cluster utilization.
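The load-balancing idea in the abstract (spreading ETL computation evenly over the working units between 1 o'clock and 9 o'clock with a genetic algorithm) can be illustrated with a minimal sketch. This is not the thesis's implementation: the job costs, the eight hourly slots, the variance-based fitness, and all function names and parameters below are assumptions made for illustration, and dependencies between ETL jobs are ignored.

```python
import random


def slot_loads(assignment, costs, n_slots):
    """Total cost placed in each slot; assignment[i] is the slot of job i."""
    loads = [0.0] * n_slots
    for job, slot in enumerate(assignment):
        loads[slot] += costs[job]
    return loads


def imbalance(assignment, costs, n_slots):
    """Variance of the slot loads; 0 means a perfectly even distribution."""
    loads = slot_loads(assignment, costs, n_slots)
    mean = sum(loads) / n_slots
    return sum((x - mean) ** 2 for x in loads) / n_slots


def evolve_schedule(costs, n_slots=8, pop_size=60, generations=200,
                    mutation_rate=0.05, seed=0):
    """Search for a job-to-slot assignment with low load variance.

    Hypothetical setup: costs[i] is an estimated cost of ETL job i, and the
    n_slots slots stand for the hourly windows between 1:00 and 9:00.
    Requires at least two jobs.
    """
    rng = random.Random(seed)  # seeded so that runs are reproducible
    n_jobs = len(costs)
    population = [[rng.randrange(n_slots) for _ in range(n_jobs)]
                  for _ in range(pop_size)]

    def score(individual):
        return imbalance(individual, costs, n_slots)

    def tournament():
        # Binary tournament: the better of two random individuals wins.
        a, b = rng.sample(population, 2)
        return a if score(a) <= score(b) else b

    for _ in range(generations):
        population.sort(key=score)
        # Elitism: carry the two best schedules over unchanged.
        next_gen = [population[0][:], population[1][:]]
        while len(next_gen) < pop_size:
            p1, p2 = tournament(), tournament()
            cut = rng.randrange(1, n_jobs)  # single-point crossover
            child = p1[:cut] + p2[cut:]
            # Mutation: occasionally move a job to a random slot.
            child = [rng.randrange(n_slots) if rng.random() < mutation_rate
                     else gene for gene in child]
            next_gen.append(child)
        population = next_gen

    return min(population, key=score)
```

Calling `evolve_schedule([5, 3, 8, 2, 7, 4, 6, 1])` returns one slot index per job, and `slot_loads` on that result shows how evenly the cost is spread. A production scheduler would also need to respect job dependencies and per-slot resource limits, which this sketch leaves out.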
Keywords/Search Tags: Mobile Game, Data Platform, HDFS, ETL Improvement