| With the growing popularity of the Internet and the rapid development of information technology, the volume of data generated online is exploding, and the storage and processing of massive information has become a problem that every enterprise must address. At present, Hadoop is a mainstream open-source distributed platform for big data. Its distributed file system (HDFS) can store data at the PB level, and the MapReduce programming model can handle large-scale offline data processing. However, the MapReduce framework reads from and writes to disk multiple times, which delays jobs and increases cost, and it is not well suited to DAG or iterative operations. Moreover, processing the data streams generated by online systems requires low-latency, highly reliable technology, and the MapReduce programming model cannot meet the needs of real-time monitoring and statistical analysis. At the same time, new business models based on big data will continue to emerge, and Hadoop-based applications will expand from the Internet to finance, biopharmaceuticals, e-commerce, and other fields. Big data platforms therefore need to reach a growing number of users who have little knowledge of Hadoop. However, most distributed big data systems are designed for specialist data analysts, so they only support querying HDFS or a database directly and have no data visualization module. Users with little experience in data analysis may find such platforms difficult to use and the results hard to understand from reading a file alone. In view of these problems, this thesis designs and implements a data processing and data visualization platform based on the Eole system. The platform uses a browser/server (B/S) architecture. The data processing module, built on the Hadoop platform, adopts the Spark in-memory computing framework and Spark Streaming to achieve efficient offline and real-time computing. Using the NoSQL database HBase as the main storage provides the Eole system with a stable service under high concurrency. The system also visualizes the computation results with the ECharts charting tool. Users can complete data management, data processing, chart display, and other operations through the browser, which greatly improves their work efficiency. |