
Research On Big Data Processing Of Internet Of Vehicles Based On In-Memory Computing In Cloud Environment

Posted on: 2018-08-19
Degree: Master
Type: Thesis
Country: China
Candidate: W J Wang
Full Text: PDF
GTID: 2382330596968733
Subject: Computer Science and Technology
Abstract/Summary:
With the spread of intelligent devices and the continuous development of Internet technology, a large amount of information has poured into people's lives. How to integrate, process, and analyze these massive data quickly, and how to extract the required information from them, has become a hot topic in data research. Advances in machine learning and data mining mean that data analysis is no longer limited to the surface: through analyzing, training on, or learning from data, the implicit relationships within the data have become part of the value of the data themselves. Data processing platforms have also evolved to process data faster. The emergence of Big Data has brought disruptive technological change to enterprises, with breakthroughs in both processing speed and result accuracy when handling large-scale datasets. The economic and cultural benefits of Big Data analysis have therefore attracted many enterprises and experts.

This thesis introduces the characteristics of Internet of Vehicles data, namely its variety of types and large volume, and shows that applying Big Data platforms to the Internet of Vehicles can satisfy users' needs. We also discuss the reliability and efficiency of deploying Internet of Vehicles applications on Spark in a cloud environment. One of the core research objects of this thesis is the dataset of the LF620 electric vehicle Internet of Vehicles system, whose vehicles drive fixed routes with similar driving times. Based on this background and an electric vehicle power model, this thesis proposes a vehicle congestion level classification strategy. We implement the logistic regression algorithm and its extension on these datasets and use the resulting classification model to help predict a vehicle's remaining driving range.

This thesis focuses on the data processing of the LF620 Internet of Vehicles and the optimization of Spark. First, it analyzes the theory behind the logistic regression algorithm in the Spark MLlib library and discusses the costly storage overhead and the limited accuracy of this classification model. To reduce the overhead of applying the classification model to Internet of Vehicles data analysis, the Softmax function is introduced into the logistic regression algorithm in Spark MLlib. Experimental validation shows that the extended logistic regression algorithm achieves better accuracy.

Finally, this thesis studies the optimization of Spark in three parts. First, it studies data serialization and compares application execution time under the default serialization mode and the Kryo serialization mode. Second, it proposes an optimized cache replacement policy for Spark memory. The experimental results show that these policies effectively improve the RDD cache hit rate and memory utilization, which in turn improves the data processing speed of Spark. Third, it analyzes the job scheduler during job partitioning and submission to obtain the directed acyclic graph of tasks. Using monitoring code, it determines which RDDs need to be cached in Spark memory to save disk input/output time, which further improves the data processing speed of Spark.
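
The Softmax extension of logistic regression mentioned above can be illustrated with a minimal sketch in plain Python. It is not the thesis's Spark MLlib implementation: the function names, the three congestion levels, the two features, and all weights are hypothetical and chosen only to show how one score per class is turned into class probabilities.

```python
import math
from typing import List, Sequence


def softmax(scores: Sequence[float]) -> List[float]:
    """Convert raw class scores into probabilities that sum to 1."""
    # Subtracting the largest score avoids overflow in exp() without
    # changing the resulting probabilities.
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict_proba(weights: Sequence[Sequence[float]],
                  biases: Sequence[float],
                  features: Sequence[float]) -> List[float]:
    """Softmax (multinomial logistic) regression: one linear score per class."""
    scores = [
        sum(w * x for w, x in zip(row, features)) + b
        for row, b in zip(weights, biases)
    ]
    return softmax(scores)


def predict_class(weights: Sequence[Sequence[float]],
                  biases: Sequence[float],
                  features: Sequence[float]) -> int:
    """Return the index of the most probable class."""
    probs = predict_proba(weights, biases, features)
    return max(range(len(probs)), key=probs.__getitem__)


if __name__ == "__main__":
    # Hypothetical model: 3 congestion levels, 2 made-up normalized features.
    weights = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
    biases = [0.0, 0.0, 0.0]
    sample = [0.2, 0.9]
    print(predict_proba(weights, biases, sample))
    print(predict_class(weights, biases, sample))
```

A single Softmax model scores all classes at once, whereas a binary logistic regression distinguishes only two classes, so a multi-level congestion label would otherwise need several binary models.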
Keywords/Search Tags:Internet of Vehicles, Spark, In-Memory Computing, Big Data, Logistic Regression