| Based on mobile phone data, this thesis studies data cleaning methods, then analyzes urban residents' mobility patterns, the job-housing relationship, and changes in the flow of people. We also design and implement a mobile phone data analysis engine. The main content of this thesis is as follows. First, this thesis analyzes the characteristics of mobile phone data and studies methods for cleaning it. For the invalid data, ping-pong data, and drift data present in mobile phone data, we analyze their causes and design corresponding cleaning algorithms. The Clustering-based and LOF Outlier Detection Method (CLOF), which combines outlier detection with the K-means algorithm, is adopted to clean noisy data in order to improve the accuracy of stay-point detection. Second, based on the mobile phone data, this thesis analyzes residents' mobility patterns, the job-housing relationship, and changes in the flow of people. For residents' mobility patterns, this thesis uses a DBSCAN-based spatio-temporal stay-point extraction algorithm to extract stay points, and then mines residents' travel pattern information. For the city's job-housing relationship, this thesis detects the home and work areas of mobile phone users, obtains the job-housing distribution, and uses mobile phone data from Wuhu City to analyze the relationship between the four cities and four districts of Wuhu City and the commuting distance. For changes in the flow of people, this thesis counts the flow of people at the block level and analyzes the flow at hot spots, again using mobile phone data from Wuhu City. Finally, this thesis designs and implements a mobile phone data analysis engine. Based on a requirements analysis, the engine framework is presented and the engine is divided into six levels. Then, based on the MapReduce programming model provided by Hadoop and Spark, the distributed data mining algorithms are implemented. This thesis designs the APIs of each module, uses the Flask framework to build the server, and exposes these APIs. In a test environment comprising the hardware and software platform, functional tests and speed tests are performed on the engine to verify that it meets its functional and performance requirements. |
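
To make the notion of ping-pong data concrete, the following is a minimal Python sketch of one common heuristic, not the cleaning algorithm designed in the thesis. It assumes that a ping-pong event is a short A -> B -> A switch between cells, and that the records belong to a single user. The `Record` type, the `remove_ping_pong` function, and the 60-second `max_gap_s` threshold are all hypothetical names and values chosen for illustration.

```python
from typing import List, NamedTuple


class Record(NamedTuple):
    """One signalling record of a single user (hypothetical schema)."""
    timestamp: float  # seconds since some reference time
    cell_id: str      # identifier of the serving base-station cell


def remove_ping_pong(records: List[Record], max_gap_s: float = 60.0) -> List[Record]:
    """Drop the middle record of every A -> B -> A switch that completes
    within max_gap_s seconds. The threshold is an assumed value."""
    cleaned: List[Record] = []
    for rec in sorted(records, key=lambda r: r.timestamp):
        if (
            len(cleaned) >= 2
            and cleaned[-2].cell_id == rec.cell_id       # back at cell A
            and cleaned[-1].cell_id != rec.cell_id       # after a visit to cell B
            and rec.timestamp - cleaned[-2].timestamp <= max_gap_s
        ):
            cleaned.pop()  # discard the intermediate record at cell B
        cleaned.append(rec)
    return cleaned


sample = [
    Record(0.0, "A"),
    Record(10.0, "B"),   # brief hand-over to B, then back to A
    Record(20.0, "A"),
    Record(300.0, "C"),  # a genuine move to another cell
]
cleaned = remove_ping_pong(sample)
print([r.cell_id for r in cleaned])  # ['A', 'A', 'C']
```

A switch that takes longer than the threshold is kept, since it may reflect real movement. In practice the threshold would have to be tuned against the sampling rate of the data.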