Crime is occurring around the world and increasing year by year, and reducing it is a challenge for every country. According to the Global Terrorism Database developed by the University of Maryland, crimes have become markedly more numerous and more frequent over the past twenty years. The lower figures for earlier periods may reflect either a lack of technology for recording crimes or a genuinely lower level of crime than in the last twenty years. Either way, reducing crime as far as possible is now a critical and pressing issue for human society, both to improve citizens' quality of life and to create a peaceful, sustainable, and enjoyable environment. With the development of technology, ever larger volumes of data, including crime data, are being collected in digital form. This makes it possible to discover criminal patterns and to analyze the characteristics of criminal behaviors. Identifying the latent patterns and relationships in crime data can support the effective prevention and efficient control of various types of criminal behavior. With the evolution of computer hardware, software, and information technology, criminal behaviors are recorded in a timely manner, which provides rich data resources for criminal behavior analysis and research. Given large-scale criminal behavior data, how to use Machine Learning (ML) and Artificial Intelligence (AI) techniques to identify and predict crime patterns effectively is one of the current research hotspots. At the same time, how to use High-Performance Computing (HPC) technologies such as distributed computing and cloud computing to efficiently process complex ML/AI algorithms and large-scale spatio-temporal data is another important research issue.

This dissertation focuses on the application of ML algorithms to criminal behavior analysis and on their performance optimization. We use ML algorithms and parallel computing in analyzing and predicting
various types of criminal behaviors and models, with the aim of promoting a peaceful and safe human society. We propose a criminal activity clustering algorithm, a crime hotspot locating algorithm, a crime rate evaluation algorithm, a crime pattern discovery system, and a crime pattern decision support system. We use the Apache Spark cloud computing platform to design parallelization solutions for the proposed algorithms and systems, which effectively improves their performance. The main work and innovations of this dissertation are as follows:

(1) We present an algorithm for clustering criminal activities based on fuzzy clustering. We analyze large-scale historical criminal behavior data sets in spatio-temporal format and propose an optimized fuzzy clustering algorithm, which we call the fuzzy clustering-based Criminal Activity Clustering (CAC) algorithm. It analyzes criminal activities in different periods across various countries/regions and constructs a time series-based criminal behavior model. To improve the performance of the CAC algorithm, we develop a parallelization solution on the Apache Spark cloud computing platform, in which the computing tasks of the CAC algorithm are decomposed and scheduled in parallel. Experimental results show that the CAC algorithm and its parallel version can efficiently cluster criminal behaviors on large-scale spatio-temporal data sets and obtain accurate clustering results.

(2) We propose a high-efficiency Crime Pattern Discovery (CPD) system using cloud computing. Building on the CAC algorithm, we further propose a Crime Rate Evaluation (CRE) algorithm and a Criminal Hotspot Locating (CHL) algorithm. In the CRE algorithm, we use statistical theory to analyze the clustering results of each criminal cluster and evaluate the crime rates of various countries/regions, criminal behavior categories, and target types. In the CHL algorithm, based on the time
series data prediction techniques, we predict the potential high-risk areas for different types of crime in different periods across numerous countries and regions, and provide corresponding prevention suggestions. To improve the performance of the CPD system, we design a parallelization scheme for the system and deploy it on the Apache Spark platform. Experimental results show that the CPD system can efficiently analyze large-scale criminal activity data and accurately identify the crime rate and hotspots of each type of criminal activity, providing a scientific basis for crime prevention.

(3) We study the mining of criminal patterns and its application to web-based decision support for travelers. We design and build a Decision Support System for Travelers based on Crime Pattern Discovery (CPD-DSST), so that travelers can understand the crime situation in a specific area and obtain timely advice to ensure travel safety. To discover and locate criminal behaviors, we propose a Crime Classifying, Discovering and Locating (CCDL) algorithm based on multinomial logistic regression and the CRE algorithm, using spatio-temporal crime data. It can effectively locate and classify criminal behaviors and provide suggestions that help travelers make decisions. Experiments show that the system can discover criminal behaviors for a specified location and time, and preliminary feedback from the trial operation of the system shows that users appreciate its functions.

(4) We study the application of machine learning and neural network algorithms to fingerprint acquisition, detection, classification, recognition, and verification. We provide an up-to-date literature review of fingerprint classification algorithms and their applications in criminal investigation. We analyze the related machine learning and neural network algorithms from the aspects of fingerprint classification, fingerprint matching, feature extraction, fingerprint and finger
vein recognition, and fraud detection. We also discuss the challenges in current fingerprint analysis methods and applications, as well as future development directions.

The work of this dissertation has both theoretical value and practical significance for crime investigation, crime prevention, and the discovery of crime patterns. It makes full use of the Apache Spark cloud computing platform for distributed and parallel computing to improve the performance of scalable parallel ML algorithms. In addition, this work explores the application of these algorithms to criminal investigation and fingerprint recognition, and lays a solid foundation for extending them to other fields.
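
To make the fuzzy clustering idea behind the CAC algorithm in contribution (1) more concrete, the following is a minimal sketch of standard fuzzy c-means in plain Python. It is not the optimized CAC algorithm or its Apache Spark parallelization, whose details are not given here. The function name `fuzzy_c_means`, its parameters, and the toy points are assumptions introduced purely for illustration.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def fuzzy_c_means(points, n_clusters, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Standard fuzzy c-means (illustrative sketch, not the CAC algorithm).

    points     -- list of equal-length numeric tuples (hypothetical features)
    n_clusters -- number of fuzzy clusters
    m          -- fuzzifier (> 1); larger values give softer memberships
    Returns (centers, memberships); memberships[i][j] is the degree to
    which point i belongs to cluster j.
    """
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])

    # Random initial memberships; each row is normalised to sum to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(n_clusters)]
        total = sum(row)
        u.append([v / total for v in row])

    centers = []
    for _ in range(n_iter):
        # Center update: mean of the points weighted by u_ij ** m.
        centers = []
        for j in range(n_clusters):
            w = [u[i][j] ** m for i in range(n)]
            w_sum = sum(w)
            centers.append(tuple(
                sum(w[i] * points[i][d] for i in range(n)) / w_sum
                for d in range(dim)
            ))

        # Membership update: u_ij is proportional to D_ij ** (-1 / (m - 1)),
        # where D_ij is the squared distance from point i to center j.
        max_change = 0.0
        for i in range(n):
            dists = [max(sq_dist(points[i], c), 1e-12) for c in centers]
            inv = [d ** (-1.0 / (m - 1.0)) for d in dists]
            total = sum(inv)
            new_row = [v / total for v in inv]
            change = max(abs(a - b) for a, b in zip(new_row, u[i]))
            max_change = max(max_change, change)
            u[i] = new_row

        # Stop once the memberships have stabilised.
        if max_change < tol:
            break

    return centers, u


if __name__ == "__main__":
    # Toy 2-D points standing in for spatio-temporal crime features.
    toy = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
           (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
    centers, memberships = fuzzy_c_means(toy, n_clusters=2)
    for point, row in zip(toy, memberships):
        print(point, [round(v, 3) for v in row])
```

Unlike hard clustering, every point receives a membership degree for every cluster, and each row of memberships sums to one. In the center update, each cluster's weighted sum is independent of the others, and in the membership update each point's row is computed independently of the other points, which is the kind of work that could be decomposed into parallel tasks on a platform such as Apache Spark.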