
The Research On Full-Text Retrieval Of Big Data In Public Security Based On Elasticsearch

Posted on: 2019-10-30
Degree: Master
Type: Thesis
Country: China
Candidate: L J Zhu
Full Text: PDF
GTID: 2416330542455545
Subject: Communication and Information System
Abstract/Summary:
There is no doubt that the era of big data has arrived. "The core competition in the twenty-first century is the competition of data," said Jack Ma, the founder of the Alibaba Group. Like the agricultural and industrial revolutions, the data revolution has been subtly reshaping the way people live, study and work, and has become a continuous driving force for social development. The big data wave has swept through small and medium-sized technology companies and is already being put into practice. However, owing to insensitivity to new technologies and a lack of hands-on experience with the core technologies, government departments at all levels in China have not yet fully realized the great convenience that big data brings to government affairs. At this key juncture of technological innovation, the connotation and core technologies of big data should be studied comprehensively and in depth. Building a data-driven public safety society on the good momentum of the current public security informatization effort is of great practical significance.

This thesis first analyzes the objective development needs and existing problems of public security business in the era of big data, and puts forward targeted solutions and countermeasures based on the related technologies and the functional requirements of the system. The specific functions of the system are then designed and implemented against the corresponding requirements. The contents of the thesis are as follows:

1. Research on the key algorithms of full-text retrieval, with improvements proposed for them. After analyzing the key algorithms of conventional Chinese word segmentation, the thesis proposes an improved scheme that uses the DT tokenizer, based on a double-array trie and the edge n-gram algorithm, in this system. After analyzing the inverted index algorithm, it puts forward improvement measures using the multiplication algorithm. After analyzing the sorting and scoring algorithms, it proposes an improved sorting algorithm based on learning to rank (LTR).

2. Design of a new full-text retrieval system. The system is designed using a variety of distributed technologies together with the improved word segmentation, inverted index and sorting algorithms. It includes a data storage module built on Hadoop and HBase, a data retrieval module that combines the DT tokenizer, LTR and Elasticsearch, and a data synchronization module that ensures data consistency.

3. Application of the new full-text retrieval system to massive public security business data. The efficient retrieval algorithms are applied to a practical scenario, the implementation of each module is described in detail, and the performance testing and result analysis of the system are completed.

The experiments show that the system can handle the storage and retrieval of massive public security data in a big data environment. The system offers high availability, extensibility, and efficient data reads, writes and queries.
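To make two of the terms above concrete, the following is a minimal Python sketch of edge n-gram generation feeding a small in-memory inverted index. It is not the thesis's DT tokenizer or its improved index: the function names (`edge_ngrams`, `build_index`, `prefix_search`), the whitespace tokenization, the `max_len` of 5 and the sample documents are all assumptions chosen for illustration, and real Chinese text would first need a word segmenter.

```python
from collections import defaultdict


def edge_ngrams(token, min_len=1, max_len=5):
    """Return the prefixes of `token` whose lengths run from min_len to max_len."""
    upper = min(max_len, len(token))
    return [token[:n] for n in range(min_len, upper + 1)]


def build_index(docs):
    """Map each edge n-gram to the sorted ids of the documents that contain it."""
    postings = defaultdict(set)
    for doc_id, text in docs.items():
        # Assumption: whitespace tokenization stands in for real word segmentation.
        for token in text.lower().split():
            for gram in edge_ngrams(token):
                postings[gram].add(doc_id)
    return {gram: sorted(ids) for gram, ids in postings.items()}


def prefix_search(index, prefix):
    """Look up a prefix directly in the postings; no scan over documents is needed."""
    return index.get(prefix.lower(), [])


# Hypothetical sample records, not data from the thesis.
docs = {
    1: "vehicle registration record",
    2: "vehicle theft report",
    3: "resident registration form",
}
index = build_index(docs)
print(prefix_search(index, "veh"))
print(prefix_search(index, "regis"))
```

Because prefixes are indexed ahead of time, a prefix query becomes a single dictionary lookup at the cost of a larger index. In this sketch a query longer than `max_len` finds nothing, so a caller would have to truncate it or raise `max_len`.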
Keywords/Search Tags: Big Data, Public Security, Information Retrieval, Elasticsearch, Learning to Rank