| In the Internet era, the volume of data is growing explosively, bringing both opportunities and challenges. People constantly mine useful information from big data, yet they are also overwhelmed by redundant information. Search engines, the tools that help people find information among massive numbers of web pages, are under pressure from this growth: as data accumulates, indexes become larger and search tasks become harder. In fact, the vast majority of the information crawled by search engines is irrelevant to what users need. Enabling search engines to analyze users' search needs would provide a better search experience and save unnecessary computation, so the search needs of search engine users have attracted the attention of scholars at home and abroad. Predicting search needs requires identifying the user's search terms, which usually relies on web log mining. The approach of this paper is to compare search terms against historical logs, training on those logs to obtain patterns that identify users' search needs. However, search logs are now at the terabyte scale, so this training is difficult to carry out on a single computing node. Based on the characteristics of big data analysis, this paper presents a distributed parallel program called Paratemp. Using MapReduce on a distributed cluster, we mine representative classification templates. Using the support and confidence measures from association rule learning, we study the selection criteria for templates. 
The selected templates serve as the basis for classifying search needs. After the search templates are extracted, an efficient natural language algorithm is needed to match search terms against the new templates. This paper designs the Temparser recognition algorithm, which follows the idea of a trie, trading additional space for faster computation, to recognize search templates. Experiments demonstrate the correctness and efficiency of the Paratemp program and the Temparser algorithm. Finally, we summarize the research results and discuss future work. |
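
To make the trie-based matching idea concrete, the following is a minimal Python sketch and not the paper's Temparser implementation. It assumes that a template is a whitespace-separated sequence of lowercase literal words and slot placeholders such as `<city>`, and that a small lexicon maps known words to slot names. The names `TemplateTrie`, `lexicon`, the example templates, and the class labels are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Optional


class TrieNode:
    def __init__(self) -> None:
        self.children: Dict[str, "TrieNode"] = {}
        self.label: Optional[str] = None  # set when a template ends at this node


class TemplateTrie:
    """Token-level trie over templates (hypothetical sketch, not Temparser itself)."""

    def __init__(self, lexicon: Dict[str, str]) -> None:
        self.root = TrieNode()
        # Assumed lexicon: word -> slot placeholder, e.g. "beijing" -> "<city>"
        self.lexicon = lexicon

    def add(self, template: str, label: str) -> None:
        # Templates are assumed to be lowercase already.
        node = self.root
        for token in template.split():
            node = node.children.setdefault(token, TrieNode())
        node.label = label

    def match(self, query: str) -> Optional[str]:
        # Return the label of a template that covers the whole query, or None.
        return self._walk(self.root, query.lower().split(), 0)

    def _walk(self, node: TrieNode, tokens: List[str], i: int) -> Optional[str]:
        if i == len(tokens):
            return node.label
        token = tokens[i]
        # Try the literal word first, then the slot it belongs to (if any).
        candidates = [token]
        slot = self.lexicon.get(token)
        if slot is not None:
            candidates.append(slot)
        for key in candidates:
            child = node.children.get(key)
            if child is not None:
                found = self._walk(child, tokens, i + 1)
                if found is not None:
                    return found
        return None


# Illustrative data only; none of it comes from the paper's logs.
lexicon = {"beijing": "<city>", "shanghai": "<city>"}
trie = TemplateTrie(lexicon)
trie.add("weather in <city>", "weather")
trie.add("<city> to <city> flights", "travel")

print(trie.match("Weather in Beijing"))
print(trie.match("shanghai to beijing flights"))
print(trie.match("cheap phones"))
```

Storing one node per template token costs extra memory, but a query is matched by following a single path through shared prefixes instead of being compared against every template in turn, which is the space-for-time trade-off described above.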