| The rapid development of the Internet industry is plain to see. Led by the BAT companies (Baidu, Alibaba, and Tencent), Internet firms compete to seize every segment of the market. Recent examples include the boom in O2O (online to offline: goods offline, transactions online), the WeChat red envelopes that went viral during the 2014 Chinese New Year, and Baidu's big-data migration map. The Internet is profoundly changing people's lives. Within the Internet, one part of the market is especially important: the entrance to the Internet, which accounts for more than 80% of Internet traffic. This is the search engine. To reach a website, most people choose to get there through a search, because there are far too many websites and even more web pages; Google's statistics put the number at 10,000,000,000. In short, the search engine directly shapes people's online lives, and the way it distributes Internet traffic has a great impact on the entire Internet ecosystem. This paper briefly introduces the background of search engines and explains the significance of search engine research. Its review of search engine developments at home and abroad focuses on Google and Baidu, the two most successful search engines, whose position is very important to the development of the search ecosystem. Finally, it describes the current state and shortcomings of search engine algorithms and puts forward solutions. A mainstream search engine at this stage is divided into a crawling part and a ranking part. The second chapter focuses on current crawler strategies and ranking strategies, with observations drawn from extensive personal use. Based on an analysis of these algorithms, two new ones are proposed: a parallel breadth-first search for crawling and a score-based ranking strategy. Viewed from the search algorithms alone, however, such a system is not yet a real search engine. The third chapter draws on an internship working on a company's search engine to show what a real search engine looks like: the algorithms by themselves are only a small part of it. It proposes using product strategy to improve search results, and introduces and analyzes this approach in detail. The fourth chapter experimentally verifies the two improved algorithms proposed in the second chapter. A real search engine is built on a Linux system by integrating the Nutch crawler, a Solr server (a mainstream search engine data storage server), Tomcat monitoring, Chinese word segmentation, and a front-end page with controls. Substantial changes to their code implement the parallel breadth-first algorithm and the page ranking strategy, and analysis of the experimental data verifies the improvements in crawling efficiency and ranking method. Finally, the future direction of search engines is briefly introduced, with an optimistic vision in which people enjoy better search services. Search engines still have a lot of room for development: many technologies, such as personalized search and intelligent search, have not yet been implemented at this stage. |