| As information on the internet expands rapidly, people encounter a large amount of redundant content, and browsing it wastes considerable time and energy. Automatic summarization technology was therefore proposed to identify the important parts of the information. Because multiple documents on the same topic overlap, many similar or identical fragments appear, and even after these documents are compressed and deduplicated, the problem of too much information remains. Multi-document automatic summarization emerged to address this by extracting the key information from multiple documents. However, traditional multi-document summarization methods cannot resolve information redundancy, and their summaries are less readable. This paper uses a self-built multi-document data set and proposes a deep learning method based on multi-feature fusion, Convolutional Neural Networks (CNN), and Gated Recurrent Units (GRU). The method makes the text vectors carry sufficient semantic information and improves the accuracy of the generated summaries. Specifically, this paper makes the following contributions: (1) We collected long-text news webpage data on 30 different topics from People's Daily and Sina.com. After cleaning and preprocessing the data, we constructed a multi-document automatic summarization data set, which serves as the foundation for the methods in this paper. (2) We proposed a multi-document automatic summarization method based on multi-feature fusion. First, we used HanLP to perform Chinese word segmentation on the multi-document data set, and then used Word2Vec to train a word vector model. Sentences were scored using LDA, cosine similarity, TextRank, sentence length, and sentence position. The final sentence weights were obtained dynamically by multiple linear regression, and redundancy was then eliminated using MMR and LD. Experimental results show that this method performs well in the ROUGE evaluation system and can effectively 
help users find valuable text information. (3) We proposed a multi-document automatic summarization method based on CNN and GRU. First, we built a Sequence-to-Sequence (Seq2Seq) model with an attention mechanism and trained it on 83,000 Sohu text records. The encoder uses CNN and GRU, and the decoder uses GRU. Finally, we tuned the parameters through backpropagation to improve the robustness of the model. Experimental results show that this method performs well in the ROUGE evaluation system. The method was also tested on the DUC2004 data set, and the results show that it is also effective for English multi-document summarization. (4) This paper combines traditional methods with deep learning methods. We implemented a Chinese multi-document automatic summarization system based on multi-feature fusion, CNN, and GRU (M-C-G), which presents multi-document automatic summarization to users in a more intuitive way. |
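
The multi-feature fusion step in contribution (2) can be illustrated with a minimal sketch. It is not the paper's implementation: the feature values and weights are hypothetical placeholders (in the paper the weights come from multiple linear regression), whitespace tokenization stands in for HanLP segmentation, bag-of-words cosine similarity stands in for Word2Vec-based similarity, and the LD step is omitted. The function names `fuse_scores` and `mmr_select` are illustrative only. MMR is assumed here to mean Maximal Marginal Relevance, which greedily picks sentences that score highly while penalizing similarity to sentences already chosen.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def fuse_scores(features, weights):
    """Linearly combine per-sentence feature rows into one weight per sentence.

    Assumption: `weights` would be fitted by multiple linear regression;
    here they are simply passed in.
    """
    return [sum(w * f for w, f in zip(weights, row)) for row in features]


def mmr_select(sentences, scores, k=2, lam=0.7):
    """Greedy MMR selection of k sentences.

    Each step picks the candidate maximizing
    lam * score - (1 - lam) * (max similarity to already selected sentences).
    """
    vectors = [Counter(s.lower().split()) for s in sentences]
    selected = []
    candidates = list(range(len(sentences)))
    while candidates and len(selected) < k:
        def mmr(i):
            redundancy = max(
                (cosine(vectors[i], vectors[j]) for j in selected),
                default=0.0,
            )
            return lam * scores[i] - (1 - lam) * redundancy

        best = max(candidates, key=mmr)
        selected.append(best)
        candidates.remove(best)
    return [sentences[i] for i in selected]


if __name__ == "__main__":
    sentences = [
        "the flood damaged many homes in the city",
        "many homes in the city were damaged by the flood",
        "rescue teams arrived on monday",
    ]
    # Hypothetical feature rows, e.g. [TextRank score, position score].
    features = [[0.9, 0.8], [0.85, 0.8], [0.5, 0.6]]
    weights = [0.6, 0.4]  # hypothetical, not learned
    scores = fuse_scores(features, weights)
    for sentence in mmr_select(sentences, scores, k=2, lam=0.5):
        print(sentence)
```

In this toy input the second sentence has a high fused score but nearly duplicates the first, so the redundancy penalty lets the third sentence take its place. The parameter `lam` controls that trade-off: values near 1 favor the fused score, values near 0 favor diversity.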