| The topic of a body of literature is a condensed representation of its content, and the heat of a topic reflects the attention and importance that the research topic receives in its field. With the rapid growth in the number of research papers, the continuous specialization of scientific research fields, and the growing significance of scientific research capability, it is increasingly important to perceive and grasp the research direction of hot topics in advance. However, previous hot topic recognition methods still have some shortcomings. This paper proposes a new approach that characterizes topic popularity by the sum of the topic probabilities of the documents and integrates machine learning into a traditional information science problem. The research content can be divided into two main aspects. On the one hand, this paper explores some potential factors behind hot topics. Starting from the publication data and citation data of the literature, a feature architecture is built. Its construction rests on several foundations and reasonable assumptions, for instance: (1) owing to the Matthew effect, the achievements of well-known scholars are more likely to promote the development of a topic; (2) funding reflects the emphasis placed on a topic by the state and by institutions at all levels, so research results supported by funds are more likely to be of higher value; (3) the more up to date the knowledge referenced by a paper, the more significant its research achievements; (4) the academic quality of a journal reflects the quality of its papers, and excellent papers better promote the development of a topic, so high-quality journals that disseminate research achievements also promote the development of the topic; (5) the higher the scientific strength of a paper, that is, the more reference knowledge it contains, the more knowledge it has absorbed and transformed, so the
more significant its achievements and the greater its influence on the heat of the topic; (6) the citations a paper receives also reflect the attention that practitioners in the field pay to the topic. On the other hand, this paper focuses on the construction and adjustment of the topic heat prediction model. The feature architecture serves as the input of the model. Three machine learning models are used, and predictions on the test set show that the model based on LSTM performs best among the three, with an accuracy rate within the acceptable range. Compared with a baseline method that omits the potential influencing factors, the factors proposed in this paper effectively assist prediction and significantly reduce the prediction error. The influence of the feature time window on the prediction error is then examined. The results show that a time window that is too short produces a larger error, while an excessively long one does not significantly reduce the error further; a time window of five years is a better choice. Finally, the effect of the individual features is verified. Deleting any feature increases the error of the prediction results. Among them, the features derived from authors, funds, journal impact factors, and scientific strength have a greater impact on the prediction results. Both authors and journals influence the development of a topic through the Matthew effect. Funding provides material security and official endorsement for scientific research on the topic. Scientific strength supports the development of the topic in terms of content. The specific innovations are as follows: (1) this paper constructs a more complete forecasting framework that considers both document data and citation data and synthesizes the various factors that influence the
development of a topic, thus avoiding the defects of single-factor analysis; (2) this paper puts forward a topic heat prediction method based on LDA and machine learning, applies machine learning to a forecasting problem in information science, combines the internal and external characteristics of the topic, and achieves better prediction results. This extends the methods of information science and, to some degree, adds to the diversity of methods for topic heat prediction. |
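
The heat measure described above, the sum of the topic probabilities of the documents, can be illustrated with a minimal sketch. It assumes that each document already has a topic distribution (for example, from an LDA model) and that heat is aggregated per publication year; the yearly grouping, the function name `topic_heat_by_year`, and the sample data are hypothetical and are not taken from the paper.

```python
def topic_heat_by_year(doc_topic_probs, doc_years):
    """Sum each topic's probability over the documents published in each year.

    doc_topic_probs: one list of topic probabilities per document.
    doc_years: the publication year of each document, in the same order.
    Returns a dict mapping year -> list of per-topic heat values.
    """
    if len(doc_topic_probs) != len(doc_years):
        raise ValueError("one publication year is required per document")
    heat = {}
    for probs, year in zip(doc_topic_probs, doc_years):
        # Start a zero vector the first time a year is seen.
        totals = heat.setdefault(year, [0.0] * len(probs))
        for k, p in enumerate(probs):
            totals[k] += p
    return heat


# Hypothetical document-topic distributions for two topics (each row sums to 1).
doc_topic_probs = [
    [0.75, 0.25],
    [0.5, 0.5],
    [0.25, 0.75],
]
doc_years = [2018, 2018, 2019]

heat = topic_heat_by_year(doc_topic_probs, doc_years)
for year in sorted(heat):
    print(year, heat[year])
```

Under these assumptions, a yearly series of heat values per topic is what a sequence model such as LSTM would take as its target, with the author, fund, journal, reference, and citation features as additional inputs.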