| As carriers of knowledge, monographs play an essential role in scientific communication. Citation count is one of the most critical measures for evaluating the impact of monographs. However, Garfield pointed out that citations are influenced by many factors besides scientific merit. Investigating the factors that affect citation counts of academic literature has therefore become an essential issue in scientometrics. Several studies have explored factors that have significant effects on the citations of monographs. Nevertheless, only a small fraction of factors and their impact have been investigated, and many potential factors remain unexplored. Moreover, compared with the single-factor analysis used in previous studies, the results of multi-factor analysis appear to be more reliable. It is thus necessary to conduct a multi-factor analysis of the factors affecting citations of monographs. In addition, citation counts take time to accumulate, which makes them difficult to use for recently published works, so it is valuable to predict the citations of a monograph shortly after publication. This paper studies the factors affecting the citation counts of monographs and predicts the five-year citations of individual monographs. We sampled 2,844 monographs published between 1999 and 2009 and indexed in the Chinese Book Citation Index (CBKCI) database. We first used non-parametric testing and a multiple linear regression model to investigate whether the selected monograph-related, author-related, and early-citation factors have significant effects on a monograph's citations. Then, we used six machine learning methods, i.e., the BP neural network, XGBoost, Linear Regression, Random Forest, KNN, and Support Vector Regression, to predict the five-year citation counts of an individual monograph. This paper has four main findings. First, the experimental results show that among the monograph-related features, the length of the title, book series, subject of the monograph, quality of the publisher, and place of publication have a significant impact on citations, whereas having an English title is not a significant factor. Specifically, the length of a monograph's title has a significant and negative effect on its citations. Compared with plain-text titles, the correct use of punctuation in the title increased the citations of monographs: monographs with a compound title, or whose title includes labeling marks, receive more citations. As for discipline, monographs published in the fields of law and sociology receive more citations, while those in art, religion, and history obtain fewer citations. In addition, monographs that belong to a book series or are published by the one hundred excellent publishers tend to obtain more citations. Second, among the author-related factors, the location of the first author's institution and the type of funding significantly influence the citations of monographs, whereas the number of authors and the type of the first author's institution barely affect them. Specifically, monographs that received no funding, or whose first author's institution is located in Asia outside the Chinese Mainland, received more citations. Third, the citation count in the first two years after publication is the most important factor among the selected features and has a significant and positive effect on the citations of monographs. Finally, five of the six machine learning methods, i.e., the BP neural network, XGBoost, Linear Regression, Random Forest, and Support Vector Regression, achieved high accuracy in predicting the five-year citation counts of individual monographs, whereas KNN performed poorly. We also found that the most crucial feature for citation prediction is the ‘citations of monograph in the first two years after publication’; the other features contribute little to prediction performance. |
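
The prediction task described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline and does not use CBKCI data: the records are synthetic, the feature names `early_citations` and `title_length` are hypothetical, and the toy data is deliberately constructed so that only early citations drive the five-year count. Under those assumptions, the sketch fits a one-variable least-squares regression per feature and compares held-out R² values, which shows how one might check that a single early-citation feature carries most of the predictive signal.

```python
import random


def fit_simple_ols(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def r_squared(ys, preds):
    """Coefficient of determination of predictions against observed values."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, preds))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


def make_toy_data(n, seed=0):
    """Synthetic records (NOT real data): five-year citations depend only on
    early citations plus noise; title length is unrelated by construction."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        early = rng.randint(0, 20)      # hypothetical two-year citation count
        title_len = rng.randint(4, 30)  # hypothetical title length
        five_year = max(0.0, 2.5 * early + rng.gauss(0, 2))  # assumed toy relationship
        rows.append({"early_citations": early,
                     "title_length": title_len,
                     "five_year": five_year})
    return rows


rows = make_toy_data(200)
train, test = rows[:150], rows[150:]

# Fit one single-feature model per candidate predictor and score it on held-out rows.
scores = {}
for feature in ("early_citations", "title_length"):
    slope, intercept = fit_simple_ols([r[feature] for r in train],
                                      [r["five_year"] for r in train])
    preds = [slope * r[feature] + intercept for r in test]
    scores[feature] = r_squared([r["five_year"] for r in test], preds)

for feature, score in scores.items():
    print(f"{feature}: held-out R^2 = {score:.3f}")
```

On real data, the same per-feature comparison would be only a rough first check; the multi-factor models named in the abstract account for interactions among features that single-variable fits ignore.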