| The explosion of information resources and literature has brought both opportunities and challenges for extracting insights from massive text data. Unlike conventional data formats, text data are unstructured, making it infeasible to apply conventional data analytics for knowledge discovery. Text mining is one of the most promising technologies for analyzing text data. Research articles are one of the main text data resources; they convey important research ideas and domain expertise in different fields. In general, manually inspecting research articles is very time-consuming and laborious because of the large number of articles and their wide range of research focuses. Text mining-based methods for analyzing research articles can therefore be very useful, given their potential benefits in efficiency and comprehensiveness. Nevertheless, a customized text mining-based framework for analyzing research articles is still lacking. On the one hand, it is unclear what kinds of typical knowledge can be extracted from research articles in conventional formats. On the other hand, there are few in-depth tools to ensure the efficiency and effectiveness of practical implementations. This study focuses on tackling these challenges. A variety of text mining and data visualization techniques have been explored and selected for automatic research trend identification and result visualization. More specifically, 4951 papers related to building energy conservation are extracted from the ScienceDirect database as the data resource. A customized data-preprocessing method is then developed to extract the structural characteristics of research articles. Basic information such as the “Publication year”, “Journal”, “Authors” and “Keywords” can be extracted automatically. A multi-level knowledge discovery method is then developed based on TF-IDF, social network analysis, Latent Dirichlet Allocation and sentiment analysis. Finally, the programming language R is adopted to develop an 
implementation platform for practical applications. It helps researchers understand the overall research trends and dynamics in the relevant fields comprehensively and rapidly. The main conclusions of this paper are as follows. This paper verifies the feasibility and effectiveness of text mining technology in processing academic literature data. The text mining and visualization methods developed in this paper help researchers identify and visualize academic trends quickly and automatically. For the overview of the field of building energy conservation, the key information obtained in this paper includes: (1) The main research topics identified are the origin and development stages of building energy efficiency; the impacts of climate change and government measures to address it; various methods and technologies for improving building energy efficiency; the progress and achievements of energy saving and emission reduction in different stages and periods; research on the building envelope; the development and utilization of new energy; and the importance of energy-saving awareness. (2) The main contributing journals for this dataset are ‘Renewable and Sustainable Energy Reviews’, ‘Journal of Cleaner Production’, ‘Energy and Buildings’, ‘Applied Energy’, ‘Construction and Building Materials’ and others. (3) The important research timestamps are 1992, 1997, 2009 and 2015. (4) Influential researchers include Kamaruzzaman Sopian, Saidur Rahman and Jian Zuo, and five important relationships are Jingjing Jiang and Bin Ye, Vygandas Gaigalis and Romualdas Skema, Yunho Hwang and Reinhard Radermacher, Bjørn Petter Jelle and Arild Gustavsen, and Baolong Wang and Xianting Li. (5) The most studied keywords include Renewable Energy, Energy Efficiency, Environmental, Economic and Energy Sustainability, Solar energy and China. For review studies in the field of logistics, the main information obtained in this paper includes: (1) The important research timestamp is 
2010. (2) The main contributing journals for the dataset are ‘Journal of Cleaner Production’, ‘Renewable and Sustainable Energy Reviews’, ‘European Journal of Operational Research’, ‘International Journal of Production Economics’ and ‘Omega’. (3) Influential researchers include José M Merigó, Robert Pellerin and Joseph Sarkis, and three important relationships are André Langevin and Martin Trépanier, André Langevin and André Langevin, and André Langevin and Robert Pellerin. The most studied keywords include logistics, sustainable development, green supply chain, transportation development, closed-loop economy, humanitarian logistics and artificial intelligence. (4) The main research topics identified are the management and development of supply chains and the development of green logistics. This paper develops a customized method for efficient knowledge discovery from massive collections of research articles. A number of text mining techniques have been integrated to enhance the efficiency and effectiveness of text data analysis. In addition, literature data from the fields of logistics and building energy are used to verify the practicability and applicability of the research methods and the interactive software, and an interactive interface has been developed for convenient text data analysis and result visualization. This study provides both theoretical methods and practical tools for mining research articles. It also helps broaden the research methodologies available for literature reviews. |
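
As a rough illustration of the TF-IDF step named in the abstract, the sketch below ranks the terms of one document against a small collection. It is not the study's implementation: the study uses R, whereas this sketch is written in Python, and the function name `tf_idf`, the whitespace tokenization, the natural-log form of the inverse document frequency and the toy abstracts are all assumptions made for illustration.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per document.

    Assumed weighting (one common variant, not necessarily the study's):
    tf  = term count / document length
    idf = ln(N / df), where df is the number of documents containing the term
    """
    # Naive lowercase whitespace tokenization (an assumption for this sketch).
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical toy "abstracts"; not taken from the study's dataset.
abstracts = [
    "solar energy for building energy efficiency",
    "building envelope insulation materials",
    "green supply chain logistics",
]
scores = tf_idf(abstracts)
# Terms of the first abstract, highest weight first.
top_terms = sorted(scores[0], key=scores[0].get, reverse=True)[:3]
print(top_terms)
```

Terms that are frequent in one abstract but rare across the collection receive the highest weights, which is why a weighting of this kind is suited to surfacing characteristic keywords; a term that occurs in every document gets a weight of zero under this variant.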