With the development of the social economy, public attitudes toward health care are shifting from passive disease diagnosis and treatment to active management of personal health. In recent years, online medical communities, which combine the Internet with medical information, have emerged. In-depth analysis of these communities can reveal the hot topics and emotional tendencies that users care about, and can help users find and obtain information quickly. Early research on online medical communities relied on manual annotation, which is no longer suitable for processing large volumes of text. Machine learning methods have therefore gradually been applied to analyzing data from online medical communities. As the means of information exchange in a community, texts are usually subjective, and their most important features are topic and emotion. This thesis therefore explores intelligent methods for in-depth study of the hot topics and emotional tendencies that community members are generally concerned about.

First, the thesis introduces online medical communities, hot topic recognition, and text sentiment analysis, together with the main problems of topic recognition and sentiment analysis in online medical communities and the related theory.

Second, a hot topic recognition method for online medical communities is proposed. Considering the distribution characteristics of online medical texts and their short, non-standardized nature, we propose the Medical Sentence of LDA (MS-LDA) topic model. We assume that topics are generated by sentences, use a Gaussian function to fit the word distribution, and modify word frequencies with a relevance weight to adjust the proportion of different words in sentences. In addition, professional knowledge of the medical field is introduced to extract medical concepts for topic clustering. Experimental results indicate that MS-LDA achieves lower perplexity and stronger consistency between words than the comparison models, and that hot topics related to diseases, as well as potential links between topics and diseases, can be mined automatically.

Third, to handle the large amount of unlabeled sentiment data in online medical communities, an unsupervised domain-adaptive sentiment analysis method is proposed. We combine polarity transfer processing with n-grams to extract text features, and improve the domain-adversarial neural network (DANN) by extending it from binary to multi-class classification for sentiment classification in online medical communities. During DANN training, background knowledge of the medical field can be learned automatically through the mapping from the source domain to the target domain. Experimental results show that this feature extraction combined with DANN achieves the best results among the sentiment classification methods compared. Moreover, because texts are labeled automatically in an unsupervised manner, potential links between sentiment and disease can be mined automatically.

Finally, we conclude our work and outline plans for future work.
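
The perplexity comparison used to evaluate the topic model can be illustrated with a minimal sketch. It assumes the standard definition of perplexity as the exponential of the negative average log-likelihood per held-out token; the thesis may compute it differently. The function name `perplexity` and the probability lists are hypothetical and are not taken from the thesis or its experiments.

```python
import math


def perplexity(token_probs):
    """Perplexity of held-out text, given the model's probability for each token.

    Assumes the standard definition: exp(-(1/N) * sum(log p(w_i))).
    """
    if not token_probs:
        raise ValueError("need at least one token probability")
    if any(p <= 0.0 or p > 1.0 for p in token_probs):
        raise ValueError("probabilities must lie in (0, 1]")
    log_likelihood = sum(math.log(p) for p in token_probs)
    return math.exp(-log_likelihood / len(token_probs))


# Hypothetical per-token probabilities assigned by two models
# to the same four held-out tokens (illustrative values only).
baseline_probs = [0.25, 0.25, 0.25, 0.25]
sharper_probs = [0.5, 0.5, 0.5, 0.5]

print(perplexity(baseline_probs))
print(perplexity(sharper_probs))
```

A model that assigns higher probability to the held-out tokens yields a lower perplexity, which is the sense in which a lower value indicates a better fit.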