| Biomedical word sense disambiguation (WSD), an interdisciplinary field spanning biomedicine and natural language processing, plays an important role in machine translation, speech recognition, information retrieval, and gene naming standardization. Because biomedical terms are often ambiguous, computers cannot correctly interpret the meaning of biomedical literature. Improving a computer's ability to handle ambiguous words has therefore become a research focus. Building on research into biomedical information, word sense disambiguation, and deep learning, this paper constructs biomedical WSD models that combine Graph Convolutional Network (GCN), Graph Attention Network (GAT), and Bidirectional Long Short-Term Memory (BiLSTM). Twenty-nine ambiguous biomedical words from the MSH WSD corpus are selected to evaluate the proposed model, with average accuracy as the performance measure. Experimental results show that the proposed method achieves high average accuracy. The specific work is as follows: (1) The MSH WSD corpus for biomedical word sense disambiguation and its preprocessing are introduced in detail. The process of extracting disambiguation features, vectorizing them with the Word2vec tool, and constructing the biomedical WSD graph is analyzed. (2) A biomedical WSD model based on GCN is proposed, which extends the traditional convolution operation from Euclidean data to non-Euclidean data. This model can extract features from the WSD graph. When the GCN disambiguation model aggregates the features of neighbor nodes into the central node, it cannot determine the relative importance of those neighbors. Therefore, a biomedical WSD model based on GAT is proposed, which uses the attention mechanism to learn the importance of neighborhood information, reduces the weight of noisy nodes, and achieves more efficient information propagation and aggregation. The idea of ensemble learning is studied, and individual disambiguation models are combined into ensemble models. The experimental results show that the average disambiguation accuracy of the ensemble model is higher than that of the individual disambiguation models. (4) A semi-supervised biomedical WSD model based on GAT is proposed. The workflow of the semi-supervised WSD model is introduced: it addresses model optimization when little labeled training data is available by retraining the model on the progressively expanded corpus, and it ends training when the unlabeled corpus is empty. |
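
The contrast drawn between GCN and GAT aggregation can be illustrated with a minimal sketch. It is not the model from this work: the three-node graph, the feature vectors, and the names `gcn_aggregate`, `gat_aggregate`, and `dot_score` are hypothetical. A plain mean stands in for GCN's degree-normalized aggregation, and a dot product stands in for GAT's learned attention scoring function.

```python
import math

def gcn_aggregate(features, neighbors, node):
    """GCN-style step (simplified): every neighbor, plus the node itself,
    contributes with the same weight."""
    group = [node] + neighbors[node]
    dim = len(features[node])
    return [sum(features[j][d] for j in group) / len(group) for d in range(dim)]

def dot_score(a, b):
    """Assumed stand-in for a learned attention scoring function."""
    return sum(x * y for x, y in zip(a, b))

def attention_weights(features, neighbors, node, score=dot_score):
    """Softmax over the scores between the node and each member of its neighborhood."""
    group = [node] + neighbors[node]
    raw = [score(features[node], features[j]) for j in group]
    peak = max(raw)  # subtract the maximum for numerical stability
    exps = [math.exp(r - peak) for r in raw]
    total = sum(exps)
    return {j: e / total for j, e in zip(group, exps)}

def gat_aggregate(features, neighbors, node, score=dot_score):
    """GAT-style step (simplified): neighbors are weighted by attention,
    so dissimilar (noisy) nodes contribute less."""
    weights = attention_weights(features, neighbors, node, score)
    dim = len(features[node])
    return [sum(w * features[j][d] for j, w in weights.items()) for d in range(dim)]

# Hypothetical WSD graph: the ambiguous word "cold" with two context nodes.
features = {"cold": [1.0, 0.0], "virus": [0.9, 0.1], "winter": [0.0, 1.0]}
neighbors = {"cold": ["virus", "winter"], "virus": ["cold"], "winter": ["cold"]}

print(gcn_aggregate(features, neighbors, "cold"))
print(attention_weights(features, neighbors, "cold"))
print(gat_aggregate(features, neighbors, "cold"))
```

In this toy graph the attention weights favor "virus" over "winter" for the node "cold", whereas the uniform mean treats both neighbors equally.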
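
The semi-supervised workflow in item (4) can be sketched as a self-training loop that stops when the unlabeled pool is empty. The abstract does not say how unlabeled instances are chosen for labeling, so picking the most confident predictions in batches is an assumption here. The function names and the one-dimensional nearest-centroid classifier are hypothetical placeholders for the GAT model.

```python
def train_centroids(labeled):
    """Toy classifier (placeholder for the GAT model): one centroid per sense label."""
    sums = {}
    for x, y in labeled:
        total, count = sums.get(y, (0.0, 0))
        sums[y] = (total + x, count + 1)
    return {y: total / count for y, (total, count) in sums.items()}

def predict_centroid(model, x):
    """Return (label, confidence); confidence shrinks with distance to the centroid."""
    label = min(model, key=lambda y: abs(model[y] - x))
    return label, 1.0 / (1.0 + abs(model[label] - x))

def self_train(labeled, unlabeled, train, predict, batch_size=2):
    """Repeatedly pseudo-label the most confident unlabeled items (assumed
    selection rule), retrain on the expanded corpus, and stop when the
    unlabeled pool is empty."""
    labeled = list(labeled)
    pool = list(unlabeled)
    model = train(labeled)
    while pool:
        scored = [(predict(model, x), x) for x in pool]
        scored.sort(key=lambda item: item[0][1], reverse=True)
        for (label, _confidence), x in scored[:batch_size]:
            labeled.append((x, label))  # expand the training corpus
            pool.remove(x)              # shrink the unlabeled corpus
        model = train(labeled)          # re-optimize on the expanded corpus
    return model, labeled

# Hypothetical data: two labeled seeds and five unlabeled instances.
seed = [(0.0, "A"), (10.0, "B")]
unlabeled = [1.0, 9.0, 2.0, 8.0, 4.0]
model, expanded = self_train(seed, unlabeled, train_centroids, predict_centroid)
print(model)
print(expanded)
```

Because the loop removes at least one item from the pool on every pass, it always terminates. A real system would typically also apply a confidence threshold so that low-confidence pseudo-labels do not degrade the model.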