| The massive collection and processing of basic biomedical research data and clinical data has left biomedical big data poorly integrated. There is no unified standard across the various fields of biomedicine, which makes it difficult to share research data. An ontology is a knowledge network composed of domain-related entities and the relationships between them. Using ontology alignment to determine the semantic correspondence between entities is critical for achieving biomedical data interoperability. Existing ontology alignment methods fall mainly into feature-based strategies and entity-embedding-based strategies. Feature-based strategies rely mainly on string similarity. Entity-embedding-based strategies represent entities as vectors in a vector space; they are good at capturing the semantic information of words and can therefore find potential alignments that feature-based strategies miss. However, entity embedding representations are often inaccurate, and many methods do not make full use of the structural information of the ontology. To address these problems, this thesis constructs a multi-strategy ontology alignment model that obtains similar entity pairs from multiple perspectives. The main research contents are as follows: (1) This thesis proposes an entity embedding model based on a multi-hop attention graph network. Ontology structure information is incorporated into entity embedding learning: a graph neural network based on the multi-hop attention mechanism combines the ontology network structure with the descriptive information of each entity to learn the entity embedding. The model adds multi-hop node information to the attention mechanism, extending attention from neighbour nodes to non-neighbour nodes, which expands the receptive field in each layer of the network and improves the expressive power of the ontology entity embeddings. After adding the multi-hop attention graph network, performance on the ontology alignment task improves compared with the
baseline model. (2) This thesis proposes an ontology alignment model that combines feature-based and entity-embedding-based strategies. Following the multi-strategy approach, the thesis calculates the similarity between entities at multiple levels: the feature-based strategy uses linguistic and structural features, and the entity-embedding-based strategy divides entity embedding learning into three views, namely the synonym view, the structure view, and the definition view. A different embedding learning model is designed for each view to exploit its characteristics: the synonym view is learned mainly from entity synonyms; the structure view uses the multi-hop attention graph network to combine the entity’s features with ontology structure information; for the definition view, the thesis retrieves entity definitions from Wikipedia and models them with Sentence-BERT (SBERT). After entity embeddings are obtained from the three views, two combination methods are used: a weighted average combination strategy and a shared space strategy. Finally, the entity pairs with higher similarity are selected from the feature-based and multi-view entity embedding methods and filtered to generate the final matching result. Combining the feature-based method with the entity-embedding-based method allows diverse information to be considered comprehensively. Compared with the baseline model, this method obtains better ontology alignment results. (3) This thesis implements an entity annotation system based on a fusion ontology. The fusion ontology generated through ontology alignment is applied to the entity annotation system, which annotates entity information in text and links each entity to the fusion ontology and to related literature in PubMed. Entity annotation based on the fusion ontology can connect entities that previously could not be connected and reveal links between data, which is conducive to the understanding of
biomedical literature by relevant researchers. In summary, this thesis proposes a multi-strategy biomedical ontology alignment model that calculates the similarity between entities from multiple perspectives and improves the performance of ontology alignment. It also implements an entity annotation system based on the fusion ontology, which supports the fusion of biomedical data and knowledge discovery by relevant researchers. |
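
The multi-strategy idea summarized in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation: the function names, the toy two-dimensional embeddings, the view weights, the 0.8 threshold, and the choice to take the maximum of the two strategy scores are all assumptions made for illustration. String similarity from Python's `difflib` stands in for the feature-based strategy, and hand-written vectors stand in for the learned synonym, structure, and definition view embeddings, which are merged here by a weighted average.

```python
import math
from difflib import SequenceMatcher


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def combine_views(views, weights):
    """Weighted average of per-view embeddings (assumes a shared dimension)."""
    total = sum(weights[name] for name in views)
    dim = len(next(iter(views.values())))
    return [sum(weights[n] * views[n][i] for n in views) / total for i in range(dim)]


def string_similarity(a, b):
    """Feature-based stand-in: case-insensitive character-level similarity."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def align(source, target, weights, threshold=0.8):
    """Keep, for each source entity, its best target if the score passes the threshold."""
    matches = []
    for s_label, s_views in source.items():
        s_vec = combine_views(s_views, weights)
        best = None
        for t_label, t_views in target.items():
            t_vec = combine_views(t_views, weights)
            # Hypothetical fusion rule: take the stronger of the two strategies.
            score = max(string_similarity(s_label, t_label), cosine(s_vec, t_vec))
            if best is None or score > best[2]:
                best = (s_label, t_label, score)
        if best is not None and best[2] >= threshold:
            matches.append(best)
    return matches


# Toy data: the embeddings are invented, not produced by any trained model.
weights = {"synonym": 0.4, "structure": 0.3, "definition": 0.3}
source = {
    "myocardial infarction": {
        "synonym": [1.0, 0.0], "structure": [0.9, 0.1], "definition": [0.8, 0.2]},
    "Renal Failure": {
        "synonym": [0.5, 0.5], "structure": [0.5, 0.5], "definition": [0.5, 0.5]},
}
target = {
    "heart attack": {
        "synonym": [0.9, 0.1], "structure": [1.0, 0.0], "definition": [0.9, 0.1]},
    "renal failure": {
        "synonym": [0.0, 1.0], "structure": [0.0, 1.0], "definition": [0.0, 1.0]},
}
matches = align(source, target, weights)
for s_label, t_label, score in matches:
    print(f"{s_label} -> {t_label} ({score:.3f})")
```

In this toy setup the first pair has dissimilar labels and is recovered only through the embedding score, while the second pair has deliberately poor embeddings and is recovered only through the string score, which is the complementarity the multi-strategy approach relies on.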