| As social demand grows and technology advances, more and more artificial intelligence applications have been deployed, and application-specific research directions have emerged. Knowledge graphs provide rich prior knowledge for a variety of downstream tasks, so knowledge graph construction has received increasing attention. Relation extraction is one of the key technologies for knowledge graph construction. Ever-expanding relation types and scarce labeled data pose new challenges for relation extraction. In real-world scenarios in particular, a growing number of predefined and non-predefined relations are mixed together, and a model must identify non-predefined relations while still extracting predefined relations correctly. Existing relation classification models are suited to predefined relation extraction; in more complex scenarios, however, a model cannot be limited to the category characteristics of predefined relations and needs to focus on relation representations. Starting from the feature space representation, this thesis explores the efficient use and fusion of representations to meet various needs in a more open environment. The research content of this thesis is divided into the following parts. First, existing relation extraction models and evaluation methods are surveyed and analyzed. For the target task, the datasets are organized, selected, and split, the evaluation metrics are defined, and the baseline models are set. Second, this thesis presents SMDM, a relation semantic metric optimization algorithm based on semantic maximum divergence. The algorithm constructs multiple binary feature spaces based on metric learning and designs an MDF strategy that combines direct and indirect semantics to obtain better relation metrics. Taking the relation metrics as input, non-predefined relation extraction is performed using spectral clustering. Experiments show that SMDM effectively improves the performance of non-predefined relation extraction and can complete both predefined relation extraction and non-predefined relation identification. Third, this thesis presents SelfRec, an optimization algorithm for relation semantic representation based on adaptive clustering. The algorithm builds on the feature space constructed in SMDM and uses a gating mechanism to combine direct and indirect semantics. Combined with an adaptive clustering module and a relation classifier module, it directly improves the relation representation. Experiments show that the algorithm further improves the performance of non-predefined relation extraction while retaining SMDM's ability to cope with complex settings. Finally, this thesis summarizes the work and gives an outlook for future research. The key research topics of this thesis, including open relation extraction settings, metric learning, and unsupervised and self-supervised tasks, are analyzed, and future research directions are discussed. |