Study On Method Of Coreference Resolution For The Knowledge Graph Of The Eighth Grade Biology Textbook

Posted on: 2022-10-07  Degree: Master  Type: Thesis
Country: China  Candidate: H Li
GTID: 2517306347951099  Subject: Computer Science and Technology
Abstract/Summary:
In recent years, with the rapid development of educational informatization, knowledge bases and other online teaching resources have continuously emerged, narrowing the gap between urban and rural educational resources. However, because Internet information is so large in scale, most teaching resources appear in unstructured or semi-structured form; the data is disordered and severely fragmented. Textbooks, for their part, provide only a knowledge framework, which is not enough to give students a complete and comprehensive body of knowledge. Knowledge graphs provide teachers and students with a detailed subject knowledge system and clearly organized relationships between knowledge points, which helps improve students' learning efficiency and guides teachers in preparing and delivering lessons. Reference (anaphora) is the use of an abbreviation or pronoun to stand in for a word that has already appeared earlier in the text; coreference resolution determines which earlier expression such a reference points to. Coreference resolution is often omitted from the main process of constructing a knowledge graph, because a large-scale corpus can compensate for its absence. In small-scale corpora such as textbooks, however, even though pronouns make up a very small proportion of the corpus (less than 5‰), omitting it reduces the fidelity of the knowledge graph, tends to obscure its key content, and weakens the relationships between knowledge points. To address these problems, this thesis proposes a rule- and semantics-based coreference resolution algorithm that improves the accuracy of the knowledge graph. The specific work is summarized as follows:

(1) Using the eighth grade biology textbook (People's Education Edition) as the corpus, this thesis proposes a rule- and semantics-based coreference resolution method to resolve the referents of the third-person pronouns "it" and "them". Two rules are proposed to filter candidate antecedents, and effective resolution features are selected: part of speech, grammar, and position information. At the same time, more attention is paid to the local semantics of the text surrounding each pronoun. Compared with three other algorithms, the proposed method achieves better precision and recall.

(2) A knowledge graph is constructed from the eighth grade biology textbook. The TF-IDF algorithm is used to extract concepts, and relation triples are extracted based on dependency syntax analysis and semantic role labeling. Triples are generated from four structures: the A0-predicate-A1 structure; the subject-predicate-object structure; the verb-object structure with a postposed attributive; and the subject-predicate verb-complement structure with a prepositional phrase. Clustering the relation words yields 8 types of relations.

(3) To explore the influence of coreference resolution on educational knowledge graphs, experiments are conducted on three corpora of increasing size: the "Fish" section, the chapter "The main groups of animals", and the entire biology textbook. The resulting graphs are presented with the visual analysis tool Gephi, and the educational knowledge graphs before and after coreference resolution are compared. Statistically, although the number of nodes in the knowledge graph does not change, after coreference resolution the growth rate of the number of edges and the rate at which the average path length shortens are both far greater than the proportion of third-person pronouns in the corpus. Visually, the size of the focal points and the positions of nodes change significantly, and the knowledge points are more closely connected, which is more in line with the original intent of the textbook. The results show that coreference resolution improves the fidelity of the educational knowledge graph and keeps it highly consistent with the textbook.
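
The abstract does not spell out the two antecedent-filtering rules, so the following is only a minimal sketch of the general idea of rule-based pronoun resolution using part of speech, number agreement, and position. The function `resolve`, the tag names, and the English example sentence are all hypothetical illustrations, not the thesis's algorithm, and the local-semantics component is left out entirely.

```python
# Hypothetical mapping from pronoun to the grammatical number it requires.
PRONOUN_NUMBER = {"it": "sg", "them": "pl"}

def resolve(tokens):
    """Replace "it"/"them" with the nearest preceding noun of matching number.

    tokens: list of (word, pos, number) tuples in text order, where pos is a
    coarse part-of-speech tag and number is "sg", "pl", or None.
    Returns the list of words with resolvable pronouns substituted.
    """
    resolved = []
    for i, (word, pos, _number) in enumerate(tokens):
        target = PRONOUN_NUMBER.get(word.lower()) if pos == "PRON" else None
        if target is None:
            resolved.append(word)
            continue
        # Candidate filter: nouns only (part of speech), matching number
        # (grammar), scanned nearest-first (position).
        antecedent = next(
            (w for w, p, n in reversed(tokens[:i]) if p == "NOUN" and n == target),
            word,  # fall back to the pronoun itself if nothing qualifies
        )
        resolved.append(antecedent)
    return resolved

tokens = [
    ("The", "DET", None), ("fish", "NOUN", "sg"), ("has", "VERB", None),
    ("gills", "NOUN", "pl"), (".", "PUNCT", None),
    ("It", "PRON", "sg"), ("uses", "VERB", None), ("them", "PRON", "pl"),
    (".", "PUNCT", None),
]
resolved = resolve(tokens)
print(" ".join(resolved))
```

In this toy sentence the number filter matters: "gills" is the nearest noun to "It" but is plural, so the singular "fish" is chosen instead.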
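
As a rough illustration of the TF-IDF scoring mentioned in item (2), the sketch below ranks terms in a few tiny tokenized "documents". It assumes the plain textbook variant (term frequency times the natural log of inverse document frequency); the abstract does not say which variant, tokenizer, or threshold the thesis uses, and the sample sentences are invented.

```python
import math
from collections import Counter

def tf_idf(docs):
    """Return one {term: score} dict per tokenized document."""
    n_docs = len(docs)
    # Document frequency: how many documents contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    scores = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        scores.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return scores

docs = [
    ["fish", "breathe", "with", "gills"],
    ["birds", "breathe", "with", "lungs"],
    ["fish", "swim", "with", "fins"],
]
scores = tf_idf(docs)
top = max(scores[0], key=scores[0].get)
print(top, round(scores[0][top], 3))
```

Terms that occur in every document (here "with") score zero, while a term unique to one document ("gills") ranks highest, which is why TF-IDF is a common first filter for candidate concepts.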
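
The statistical comparison in item (3) can be sketched on a toy graph. The example below assumes that, before resolution, a triple whose subject is an unresolved pronoun is simply dropped, and that after resolution it becomes an edge between existing concept nodes; that assumption, the triples, and the helper names are illustrative only and are not taken from the thesis.

```python
from collections import deque

def build_graph(triples):
    """Build an undirected adjacency map from (head, relation, tail) triples."""
    adj = {}
    for head, _relation, tail in triples:
        adj.setdefault(head, set()).add(tail)
        adj.setdefault(tail, set()).add(head)
    return adj

def bfs_distances(adj, source):
    """Shortest hop counts from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in adj[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist

def summarize(triples):
    """Return (node count, edge count, average path length over reachable pairs)."""
    adj = build_graph(triples)
    n_edges = sum(len(neighbors) for neighbors in adj.values()) // 2
    total, pairs = 0, 0
    for source in adj:
        for target, d in bfs_distances(adj, source).items():
            if target != source:
                total += d
                pairs += 1
    avg = total / pairs if pairs else 0.0
    return len(adj), n_edges, avg

before_triples = [
    ("fish", "has", "gills"),
    ("gills", "absorb", "oxygen"),
    ("oxygen", "dissolved_in", "water"),
]
# "It lives in water" with "it" resolved to "fish" adds one edge, no new node.
after_triples = before_triples + [("fish", "lives_in", "water")]

before = summarize(before_triples)
after = summarize(after_triples)
print("before:", before)
print("after: ", after)
```

Even in this four-node example, one resolved pronoun leaves the node count unchanged while raising the edge count and shortening the average path length, which is the pattern the abstract reports at textbook scale.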
Keywords/Search Tags:Educational knowledge graph, Coreference resolution, Semantic analysis, Information extraction