
Adaptive Web Information Extraction Research Based On Connectivism

Posted on: 2020-11-18
Degree: Master
Type: Thesis
Country: China
Candidate: Y J Bai
Full Text: PDF
GTID: 2417330572489673
Subject: Education Technology
Abstract/Summary:
With the advent of the big data era, people can obtain continuously updated learning resources through the network. Integrating, structuring, and storing these resources makes it possible to link them and facilitates their processing and reuse. This process echoes the connectivist idea that "learning is establishing links between nodes, and knowledge is constantly updated." To realize this process, it is necessary to study adaptive Web information extraction technology, which turns semi-structured or unstructured Web content into structured information.

This thesis takes elite courses on MOOC platforms as the research object. On a MOOC platform, knowledge is stored on the web in units of courses. The thesis studies adaptive Web information extraction by using mainstream information extraction technology combined with the knowledge view and learning view of connectivism. The research proceeds from course attribute extraction to course relationship extraction.

First, this thesis proposes a course attribute extraction method based on a combination of templates and features. The method consists of the following steps: mining the common parts of a website by computing the information entropy of text nodes; identifying the optional parts; deriving the extraction templates; sampling the target extraction information; combining four types of local text features that are independent of the text content; and finally obtaining the result vectors for the different course attributes and using the feature vectors as templates to filter the results.

Second, this thesis proposes a cross-website course relationship extraction method. Three course relationships are preset in the research. Different course attributes are selected as comparison data sources for different relationships, and different comparison methods are designed for attribute information of different text types (descriptive text and entity text). On this basis, the relationships between courses are extracted according to the priority of the relationships, and the extracted results are stored in a Neo4j database.

Finally, this study verifies the feasibility of the attribute extraction method by extracting the attributes of 300 courses from three major domestic MOOC platforms, using accuracy and recall as criteria. The relationship extraction method is applied to extract part of the relationships among 30 courses. In addition, this thesis constructs a knowledge graph of higher-education computer courses, which can help learners improve their ability to find knowledge online and build personal knowledge networks, and can support existing research on course recommendation, course retrieval, course planning, and course design.
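The abstract names text-node information entropy as the way to find the common (template) parts of a site, but gives no formula or code. The following Python is therefore only a minimal sketch of one plausible reading, not the thesis's actual implementation. It assumes that each page has already been parsed into a mapping from a node path to the text found there, that Shannon entropy is computed over the texts observed at each path, and that a threshold of 0.5 bits separates template text from course-specific text. The function names, the threshold, the paths, and the sample pages are all hypothetical.

```python
import math
from collections import Counter, defaultdict


def text_node_entropy(pages):
    """Shannon entropy (in bits) of the texts seen at each node path.

    `pages` is a list of dicts mapping a node path to its text
    (an assumed, pre-parsed representation of each course page).
    """
    values_by_path = defaultdict(list)
    for page in pages:
        for path, text in page.items():
            values_by_path[path].append(text.strip())

    entropy = {}
    for path, values in values_by_path.items():
        total = len(values)
        counts = Counter(values)
        # H = sum over distinct texts of p * log2(1 / p)
        entropy[path] = sum(
            (c / total) * math.log2(total / c) for c in counts.values()
        )
    return entropy


def split_common_and_variable(pages, threshold=0.5):
    """Low-entropy paths look like template text; high-entropy paths look like data."""
    entropy = text_node_entropy(pages)
    common = sorted(p for p, h in entropy.items() if h <= threshold)
    variable = sorted(p for p, h in entropy.items() if h > threshold)
    return common, variable


def optional_paths(pages):
    """Paths that appear on some pages but not on all of them."""
    seen = Counter(path for page in pages for path in page)
    return sorted(p for p, n in seen.items() if n < len(pages))


# Hypothetical course pages, for illustration only.
pages = [
    {"/html/body/h1": "Data Structures",
     "/html/body/div[1]/span": "Instructor:",
     "/html/body/div[1]/a": "Prof. Li",
     "/html/body/div[2]/span": "Prerequisites:"},
    {"/html/body/h1": "Operating Systems",
     "/html/body/div[1]/span": "Instructor:",
     "/html/body/div[1]/a": "Prof. Wang"},
    {"/html/body/h1": "Computer Networks",
     "/html/body/div[1]/span": "Instructor:",
     "/html/body/div[1]/a": "Prof. Zhao",
     "/html/body/div[2]/span": "Prerequisites:"},
]

common, variable = split_common_and_variable(pages)
optional = optional_paths(pages)
```

In this sketch, labels such as "Instructor:" repeat on every page and get zero entropy, so they fall into the common part. Course titles and instructor names differ from page to page and are treated as candidate attribute values. A label that is missing from some pages is reported as optional.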
Keywords/Search Tags: Adaptive Web Information Extraction, Connectivism, Attribute Extraction, Relationship Extraction, Knowledge Graph