| This research is funded by the project "Tai Chi people network" based on semantic information, under the Department of Education of Jilin Province. The contributions of this paper are as follows: web content extraction in the Tai Chi domain, dictionary construction, ontology construction, ontology storage and query, and system prototype development. The web page structure models of Tai Chi domain websites are analyzed, and several models are designed for extracting information from different types of web pages. Coarse-grained, medium-grained, and fine-grained web page analysis and information extraction algorithms are designed. The three algorithms are implemented and run to obtain experimental data and to evaluate them. In addition to downloading Tai Chi domain word entries from the Sogou Cell Thesaurus website, I have designed and implemented algorithms to extract Tai Chi domain word entries from Baidu Encyclopedia and from the Micro Encyclopedia of the Interactive Encyclopedia. Many word entries are also extracted from Tai Chi domain websites, such as ZhengLei Chen’s website. I have designed and implemented an algorithm to extract Tai Chi inheritor word entries, and run it on ZhengLei Chen’s Tai Chi Heritage Pedigree to test the feasibility and effectiveness of the algorithm. A structured property extraction algorithm is designed to extract properties from Tai Chi successors’ home pages. A Tai Chi domain family ontology, a Tai Chi inheritors ontology, and a Tai Chi halls and members ontology are constructed. I have implemented and run the algorithm on Tai Chi successors’ home pages, and extracted the thirteenth inheritors’ properties and instances to test the validity of the algorithm. An OWL ontology storage model and algorithm based on a relational database are designed. 
Search keywords are semantically expanded using the Tai Chi domain ontology, and the user is recommended to search with the expanded keywords. The ontology data acquisition and storage algorithm is run on a fragment of the Tai Chi hall and Tai Chi members ontology to verify the validity of the ontology storage model proposed in this paper. A system prototype is developed with popular development tools and a popular development environment to verify the feasibility and availability of the models and algorithms proposed in this paper. |
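
To illustrate how ontology storage in a relational database and ontology-based keyword expansion could fit together, here is a minimal sketch. It is not the storage model or algorithm proposed in this paper. It assumes a single three-column triple table, a one-hop expansion rule, and SQLite as the database; the names `build_store`, `expand_keyword`, and all sample data are hypothetical.

```python
import sqlite3


def build_store(triples):
    """Store (subject, predicate, object) triples in an in-memory relational table.

    Assumption: one flat triple table; the paper's actual schema is not specified here.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE triple (subject TEXT, predicate TEXT, object TEXT)")
    conn.executemany("INSERT INTO triple VALUES (?, ?, ?)", triples)
    conn.commit()
    return conn


def expand_keyword(conn, keyword):
    """Return the keyword followed by every term linked to it by one triple."""
    rows = conn.execute(
        "SELECT object FROM triple WHERE subject = ? "
        "UNION SELECT subject FROM triple WHERE object = ?",
        (keyword, keyword),
    ).fetchall()
    # Sort the related terms so the result is deterministic.
    related = sorted({row[0] for row in rows} - {keyword})
    return [keyword] + related


# Hypothetical sample data, not taken from the paper's ontology.
SAMPLE_TRIPLES = [
    ("Chen-style Tai Chi", "subClassOf", "Tai Chi"),
    ("Yang-style Tai Chi", "subClassOf", "Tai Chi"),
    ("Example Hall", "teaches", "Chen-style Tai Chi"),
]

conn = build_store(SAMPLE_TRIPLES)
print(expand_keyword(conn, "Chen-style Tai Chi"))
# → ['Chen-style Tai Chi', 'Example Hall', 'Tai Chi']
```

The expanded list could then be offered to the user as suggested search terms. A fuller implementation would likely weight terms by predicate type and follow more than one hop.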