
Analysis And Design Of Vegetable Growing Information Corpus Construction Method

Posted on: 2018-12-26
Degree: Master
Type: Thesis
Country: China
Candidate: J J Zheng
GTID: 2323330518965147
Subject: Agricultural Extension

Abstract/Summary:
A corpus is a kind of database used in linguistic research. A small corpus has a clear target, is strongly focused in what it collects, and is flexible to build and use. To meet the demand for scientific knowledge, research, production practice, teaching materials, and technical information, building a small corpus of vegetable planting information is of practical significance. This thesis presents the design and implementation of the collection, annotation, and retrieval system for a vegetable planting information corpus, which makes it possible to retrieve the required vegetable planting information quickly and in a variety of ways.

First, the research status of several well-known large corpora and some specialized corpora is introduced, analyzed, and summarized; the outlook for building a vegetable planting information corpus and its development trends is discussed; and each step of corpus construction is introduced and analyzed. On this basis, the construction steps for the vegetable planting information corpus are determined.

Second, a large domestic agricultural information website, Chinese vegetables, was selected as the source of the corpus. The Octopus collector was used to automatically collect vegetable planting information from the website. The collected data were then screened to obtain the various documents the corpus requires, and preprocessed by correcting typos and repairing incomplete documents.

Then, the TEI annotation model was used to annotate the collected vegetable information. Vegetables and vegetable planting techniques were classified and coded. At the beginning of each document in the corpus, the keywords, vegetable type, and vegetable cultivation techniques are given; within the document, keywords, words, and parts of speech are annotated.

Finally, the retrieval system for the vegetable information corpus was analyzed and designed. According to the actual retrieval needs for vegetable planting information, the search types required include keyword search, search by vegetable category, and full-text search. An index database was designed for the retrieval system, and on this basis the concrete implementation of each type of retrieval is given. Experiments show that the retrieval system can quickly retrieve all kinds of corpus material with high precision and recall, meeting the demand for vegetable planting information in vegetable cultivation research, teaching, and production.
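As a rough illustration of the retrieval design described above, the following Python sketch builds a small in-memory index and supports keyword search, search by vegetable category, and full-text search, then computes precision and recall for one query. It is not the thesis's implementation: the names `Document`, `CorpusIndex`, `build_sample_index`, and `precision_recall`, the English sample documents, and the whitespace tokenizer are all assumptions made for illustration. A Chinese-language corpus would need a proper word segmenter in place of `split()`.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Document:
    """One corpus document with its header annotations (hypothetical layout)."""
    doc_id: str
    category: str        # vegetable type taken from the document header
    keywords: list[str]  # keywords taken from the document header
    text: str            # document body


class CorpusIndex:
    """In-memory inverted indexes standing in for the index database."""

    def __init__(self) -> None:
        self.by_keyword: dict[str, set[str]] = defaultdict(set)
        self.by_category: dict[str, set[str]] = defaultdict(set)
        self.by_token: dict[str, set[str]] = defaultdict(set)

    def add(self, doc: Document) -> None:
        self.by_category[doc.category.lower()].add(doc.doc_id)
        for kw in doc.keywords:
            self.by_keyword[kw.lower()].add(doc.doc_id)
        # Assumption: whitespace tokenization with trailing punctuation removed.
        for token in doc.text.lower().split():
            self.by_token[token.strip(".,;:!?")].add(doc.doc_id)

    def search_keyword(self, keyword: str) -> set[str]:
        return set(self.by_keyword.get(keyword.lower(), set()))

    def search_category(self, category: str) -> set[str]:
        return set(self.by_category.get(category.lower(), set()))

    def search_full_text(self, query: str) -> set[str]:
        """Return documents whose body contains every query token."""
        tokens = query.lower().split()
        if not tokens:
            return set()
        result = set(self.by_token.get(tokens[0], set()))
        for token in tokens[1:]:
            result &= self.by_token.get(token, set())
        return result


def precision_recall(retrieved: set[str], relevant: set[str]) -> tuple[float, float]:
    """Precision = hits / retrieved, recall = hits / relevant."""
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


def build_sample_index() -> CorpusIndex:
    """Build an index over three made-up documents (illustrative data only)."""
    index = CorpusIndex()
    index.add(Document("d1", "tomato", ["greenhouse", "seedling"],
                       "Tomato seedlings need warm soil in the greenhouse."))
    index.add(Document("d2", "cucumber", ["greenhouse", "irrigation"],
                       "Cucumber plants need frequent irrigation."))
    index.add(Document("d3", "tomato", ["pest control"],
                       "Control pests on tomato leaves early."))
    return index


if __name__ == "__main__":
    index = build_sample_index()
    print(index.search_keyword("greenhouse"))
    print(index.search_category("tomato"))
    print(index.search_full_text("tomato need"))
    print(precision_recall(index.search_keyword("greenhouse"), {"d1"}))
```

Keeping a separate index per search type mirrors the idea of header annotations (keywords, vegetable type) being searchable independently of the document body. Precision and recall are computed against a hand-labeled set of relevant documents, which is how such an evaluation is usually set up.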
Keywords/Search Tags: Corpus, vegetable planting information, data collection, corpus tagging, corpus retrieval