A Syntactic Parser Based on Collocation and Grammar Function Matching

Posted on: 2014-12-09  Degree: Doctor  Type: Dissertation
Country: China  Candidate: R H Xu  Full Text: PDF
GTID: 1265330401469689  Subject: Linguistics and Applied Linguistics
Abstract/Summary:
Parsing is a core technology in information processing, and also one of its main difficulties. This dissertation argues that collocation knowledge and syntactic structure are closely linked, and that collocation knowledge can help improve parsing accuracy. Following the idea of building the collocation library and the parser so that each supports the other, it introduces three parsers, developed at HIT, Berkeley, and Stanford University, and proposes two methods for automatically extracting collocations at large scale, based on a comparative analysis of the three parsers' results. The first method is based on dependency analysis and compares the parsing results for the same word pairs; the second is based on phrase-structure analysis and compares the parsing results at the same syntactic level. Experiments show that both methods can obtain collocations at large scale effectively: the phrase-based method yields about 500 million collocation types from 14 years of Xinhua News Agency text, with a sampled precision of about 84%.

Using the automatically extracted collocation resources, the dissertation selects four filters to pick out preferred collocations, finds a combination that best balances precision and scale, and builds a large-scale collocation knowledge base comprising 14 data items and collocation types at the million scale. The sampled accuracy of the knowledge base exceeds 90%. Analysing the 14 data items individually and in combination further reveals the internal regularities of collocation attributes such as collocation type and collocation distance.

With this large-scale, high-quality collocation resource in place, collocation knowledge is added to a parsing algorithm based on grammar function matching, producing a new parser based on collocation knowledge and grammar function matching (CGFM).
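The dependency-based extraction method described above compares the parsers' results on the same word pairs. The abstract does not give the comparison rule, so the following is only a minimal sketch under an assumed rule: a word pair is kept as a collocation candidate when at least `min_votes` parsers attach the same head to the same dependent. The function name, the triple format, and the sample relation labels are all hypothetical, and real parser outputs would need their label sets normalized first.

```python
from collections import Counter


def extract_agreed_collocations(parses_by_parser, min_votes=2):
    """Return (head, dependent) pairs proposed by at least `min_votes` parsers.

    `parses_by_parser` maps a parser name to an iterable of
    (head, dependent, relation) triples for the same sentence.
    Relation labels are ignored here because each parser is assumed
    to use its own label set (an assumption of this sketch).
    """
    votes = Counter()
    for triples in parses_by_parser.values():
        # Deduplicate within one parser so each parser votes once per pair.
        pairs = {(head, dep) for head, dep, _rel in triples}
        votes.update(pairs)
    return {pair for pair, n in votes.items() if n >= min_votes}


# Hypothetical outputs of three parsers for one sentence.
parses = {
    "parser_a": [("eat", "apple", "VOB"), ("apple", "red", "ATT")],
    "parser_b": [("eat", "apple", "dobj")],
    "parser_c": [("eat", "apple", "dobj"), ("red", "apple", "amod")],
}
agreed = extract_agreed_collocations(parses, min_votes=3)
```

Raising `min_votes` trades scale for precision, which is the same trade-off the filters described in the abstract are meant to balance.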
Using the Xinhua News corpus as an open test corpus, the CGFM parser reached an open-test F value of about 80% in the individual performance evaluation. Compared with the parser without added collocation knowledge, the F value increased by nearly 4%. In the horizontal evaluation covering the CGFM parser, the HIT parser, the Berkeley parser, and the Stanford Parser, the CGFM parser performed strongly and stayed ahead in both the phrase analysis evaluation and the dependency analysis evaluation.
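The abstract reports F values without stating how they are computed. Assuming the usual definition, the harmonic mean of precision and recall over predicted and gold-standard items, a small sketch looks like this. The bracket format `(label, start, end)` and the sample data are illustrative assumptions, not taken from the dissertation.

```python
def f_value(predicted, gold):
    """Harmonic mean of precision and recall over two collections of items.

    Items can be labeled brackets such as (label, start, end) for phrase
    evaluation, or (head, dependent) arcs for dependency evaluation.
    """
    predicted, gold = set(predicted), set(gold)
    correct = len(predicted & gold)
    if correct == 0:
        return 0.0
    precision = correct / len(predicted)
    recall = correct / len(gold)
    return 2 * precision * recall / (precision + recall)


# Hypothetical brackets for one sentence: three of four predictions match.
gold_brackets = {("NP", 0, 2), ("VP", 2, 5), ("S", 0, 5), ("NP", 3, 5)}
pred_brackets = {("NP", 0, 2), ("VP", 2, 5), ("S", 0, 5), ("PP", 3, 5)}
score = f_value(pred_brackets, gold_brackets)
```

Here precision and recall are both 3/4, so the F value is 0.75.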
Keywords/Search Tags: Collocation, Knowledge base, Parse, Grammar function match, Horizontal evaluation