Chinese collocation extraction and its application in natural language processing

Posted on:2008-09-03

Degree:Ph.D

Type:Dissertation

University:Hong Kong Polytechnic University (Hong Kong)

Candidate:Li, Wanyin Claire

GTID:1445390005462247

Subject:Computer Science

Abstract/Summary:

Traditional approaches to collocation extraction mainly use statistical models based on co-occurrence association measures, which lead to poor performance in terms of both recall and precision. This study explores methods that use collocation features in terms of statistical significance as well as syntactic and semantic information.

The first part of this study investigates how to adapt a well-known statistics-based system, Xtract for English, to Chinese collocation extraction. In addition to parameter tuning for Chinese, an enhanced algorithm based on mutual information is developed to extract collocations with relatively low frequencies and thereby improve recall. The second part investigates methods that take syntactic information into account to eliminate pseudo-collocations and to identify low-frequency collocations that fit certain syntactic patterns. The syntactic information is based on part-of-speech tagging patterns obtained from a chunked Chinese corpus; however, the collocation extraction algorithm does not require the test data to be chunked. The third part investigates methods that take semantic information into account, using synonym information to further improve the recall of collocation extraction. The last part explores how to make use of collocation information in word sense disambiguation (WSD). Results show that collocation information can improve WSD performance by 3% to 10% across different data sets.
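As a rough illustration of the kind of mutual-information measure the abstract refers to, the sketch below scores adjacent word pairs by standard pointwise mutual information (PMI). It is not the dissertation's enhanced algorithm, whose details are not given here. The function name `pmi_scores`, the `min_count` parameter, and the assumption that the input is an already word-segmented token list are all illustrative choices.

```python
import math
from collections import Counter


def pmi_scores(tokens, min_count=1):
    """Score adjacent word pairs by pointwise mutual information (PMI).

    Assumes `tokens` is an already word-segmented list (hypothetical input
    format). Returns a dict mapping (w1, w2) to log2(P(w1, w2) / (P(w1) P(w2))).
    """
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n_uni = len(tokens)
    n_bi = max(len(tokens) - 1, 1)  # number of adjacent pairs

    scores = {}
    for (w1, w2), count in bigrams.items():
        if count < min_count:
            continue  # optional frequency cut-off
        p_xy = count / n_bi
        p_x = unigrams[w1] / n_uni
        p_y = unigrams[w2] / n_uni
        scores[(w1, w2)] = math.log2(p_xy / (p_x * p_y))
    return scores
```

Because PMI divides by the individual word probabilities, a pair that occurs only once but whose words rarely appear elsewhere can outscore a frequent pair. That property is one reason mutual information is a natural starting point for recovering low-frequency collocations, and also why a cut-off such as `min_count` is often used to limit noise.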

Keywords/Search Tags:

Collocation, Information, Chinese, Performance

Related items

1	Analysis And Study Of The Characteristics Of Chinese Three-part Causative Complexes Based On Relational Word Collocation
2	Research Of Functions Of Collocation In Students' Reading
3	The study on automatic Chinese collocation extraction
4	Information Processing-Oriented Researches On The Semantic Collocation Between The Verbs And Objects In Modern Chinese Language
5	The Representation Of Chinese EFL Learners’ Second Language Collocation And Influencing Factors
6	Research On V+N Collocation Of Frequently Used Monosyllabic Action Verbs In Teaching Chinese As A Foreign Language
7	A Corpus-based Study Of Collocation Patterns And Motivations In English-Chinese Translated Movie Scripts
8	A Survey Study Of Chinese Learners' Behavior Of Verb-Noun Collocation
9	A Study Of Verb-Noun Collocation Errors Of Chinese College Non-English Majors
10	Collocation Of Words Between English And Chinese, Comparative Analysis And Lexical Collocation Teaching Strategies In Teaching Chinese As A Foreign Language