Hybrid methods for acquisition of lexical information: The case for verbs

Posted on:2010-08-01

Degree:Ph.D

Type:Thesis

University:The Ohio State University

Candidate:Li, Jianguo

Full Text:PDF

GTID:2445390002471836

Subject:Language

Abstract/Summary:

Improved automatic text understanding requires detailed linguistic information about the words that make up the text. Particularly crucial is knowledge about predicates, typically verbs, which communicate both the event being expressed and how participants are related to that event. Although the field of natural language processing (NLP) has yet to reach a clear consensus on guidelines for building a verb lexicon suitable for NLP applications, class-based construction of verb lexicons (e.g., Levin verb classification) with explicitly stated syntactic and semantic information has proved beneficial to a wide range of NLP tasks, both in combating the pervasive problem of data sparsity and in increasing coverage. Such broad-coverage dictionaries and ontologies are difficult and costly to create and maintain by hand. It is therefore desirable to learn them from distributional information, such as can be obtained from unlabeled or sparsely labeled text corpora. To this end, this thesis primarily addresses the following three questions.

First, deriving Levin-style verb classifications from text corpora helps avoid the expensive hand-coding of such information, but appropriate features must be identified and demonstrated to be effective. One of our primary goals is to assess the linguistic conditions that are crucial for lexical classification of verbs. In particular, we experiment with different ways of mixing syntactic and lexical information for improved verb classification.

Second, Levin verb classification provides a systematic account of verb polysemy. We propose a class-based method for disambiguating Levin verbs using only untagged data. The basic working hypothesis is that verbs in the same Levin class tend to share their subcategorization patterns as well as their neighboring words. In practice, information about the unambiguous verbs in a particular Levin class is employed to disambiguate the ambiguous verbs in the same class.

Last, automatically created verb classifications are likely to deviate from manually crafted ones. It is therefore important to understand whether automatically created verb classifications can benefit the wider NLP community. We propose to integrate verb class information, automatically learned from text corpora, into a particular parsing task, PP-attachment disambiguation.
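
The class-based disambiguation idea in the abstract can be illustrated with a minimal sketch. This is not the thesis's implementation: the class labels (`CLASS_A`, `CLASS_B`), the feature names, the counts, and the use of cosine similarity are all assumptions made for illustration. The sketch builds one profile per class from the subcategorization frames and neighboring words of that class's unambiguous verbs, then assigns an ambiguous verb occurrence to the candidate class whose profile best matches its context.

```python
from collections import Counter
from math import sqrt


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_class_profiles(unambiguous):
    """Sum the feature counts of the unambiguous verbs in each class."""
    profiles = {}
    for verb_class, feature_counts in unambiguous:
        profiles.setdefault(verb_class, Counter()).update(feature_counts)
    return profiles


def disambiguate(context_features, candidate_classes, profiles):
    """Pick the candidate class whose profile best matches the context."""
    context = Counter(context_features)
    return max(
        candidate_classes,
        key=lambda c: cosine(context, profiles.get(c, Counter())),
    )


# Hypothetical toy data: (class label, feature counts) for unambiguous verbs.
# "frame:" features stand for subcategorization patterns, "word:" for neighbors.
unambiguous = [
    ("CLASS_A", {"frame:NP_PP": 5, "word:door": 3}),
    ("CLASS_A", {"frame:NP_PP": 2, "word:window": 1}),
    ("CLASS_B", {"frame:NP_S": 4, "word:that": 6}),
]
profiles = build_class_profiles(unambiguous)

# One occurrence of an ambiguous verb that could belong to either class.
result = disambiguate(
    ["frame:NP_PP", "word:door"], ["CLASS_A", "CLASS_B"], profiles
)
print(result)
```

Summing counts into a single profile per class is the simplest way to let unambiguous verbs stand in for the whole class; a real system would presumably need smoothing and corpus-derived features rather than hand-written counts.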

Keywords/Search Tags:

Information, Text, Verbs, Particular, Lexical, NLP, Class

Related items

1	A Report On The Translation Of Lexical Verbs And Phrasal Verbs In Biographical Literature
2	Semantic ambiguity in the lexical access of verbs: How data from monolinguals and bilinguals inform a general model of the mental lexicon
3	Lexical Information Density In Global Environment Of Policing
4	A Study On Monosyllabic Digging-class Hand Verbs In Mandarin
5	Chinese L2 Learners' Acquisition Of English Psych Causative Verbs
6	Exemplifications Of Verbs' Grammatical Information In English-Chinese Learner's Dictionaries
7	Study Of Verbs Bound With Adverbial And Their Lexical Chunks In Modern Chinese
8	Research On Sense Usage Of Chinese A-class Polysemous Verbs
9	A Corpus-Based Study On The Use Of English Psych Causative Verbs In College Students
10	A Study On The Cognitive Process Of Pragmatic Information Of Chinese Verbs Based On Eye-tracking Experiment