| Case frames, which describe the syntactic and semantic information of deep structures, are an important language resource for natural language processing tasks such as syntactic analysis, word sense disambiguation, and machine translation. Compared with the case frame bases constructed for English and Japanese, the construction of Chinese case frames has received little attention, and no Chinese case frame resource exists. Given this situation, the goal of this paper is to automatically construct Chinese case frames from a large-scale monolingual corpus. Case frame construction faces two key issues. The first is how to analyze a sentence semantically to obtain the predicate and its arguments. The second is how to cluster arguments according to their semantics. To address the first issue, we use semantic role labeling, a kind of shallow semantic parsing, to tag arguments automatically. For the second issue, we study three clustering methods for constructing case frames. This paper systematically studies construction methods for Chinese case frames. The main contributions are summarized as follows. (1) We propose a Chinese semantic role labeling method based on deep learning. The goal of semantic role labeling is to identify and label the arguments of predicates in a given input sentence. Traditional methods suffer from problems such as complicated feature engineering and the lack of dependence between adjacent word tags. This paper makes the following six improvements. 1) We build a Bi-LSTM model to learn features automatically. 2) To capture more semantic information, we increase the depth of the Bi-LSTM. 3) A transition matrix is introduced to constrain the labels of adjacent words. 4) To exploit the strong dependencies between labels, we build a tagging decision model based on Conditional Random Fields. 5) We introduce a gate mechanism to adjust the word representation. 6) We explore the use of dependency structure in semantic role labeling. Evaluation experiments on the open test data show that the Chinese semantic role labeling system improves by 1.84% in F1 score.
(2) We design and implement three clustering algorithms and compare their performance in case frame clustering. We cluster the arguments tagged by semantic role labeling to obtain the case frames. Since the semantics of a predicate depends mainly on its objects, we focus on the objects when clustering the arguments. We implement three clustering methods: the Chinese Restaurant Process algorithm, an improved K-means clustering algorithm based on maximum distance, and the DBSCAN algorithm. To verify the effectiveness of the proposed methods, we build a clustering evaluation data set. The experimental results show that, for the same verb, arguments with similar semantics are grouped together well. Among the three methods, the Chinese Restaurant Process algorithm obtains the best clustering performance, with an F1 score of 80.97%. (3) Applying the proposed method, we construct Chinese case frames for high-frequency verbs. We select 30 high-frequency Chinese verbs and construct their case frames. The results show that each verb has 30 semantic frames on average and that arguments with similar semantics are clustered into the same semantic frame very well. The results reflect the richness and accuracy of the obtained Chinese case frames and further verify the proposed method. In summary, evaluation experiments show that the proposed method of Chinese case frame construction performs well in semantic role labeling, argument clustering, and case frame base construction. |
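
To illustrate how a transition matrix can constrain the labels of adjacent words in semantic role labeling, the following is a minimal sketch of Viterbi decoding over per-word label scores. It is not the paper's implementation: the label set, the score values, and the function name `viterbi_decode` are assumptions made for illustration, and in a real system the emission scores would come from the Bi-LSTM and the transition scores would be learned.

```python
import numpy as np

def viterbi_decode(emissions, transitions):
    """Return the highest-scoring label sequence as a list of label indices.

    emissions:   (T, K) array, score of label k for the word at position t
    transitions: (K, K) array, score of moving from label i to label j
    """
    T, K = emissions.shape
    score = emissions[0].copy()
    backptr = np.zeros((T, K), dtype=int)
    for t in range(1, T):
        # candidate[i, j]: best path ending in label i at t-1, then label j at t
        candidate = score[:, None] + transitions + emissions[t][None, :]
        backptr[t] = candidate.argmax(axis=0)
        score = candidate.max(axis=0)
    # Follow the back-pointers from the best final label.
    best = [int(score.argmax())]
    for t in range(T - 1, 0, -1):
        best.append(int(backptr[t, best[-1]]))
    return best[::-1]

# Hypothetical three-label scheme; "I-A0" may not directly follow "O".
LABELS = ["O", "B-A0", "I-A0"]
transitions = np.zeros((3, 3))
transitions[0, 2] = -1e4  # large penalty for the invalid step O -> I-A0

# Hypothetical per-word scores for a three-word sentence.
emissions = np.array([[0.1, 2.0, 0.0],
                      [1.0, 0.0, 0.9],
                      [0.2, 0.0, 1.5]])

greedy = emissions.argmax(axis=1).tolist()      # ignores adjacent labels
path = viterbi_decode(emissions, transitions)   # respects the constraint
print([LABELS[i] for i in greedy])
print([LABELS[i] for i in path])
```

Choosing each label independently yields the invalid sequence O followed by I-A0, while decoding with the transition matrix selects a sequence that avoids the penalized step.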
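
The Chinese Restaurant Process named above can be sketched through its seating rule alone: a new item joins an existing cluster with probability proportional to that cluster's size, or opens a new cluster with probability proportional to a concentration parameter. The sketch below shows only this prior. How the paper combines it with the semantic similarity of arguments is not stated in the abstract, so the function names, the parameter `alpha`, and the seeded sampler are assumptions for illustration.

```python
import random

def crp_probabilities(table_sizes, alpha):
    """Seating probabilities for the next item under the CRP prior.

    Returns one probability per existing table, followed by the
    probability of opening a new table. Requires alpha > 0.
    """
    denom = sum(table_sizes) + alpha
    probs = [size / denom for size in table_sizes]
    probs.append(alpha / denom)
    return probs

def crp_sample(num_items, alpha, seed=0):
    """Seat num_items one by one; return (assignments, table_sizes)."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    assignments, table_sizes = [], []
    for _ in range(num_items):
        probs = crp_probabilities(table_sizes, alpha)
        table = rng.choices(range(len(probs)), weights=probs, k=1)[0]
        if table == len(table_sizes):
            table_sizes.append(1)      # the item opens a new table
        else:
            table_sizes[table] += 1    # the item joins an existing table
        assignments.append(table)
    return assignments, table_sizes

probs = crp_probabilities([3, 1], alpha=1.0)
assignments, sizes = crp_sample(20, alpha=1.0)
print(probs)
print(sizes)
```

One reason this prior suits case frame clustering is that the number of clusters does not have to be fixed in advance, unlike K-means; a larger `alpha` tends to produce more clusters.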