| China and Thailand have a long history of friendly exchange. Today the two countries are not only politically friendly and economically close but also increasingly linked in military affairs. Timely and effective access to important information in Thai news reports therefore plays an important role in the friendly development of relations between China and Thailand. In a Thai news report, the sentences that describe news events form the core of the story. Thai news event extraction takes event information from unstructured news text and presents it in structured form, enabling quick and efficient access to the required news information. This paper therefore addresses the key issues in Thai news event extraction and completes the following research work: (1) Construction of a Thai news event corpus. A corpus plays a fundamental role in the study of Thai news events. Because research on Thai news event corpora is currently lacking, this paper defines Thai news events, applies a series of annotations to the obtained Thai news corpus, including events, event categories, and triggers, and stores the result as a Thai news event corpus. (2) Automatic extraction of trigger words in Thai news events. At this early stage of research on Thai news events, no scholar has yet constructed a Thai trigger word list. This paper therefore builds an initial trigger word list for Thai news events and proposes a method that combines the trigger word list with machine learning to extract the trigger words of Thai news events. Experiments show that the method is feasible and performs well. (3) A Thai news event extraction method based on cross-language semi-collaborative training. Thai news corpora are scarce, and traditional self-training methods pass errors from top to bottom during extraction, so the performance of the Thai news event extraction system is low. This paper maps information obtained from a Chinese event extraction system onto Thai and uses it to guide Thai news event extraction. Experiments show that combining monolingual and cross-language semi-collaborative training significantly improves the performance of the Thai news event extraction system. (4) A method based on maximum entropy for identifying the elements of Thai news events. First, the characteristics of the Thai language are analyzed, a Thai dependency tree is constructed, and candidate event elements are obtained according to predefined event templates. A maximum entropy model is then built from trigger word features, context features, and Thai dependency relation features to recognize the elements of Thai news events. The experimental results show that incorporating Thai dependency features identifies the elements of Thai news events more effectively. |
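
For reference, a maximum entropy model of the kind named in item (4) is conventionally written in the form below. This is the standard textbook formulation, not a formula taken from the paper, and all symbols are assumptions introduced here for illustration: x is a candidate event element together with its context, y is an element label, the f_i are feature functions (for example over the trigger word, the context, and the dependency relation), and the λ_i are learned weights.

```latex
% Standard conditional maximum entropy model (illustrative notation, assumed here)
P(y \mid x) = \frac{1}{Z(x)} \exp\Big( \sum_{i} \lambda_i f_i(x, y) \Big),
\qquad
Z(x) = \sum_{y'} \exp\Big( \sum_{i} \lambda_i f_i(x, y') \Big)
```

Under this reading, a candidate would be assigned the label y that maximizes P(y | x), and adding dependency relation features amounts to adding further f_i terms to the sum.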