A Study On The Construction Of Mongolian News Corpus And Related Issues

Posted on: 2017-01-13  Degree: Doctor  Type: Dissertation
Country: China  Candidate: H B Y E En  Full Text: PDF
GTID: 1105330485466589  Subject: Chinese Ethnic Language and Literature
Abstract/Summary:
A corpus is a collection of authentic natural-language texts assembled according to definite principles. A ten-million-scale Mongolian corpus has already been constructed, and the processing of Mongolian corpora covers many aspects, such as morphological processing, syntactic analysis, and semantic tagging. This study constructs a Mongolia news corpus from the published press of Mongolia, examines the language of the corpus and its applications, and finally develops a corpus management program for use in various kinds of computational linguistics research. The main contents of the dissertation are as follows. The introduction presents the research background and the current state of research, the purpose of the study, its methodological innovations, and its significance. The first chapter discusses the method and process of constructing the Mongolia news corpus, which is intended for Mongolian information processing and for research on news language. Against the background of the current state and development of corpora, their classification, and their practical value, it describes how the news texts were collected and classified. The second chapter studies language use in the Mongolia news corpus, including the state of news language use and its main features; it also summarizes the main Mongolian orthographic errors found in the corpus and illustrates them with examples. The third chapter introduces the software developed in the study, including word frequency statistics software and a multifunction search program that facilitates text search.
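The abstract does not describe how the word frequency and search tools are implemented. As a rough illustration only, the two functions could be sketched in Python as below. The tokenization rule (runs of Cyrillic letters, lowercased) and the names `tokenize`, `word_frequencies`, and `search` are assumptions made for this sketch, not details taken from the dissertation.

```python
import re
from collections import Counter

# Assumption: a "word" is a run of Cyrillic letters, including the
# Mongolian-specific letters Ө/ө and Ү/ү and the letter Ё/ё.
WORD_RE = re.compile(r"[А-Яа-яЁёӨөҮү]+")


def tokenize(text):
    """Split a text into lowercased Cyrillic word tokens."""
    return [w.lower() for w in WORD_RE.findall(text)]


def word_frequencies(texts):
    """Count how often each token occurs across a list of texts."""
    counts = Counter()
    for text in texts:
        counts.update(tokenize(text))
    return counts


def search(texts, keyword):
    """Return (index, text) pairs for texts containing keyword as a whole token."""
    key = keyword.lower()
    return [(i, t) for i, t in enumerate(texts) if key in tokenize(t)]
```

Calling `word_frequencies(texts).most_common(10)` on a list of news articles would list the ten most frequent tokens. A real tool for Mongolian would likely also need morphological handling, which this sketch leaves out.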
Keywords/Search Tags: Mongolian corpus, news text, Cyrillic, Mongolian orthography, software