Research On Constructing The Textbook Of Todo And Its Related Technology

Posted on: 2017-04-17  Degree: Doctor  Type: Dissertation
Country: China  Candidate: K D L G R Meng
GTID: 1105330485966587  Subject: Chinese Language and Literature
Abstract/Summary:
Building the Todo corpus and developing application programs for it are significant parts of building the "Mongolian Language Resources Platform" and form the basis for digitizing Todo documents and sharing resources. At present, the main problem to be solved is improving the Todo coding system. Building on "Todo corpus building and its related theoretical approaches", this study reflects in depth and comprehensively on the coding system that Todo requires. The collection and collation of the Todo corpus currently cover two aspects: first, the new Todo script, with the heroic epic "Jangar" as a case study; second, the classical Todo script, with documents as a case study. The dissertation compares the characteristics of Todo in different periods across the corpus and generalizes the "Nominal Character", "Presentation Character", "Mandatory Ligature" and "Non-Mandatory Ligature" sets that should be added to the existing Todo coding system.

The contents of the dissertation are summarized as follows.

The introduction covers the object of study, previous studies, the basis for the topic selection, the aim and significance of the topic, the theoretical research methods, the scale of the materials collected and the structure of the dissertation. The previous research involves three aspects: (1) Todo alphabet transcripts for beginners; (2) research achievements on Todo; (3) trends in the use of Todo in the field of information technology.

The first chapter describes the Todo coding system in detail. (1) It discusses the characters and punctuation that should be added to the present Todo coding system: the major domestic units are MenkSoft company, Beijing Founder Electronics Co., Ltd. Weifang, Beida Jade Bird, Huaguang Phototypesetting and Inner Mongolia computer University; the major foreign R&D countries are Mongolia and Japan.
(2) It describes the process of developing the "National Standard of Todo Encoding", listing the "Nominal Character", "Presentation Character", "Mandatory Ligature", "Non-Mandatory Ligature", "Digit", "Punctuation" and "Control Characters" that should be added to the existing Founder Todo coding system. (3) It states the implementation issues in the National Standard of Todo Encoding system: the rules for converting Todo nominal characters to presentation characters. (4) It elaborates on the usage rules of Todo control characters.

The second chapter focuses on the Todo corpus. First, it gives a general description of Todo documents, the collection area, catalog statistics and the progress of the collection work. Second, it introduces the basis and intended use of the Todo Latin transliteration scheme. Third, it presents the related work on the corpus of Todo documents: (1) construction of the database of Todo documents; (2) the corpus of Todo documents, consisting of a text base (Latin transliteration) and a gallery (scanned files). Finally, it briefly introduces how to link the Middle Age document corpora, namely the "Uighur Mongolian Documents", the "Phags-pa script document corpus" and the "Todo document corpus", and the problems of communication among them. Two methods are considered, each with advantages and disadvantages: first, taking the letter as the unit, formulating one Latin transliteration scheme shared by the three scripts; second, taking the word as the unit, developing an electronic dictionary of the vocabularies of the three scripts.

The third chapter briefly introduces the development stage of the Todo document corpus application. To keep up with the popularity of Internet technology and mobile terminals, the dissertation adopts the open-source, widely used and cross-platform PHP+MySQL+Apache stack in developing the application program.
Based on this, it then describes the database design, program flow chart and application interface in detail.

Chapter four describes in detail the processing and application methods of the "Jangar Corpus", taking the new Todo script as an example. The dissertation takes information extraction technology as guidance and builds an entity dictionary of the "Jangar Corpus". Furthermore, to expand the "Jangar Corpus", a preliminary electronic dictionary of Todo, Cyrillic and Traditional Mongolian vocabularies has been constructed, with 2,526 entries included so far. Finally, it introduces the design and implementation of the "Jangar Corpus" application program.
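
As a rough illustration of the nominal-to-presentation conversion mentioned for the first chapter, the following Python sketch labels each nominal character in a word by its position and builds a lookup key for a presentation form. It is a minimal sketch under stated assumptions, not the dissertation's rule set: the four position labels, the function names and the key format are hypothetical, the sample string is an arbitrary sequence of Todo letters from the Unicode Mongolian block rather than a real word, and ligatures and control characters are ignored.

```python
# Sketch only: the position labels and key format are assumptions; the
# dissertation's actual conversion rules are not given in the abstract.


def positional_forms(word):
    """Pair each nominal character with an assumed positional label."""
    if not word:
        return []
    if len(word) == 1:
        return [(word, "isolated")]
    # First character is initial, last is final, everything between is medial.
    labels = ["initial"] + ["medial"] * (len(word) - 2) + ["final"]
    return list(zip(word, labels))


def presentation_keys(word):
    """Build hypothetical lookup keys such as 'U+184B.initial'."""
    return [f"U+{ord(ch):04X}.{pos}" for ch, pos in positional_forms(word)]


# Arbitrary stand-in input: three Todo letters, not a real word.
sample = "\u184B\u1844\u1846"
print(presentation_keys(sample))
```

A real implementation would look each key up in a table of presentation characters and would first apply the mandatory and non-mandatory ligature rules; those tables are outside the scope of this sketch.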
Keywords/Search Tags: Todo Scripts, Corpus, Coding System, The Epic of Janggar