A Study On Language Formalization

Posted on: 2012-10-02
Degree: Doctor
Type: Dissertation
Country: China
Candidate: M Wang
GTID: 1115330368976426
Subject: Foreign Linguistics and Applied Linguistics
Abstract/Summary:
This thesis studies the general principles and methods of language formalization, mainly from the perspectives of linguistics and computer science. Besides the introduction, it consists of six chapters, which cover formal phonetics, formal semantics, formal grammar, formal pragmatics and rhetoric, and the formal writing system, among other topics. The main content and views of each chapter are summarized below.

The first chapter examines the relation between linguistic form and meaning and argues that the integration of form and meaning is the fundamental principle of both linguistics and formal linguistics. This principle is the guideline of the thesis. The chapter also examines the disciplines that support formal linguistics, its role in the linguistic system, and its levels and basic framework.

The second chapter deals with formal phonetics. It first studies the three phonetic attributes and their internal relations, which form the basis of formal phonetics and an important reference for designing phonetic coding and compression schemes. The basic process of formal phonetics consists of sampling, quantization, and coding. Depending on the characteristics of the phonetic attributes, techniques such as nonuniform quantization, differential quantization, vector quantization, frequency-domain waveform coding, and parametric coding can be used to improve the efficiency and quality of formal phonetics. The chapter also studies voice compression, the naturalness of speech synthesis, and probabilistic models of speech recognition.

Chapter 3 studies formal semantics, which is the main focus of the thesis. It first examines the basic framework and tools of the symbolic paradigm, including the Turing machine, the finite-state automaton, and regular expressions, as well as several typical symbolism-based methods of formal semantics, including semanteme analysis, logical semantic analysis, and parts of speech. However, none of these methods is satisfactory, because they all neglect the infinity of the semantic system.
Any finite rewriting of the semantic system loses meaning, breaks the system's integrity, and eventually fails. Connectionism, on the other hand, is based on physiological structure and regards the human brain as a complex network of interrelated nodes, characterized by parallel distributed processing, fault tolerance, self-learning, forgetting, rule emergence, and so on. It closely resembles the conceptual network of the human brain and is therefore the ideal model for formal semantics. Computer language, a typical tool of symbolic description, severely lacks intelligent ability and has shown an obvious turn toward connectionism. Fuzziness is another basic problem of formal semantics. The fuzziness of language does not originate from the finiteness of linguistic units or from the fuzziness of the objective world; it comes from cognitive styles. The processes of comparison and conceptualization are the keys to the emergence of fuzziness, which in turn greatly improves the efficiency of cognition. What the symbolic paradigm discards in the finite rewriting of semantics is precisely this fuzziness, whereas the connectionist paradigm can cover both accuracy and fuzziness.

Chapter 4 discusses formal grammar. Conceptual meanings are explicit and open, while grammatical meanings are implicit and closed. The abstraction of conceptual meanings into grammatical meanings proceeds from explicitness to implicitness and from infinity to finiteness, and it is constrained by the Economy Principle of language development. Conceptual relations are multidimensional and generally interrelated. Moving from deep conceptual structure to surface syntactic structure is a process of serialization and dimensionality reduction. Grammar is the product of a mechanism that compensates for the resulting loss of multidimensional information. The finiteness of grammatical units determines formal semantics.
The symbolic paradigm is competent in this domain. The chapter also discusses some concrete problems and difficulties of formal grammar, including context-free grammars and N-grams, word classification, Chinese word segmentation, and part-of-speech tagging. Finally, it takes the "ba" structure as an example and points out that the "ba" structure expresses a competitive relation among individuals of different kinds. Based on this example, the semantic roles of the syntactic structure are presented: superior competitors, inferior competitors, competitive ways, and competitive outcomes.

Chapter 5 examines formal pragmatics and rhetoric, which are based on formal context, including participant information, the external environment, conversational context, linguistic knowledge, general knowledge, and sociocultural knowledge. For practical reasons, the formal context no longer distinguishes between linguistic knowledge and general knowledge; it is the combination of the factors that affect the expression and understanding of meaning. The chapter discusses how a basic context class, written in C++, works in concrete language communication. Although the discussion is not complete, it is a new attempt. The chapter also discusses a special figure of speech, synaesthesia. Synaesthesia is the transfer or empathy among the five senses, and also the communication or fusion of introspective mood and emotion. There is no essential difference between the cognitive psychology of synaesthesia and that of traditional figures of speech such as metaphor and analogy. They occupy different regions of a mental continuum and can thus be connected into a general synaesthesia category. The mental continuum is an important reference model for formalizing figures of speech.

The last chapter discusses the formal writing system. It first discusses the amount of information, that is, the concept of entropy.
It points out that many features of Chinese characters, including their complex forms, large number, many differences, and large amount of information, are closely related to their high entropy. On closer observation, a linguistic universalism hidden beneath entropy can be found. The complexity of the lexical system is the fundamental standard for measuring the level of development of a language. The second part of the chapter explains the concrete content of the formal writing system. It focuses on internal code, external code, and graphemic code, including the advantages and disadvantages of various formal schemes. Finally, it discusses the basic principles and implementation of character recognition.
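
The entropy measure mentioned for the writing system can be illustrated with a small sketch. The snippet below is not taken from the thesis. It assumes the standard Shannon definition, H = -sum(p_i * log2(p_i)), applied to single-character frequencies, and the function name `char_entropy` and the sample strings are hypothetical choices made only for illustration.

```python
import math
from collections import Counter


def char_entropy(text: str) -> float:
    """Shannon entropy (bits per character) of the character distribution in text.

    Assumption: whitespace is ignored and each character is treated as an
    independent symbol; the thesis may use a different estimation method.
    """
    counts = Counter(ch for ch in text if not ch.isspace())
    total = sum(counts.values())
    if total == 0:
        return 0.0
    # H = -sum(p * log2(p)) over the observed characters
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


if __name__ == "__main__":
    # Hypothetical samples; real estimates need a large corpus.
    for sample in ["abab", "abcd", "语言形式化研究"]:
        print(sample, round(char_entropy(sample), 3))
```

For a script with N equally likely symbols the entropy reaches its maximum, log2(N), so under this definition a writing system with a larger character inventory can carry more bits per character than a small alphabet.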
Keywords/Search Tags: Language, Formalization, Phonetics, Encoding, Speech Recognition, Semantics, Connectionism, Symbolism, Fuzziness, Syntactics, the Ba Sentence, Pragmatics, Rhetorics, Synaesthesia, Character System, Entropy, Information, Unicode