Study on the Use Frequency, Mutual Information and Extension of Common Idioms

Posted on: 2023-01-02
Degree: Master
Type: Thesis
Country: China
Candidate: M M Zhang
Full Text: PDF
GTID: 2555307097483314
Subject: Chinese Language and Literature
Abstract/Summary:
Idiomatic expressions are a kind of fixed phrase, on a par with language units such as idioms, proverbs, and epithets. They are a type of familiar expression that is highly active in language use, and they have considerable research value in lexicology. It is generally accepted in academic circles that idiomatic expressions are fixed phrases in common use with a settled form; at the same time, it is also held that some of them allow other components to be inserted as expression requires. These views concern two attributes of idiomatic expressions: how commonly they are used and how fixed they are. Quantitative studies of these two attributes are currently rare. Building on previous scholarship, this thesis applies the methods of quantitative linguistics to the State Language Commission's annotated corpus of modern Chinese. It measures how commonly used and how fixed these expressions are by means of frequency of use, mutual information value, and degree of extension, and on that basis discusses the quantitative characteristics of common idiomatic expressions, in the hope of refining their definition.

The thesis is divided into five chapters. Chapter 1 is the introduction. It covers the definition and structural distribution of idiomatic expressions, their mutual information, and their extension, and sets out the objects, approach, and methods of the study.

Chapter 2 examines how frequently common idiomatic expressions are used. By extracting and calculating frequencies of use and examining the frequency scatterplot, it divides the expressions into four levels ranging from high through medium-high and medium-low to low, and analyzes the structural distribution and corpus annotation characteristics of the expressions at each level.

Chapter 3 examines the mutual information of common idiomatic expressions. Drawing on the method used to calculate mutual information for phrases, and taking into account the structural characteristics of idiomatic expressions together with their grammatical relations and native-speaker intuition, it calculates the mutual information values between the components of each expression through software extraction and manual verification. The mutual information scatterplot is then used to divide the expressions into high, medium, and low levels, and the structural distribution and corpus annotation characteristics of each level are analyzed.

Chapter 4 examines the degree of extension of common idiomatic expressions. It investigates the types of extension, the inserted components, and the positions at which extension occurs, using software extraction and manual verification. It derives a new calculation method from the numerical relationships observed at different positions and uses it to compute the degree of extension of each expression. The extension-degree scatterplot is then used to divide the expressions into high, medium, low, and non-extensible levels, and the structural distribution and corpus annotation characteristics of each level are analyzed.

Chapter 5 builds on the data analysis to reflect on questions such as the inclusion criteria for dictionaries of idiomatic expressions. By comparing how idiomatic expressions are annotated, it also offers suggestions for problems in the corpus annotation process.
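
The abstract does not state the formula used for mutual information between the components of an expression. As a rough illustration only, the sketch below assumes the standard pointwise mutual information definition, PMI(x, y) = log2(P(x, y) / (P(x) P(y))), with probabilities estimated from corpus counts. The function name, its parameters, and the counts in the usage lines are hypothetical and are not taken from the thesis.

```python
import math


def pointwise_mutual_information(pair_count, left_count, right_count, total):
    """Estimate PMI (in bits) between two components from raw corpus counts.

    Assumption: probabilities are simple relative frequencies (count / total).
    The thesis may use a different estimator or a structure-specific variant.
    """
    if min(pair_count, left_count, right_count) <= 0 or total <= 0:
        raise ValueError("all counts must be positive")
    p_xy = pair_count / total   # probability of the two components co-occurring
    p_x = left_count / total    # probability of the first component
    p_y = right_count / total   # probability of the second component
    return math.log2(p_xy / (p_x * p_y))


# Hypothetical counts: the pair occurs 10 times, its components 20 and 50 times,
# in a corpus of 1000 units.
score = pointwise_mutual_information(10, 20, 50, 1000)
print(round(score, 2))
```

A higher score means the components co-occur more often than chance would predict, which is one way to read the "fixedness" that the abstract describes. A score near zero means the components behave as if independent.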
Keywords/Search Tags:Idiom, Use Frequency, Mutual Information, Extension