Research On Knowledge Organization & Content Mining Of The Chinese Local Chronicles

Posted on: 2008-07-22
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Z Q Heng
Full Text: PDF
GTID: 1115360245498663
Subject: History of science and technology
Abstract/Summary:
In the 1950s, under the direction of Wan Guoding, materials on agricultural produce were extracted from more than 6,000 Chinese local chronicles held in libraries across China. The materials were compiled into a series titled Local Chronicle: Produce, comprising 431 volumes and about 30,000,000 characters. The series covers every aspect of agricultural production; its main content concerns animal and plant variety resources and the techniques for raising and cultivating them. It is highly systematic and preserves the agricultural production materials of the Ming Dynasty, the Qing Dynasty and the Republic of China. Because of its extremely high value as source material for fields such as the history of agricultural science and technology and economic history, scholars in China and abroad have attached great importance to it. However, the series is a rare hand-written copy that is extremely brittle, easily damaged and inconvenient to use, so it is both important and urgent to employ modern information technology to protect, disseminate and make use of it.
Taking Local Chronicle of Guangdong: Produce as the example, this thesis explores methods for digitizing Local Chronicle: Produce. We construct an information organizing system for the produce, whose functions include full-text retrieval and indexing of produce names and cited book titles. We also carry out a statistical analysis of the produce and the cited books in the series, with emphasis on the alternate names of the produce, the classification of the produce, and the ways in which books are cited. Finally, based on the cited-book data of Lingnan Congshu, bibliometric analyses are carried out on the historical periods of the books, the highly cited books, the regions of origin of the authors, and the disciplines of the cited books.
Firstly, the produce analysis includes a statistical analysis of the produce, a study of produce classification, and a study of alternate names.
(1) The statistical analysis of the produce covers all the produce data in Local Chronicle of Guangdong: Produce, broken down by historical period and by region. Historical period: we calculated and analyzed the average number of produce entries per book and the average size per book for four periods, namely the Yuan Dynasty, the Ming Dynasty, the Qing Dynasty and the Republic of China. We conclude that the single book from the Yuan Dynasty has no statistical significance; that the average number of produce entries is highest for Ming Dynasty books, next for Republic of China books, and lowest for Qing Dynasty books; and that the average size is largest for Republic of China books, next for Qing Dynasty books, and smallest for Ming Dynasty books. In general, from the Ming Dynasty to the Qing Dynasty to the Republic of China, the produce records become more and more detailed. The reason is that Chinese science and technology were developing in modern times and that Western science, technology and culture were spreading into China, both of which influenced the compilation of Chinese local chronicles. Region: the local chronicles are first divided into three levels, namely province, district and county. The statistical analysis shows that the average number of produce entries per book decreases from the province level to the district level to the county level, which agrees with the natural pattern that a vast territory has abundant resources and a small one has few. The local chronicles are then classified into four larger regions: Western Guangdong, the Pearl River Delta, Northern Guangdong and Eastern Guangdong. The statistical results indicate that the average number of produce entries decreases gradually in that order.
(2) Produce classification: each produce item is classified into one of four categories: plant, animal, mineral, or goods. The category headings used in the chronicles have several characteristics: a heading name may express several aspects of content, the classification standard is not uniform, and heading names reveal different levels. The bases for classifying plants are attribute, economic use, appearance, living conditions, whether domesticated or not, and the modern biological classification system. The bases for classifying animals are attribute, appearance, living conditions, whether domesticated or not, and the modern biological classification system. The bases for classifying goods are attribute, quality of material, method of manufacture, raw material, and mode of transport.
(3) Alternate names: many produce items have multiple names, whose forms and origins vary. They include names introduced by alternate-name words, taboo names, regional names, literary names, and names used in particular trades.
Secondly, the statistical analysis of the cited books includes a statistical analysis of the whole citation data, an analysis of the ways of citing, and a bibliometric analysis of the books cited in Lingnan Congshu.
(1) The statistical analysis of the whole citation data is carried out from two sides: the source chronicles and the cited books. The analysis of the source chronicles is again broken down by historical period and by region. Historical period: the citing instances in the source chronicles are analyzed for four historical periods, namely the Yuan Dynasty, the Ming Dynasty, the Qing Dynasty and the Republic of China. The single book from the Yuan Dynasty has no statistical significance. The mean number of citations increases in chronological order from the Ming Dynasty to the Qing Dynasty to the Republic of China, and the mean for the Republic of China is far higher than for the other two. This further indicates that the development of Chinese science and technology in modern times, and the spread of Western science, technology and culture into China, deeply influenced the compilation of Chinese local chronicles. Region: the source chronicles were divided into four regions, namely Western Guangdong, the Pearl River Delta, Northern Guangdong and Eastern Guangdong, and their citing instances were analyzed. The analysis shows that the province-level books have the highest mean number of citations. The wider the scope of a chronicle, the more produce it covers and the more literature is cited in compiling it; moreover, officials hired the most outstanding scholars to compile these books, and such scholars wrote rigorously and in an excellent style. In addition, some authors of private works cited broadly from encyclopedic sources. Highly citing books were the result. Among the local regions, the source chronicles of the Pearl River Delta cite the most books, followed in order by those of Western Guangdong, Eastern Guangdong and Northern Guangdong. Cited books: all the cited works are divided into two kinds. The first is poems, ballads and proverbs, which are scattered and cannot be attributed to a single monograph; the second is papers and monographs. Poems, ballads and proverbs, cited 2,141 times, are historical materials in literary form, and they show that people of the time expressed their sentiments through the produce.
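The per-period and per-region figures described above (produce entries per book, citations per book) amount to a grouped mean. The following is a minimal sketch of that calculation, not the procedure used in the thesis. The record layout, the function name `average_by_group`, the `min_books` threshold and all numbers are assumptions made up for illustration.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records: (chronicle title, period, number of produce entries).
# These figures are invented for illustration and are NOT data from the thesis.
records = [
    ("Chronicle A", "Yuan", 50),
    ("Chronicle B", "Ming", 120),
    ("Chronicle C", "Ming", 100),
    ("Chronicle D", "Qing", 60),
    ("Chronicle E", "Qing", 80),
    ("Chronicle F", "Republic of China", 90),
    ("Chronicle G", "Republic of China", 100),
]


def average_by_group(rows, min_books=2):
    """Return {group: mean count per book}.

    Groups with fewer than `min_books` books are skipped, mirroring the
    remark that a single book has no statistical significance.
    """
    grouped = defaultdict(list)
    for _title, group, count in rows:
        grouped[group].append(count)
    return {
        group: mean(counts)
        for group, counts in grouped.items()
        if len(counts) >= min_books
    }


averages = average_by_group(records)
for period, value in averages.items():
    print(f"{period}: {value}")
```

The same function could be reused for the regional breakdown by placing the region, or the administrative level, in the second field of each record.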
These materials have three origins: the writings of Lingnan natives, the writings of literati who served as officials in Lingnan, and folk literature. Papers and monographs are cited 29,529 times. Their composition has three characteristics: the local literature of Lingnan is cited heavily; the Interview Book, which recorded actual produce conditions, is cited heavily; and Chinese medicine literature is cited heavily, which demonstrates the significant medicinal value of Lingnan produce.
(2) Ways of citing: the thesis extracted all the title-citing patterns and citing expression patterns from Local Chronicle of Guangdong: Produce. The title-citing patterns consist of the literature title alone, the author name alone, and the literature title plus the author name. The citing expression patterns consist of the front sign type, the back sign type and the enclosed type. All of these patterns were applied to recognize the cited books.
(3) The bibliometric analysis targets the papers and monographs cited in Lingnan Congshu, which amount to 2,296 citations of 351 different books. The period analysis shows that, ranked from most to fewest cited titles, the periods are: the Song and Yuan Dynasties; the Qing Dynasty; the Three Kingdoms, Jin, and Northern and Southern Dynasties; the Ming Dynasty; the Sui, Tang and Five Dynasties; the Qin and Han Dynasties; and the pre-Qin era. That the Song and Yuan period has the most cited titles indicates that science and technology in that period were the most prosperous in China's feudal society. The frequency analysis shows that the most frequently cited book, with 207 citations, is Guangdong Xinyu by Qu Dajun, which was the most valuable reference for Deng Chun, the author of Lingnan Congshu. The regional analysis shows that authors from the lower reaches of the Yangtze River account for the most titles and that authors from the Lingnan area have the highest citation frequency. In other areas, such as the Yellow River valley, the Two Lakes area and the southwest, both the number of cited titles and the citation frequency are lower, and no cited book comes from the northeast. The discipline analysis shows that miscellanies have the most titles and the highest citation frequency, which tells us that miscellanies were the main information source for Lingnan Congshu. Other disciplines are cited less: notes and commentaries, traditional Chinese medicine books, books on produce and natural science, history and geography books, collections of literary works, notes on poets and poetry, ancient agricultural books, and officially sponsored local chronicles. Together, these statistical analyses outline the sources of material and the content structure of Lingnan Congshu.
Thirdly, the produce information organizing system includes the full-text database, the produce index and the cited-book index.
(1) The construction of the full-text database starts from an analysis of writing styles, which outlines a standard narrative form for produce in local chronicles; the database fields are designed on that basis. The main functions of the database system are full-text retrieval, keyword retrieval, cluster retrieval and data statistics.
(2) The produce index subsystem recognizes the alternate names of the produce with pattern recognition methods and constructs a produce-name index dictionary, which is applied together with the dictionary of formal produce names to index the produce. The subsystem's functions are pattern maintenance, synonym recognition, item database maintenance, index building and browsing.
(3) The cited-book index subsystem mines the cited books using the linguistic patterns of citing expressions, of cited book titles and of author names, and constructs a cited-book title dictionary to index the cited books. The subsystem's functions are pattern database maintenance, pattern recognition, item database maintenance, index building and browsing.
There are many possible means of organizing the information in Chinese local chronicles, but this thesis uses only a few of them: the full-text database, the produce index, the cited-book index, the statistical analysis of the produce, and the statistical analysis of the cited books. We hope that this thesis offers methods and clues for organizing the information in Chinese local chronicles.
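The pattern-based recognition used by the two index subsystems can be illustrated with regular expressions. This is a minimal sketch only. The abstract does not define the front sign, back sign and enclosed types or list the actual patterns, so the cue characters, the alternate-name cue words, the reliance on 《》 title marks, the reading of "enclosed" as a title with a cue on both sides, and the sample sentence are all assumptions introduced here.

```python
import re
from collections import Counter

# ASSUMPTION: hypothetical citing cue characters, not the thesis's pattern set.
FRONT_CUES = "按见引据"   # cue placed before the title ("front sign")
BACK_CUES = "云曰载"      # cue placed after the title ("back sign")

# ASSUMPTION: titles are wrapped in 《》; the hand-written source may not be.
CITATION = re.compile(
    rf"(?P<front>[{FRONT_CUES}])?《(?P<title>[^》]+)》(?P<back>[{BACK_CUES}])?"
)

# ASSUMPTION: hypothetical alternate-name cue words.
ALIAS = re.compile(r"(?:一名|又名|俗呼)(?P<alias>[^,。;、]+)")


def find_citations(text):
    """Return (title, pattern type) pairs for each cited title in `text`."""
    results = []
    for m in CITATION.finditer(text):
        if m["front"] and m["back"]:
            kind = "enclosed"      # cue on both sides of the title
        elif m["front"]:
            kind = "front sign"
        elif m["back"]:
            kind = "back sign"
        else:
            kind = "unmarked"
        results.append((m["title"], kind))
    return results


def find_aliases(text):
    """Return the alternate names that follow a cue word."""
    return [m["alias"] for m in ALIAS.finditer(text)]


# Made-up sample sentence, written only to exercise the patterns.
sample = (
    "按《广东新语》云荔枝以增城为最。见《本草纲目》。"
    "《岭表录异》曰橄榄树高。荔枝,一名丹荔,又名离枝。"
)
citations = find_citations(sample)
title_counts = Counter(title for title, _kind in citations)
aliases = find_aliases(sample)
print(citations)
print(title_counts.most_common(1))
print(aliases)
```

Counting the recognized titles with `Counter` gives the kind of citation frequency table on which the bibliometric analysis relies; in a real system the recognized titles and alternate names would first be checked against the title and produce-name dictionaries.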
Keywords/Search Tags:Local chronicle, Local Chronicle: Produce, Knowledge organizing, Content digging, Local chronicle index, Collation of ancient books