Data Classification And Zoning Issues On China's Economy

Posted on: 2004-04-21
Degree: Master
Type: Thesis
Country: China
Candidate: J Hu
Full Text: PDF
GTID: 2206360092486867
Subject: Probability theory and mathematical statistics
Abstract/Summary:
This paper studies data grouping and its practical application. It draws on the strengths and weaknesses of the Akaike Information Criterion (AIC), which is commonly used to select the order of autoregressive models, and of cluster analysis. To address the question of how to determine the number of groups, the paper discusses several methods. It borrows the basic idea of Hierarchical Clustering Methods (HCM), which group objects by comparing distances or similarity coefficients between them, and combines it with the Optimal Split-Plot Designs (OSPD) for ordered samples, thereby uniting the intuitiveness of HCM with the simplicity of OSPD and its ability to find the exact solution.
Using historical data, the paper assumes that data in the same group come from the same distribution, and that the historical data do as well. Each sum of squared deviations is therefore a multiple of a consistent and unbiased estimate of the variance. Hence, if the division is reasonable, the sum of squared deviations obtained from the data to be grouped and the one obtained from the historical data should be very close. On this basis, the paper defines a criterion called grouping error and shows that the best data-grouping method is the one that attains the minimum grouping error. This method remedies the inability of previous methods to determine the number of groups.
Inspired by the idea of AIC, the paper treats the data in each group as samples from a certain distribution and determines the number of groups by seeking an asymptotically unbiased estimate of the Kullback-Leibler information. In addition, motivated by the fact that in practice the sample is not always large but historical data exist, the paper extends AIC to the case where each data group comes from an AR(p) model (p known) and proposes the AIC^R method.
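The grouping-error idea can be illustrated with a small Python sketch. The abstract does not give the thesis's exact definition, so everything here is an assumption made for illustration: ordered data are split into k contiguous groups by dynamic programming so that the total within-group sum of squared deviations is minimal, and the "grouping error" is taken to be the absolute gap between the pooled variance estimate SS/(n - k) and a variance estimated from historical data. The names `optimal_ordered_partition`, `choose_group_count` and `hist_variance` are hypothetical.

```python
from typing import List, Sequence, Tuple


def optimal_ordered_partition(x: Sequence[float], k: int) -> Tuple[float, List[int]]:
    """Split ordered data x into k contiguous groups with minimal total
    within-group sum of squared deviations (SS).

    Returns (minimal SS, start index of each group).
    """
    n = len(x)
    # Prefix sums give the SS of any slice x[i:j] in O(1).
    pre, pre_sq = [0.0], [0.0]
    for v in x:
        pre.append(pre[-1] + v)
        pre_sq.append(pre_sq[-1] + v * v)

    def ss(i: int, j: int) -> float:
        s = pre[j] - pre[i]
        return max(pre_sq[j] - pre_sq[i] - s * s / (j - i), 0.0)

    inf = float("inf")
    # cost[g][j]: minimal SS when the first j points form g groups.
    cost = [[inf] * (n + 1) for _ in range(k + 1)]
    cut = [[0] * (n + 1) for _ in range(k + 1)]
    cost[0][0] = 0.0
    for g in range(1, k + 1):
        for j in range(g, n + 1):
            for i in range(g - 1, j):
                c = cost[g - 1][i] + ss(i, j)
                if c < cost[g][j]:
                    cost[g][j], cut[g][j] = c, i

    # Walk the stored cut points backwards to recover group starts.
    starts, j = [], n
    for g in range(k, 0, -1):
        j = cut[g][j]
        starts.append(j)
    return cost[k][n], starts[::-1]


def choose_group_count(
    x: Sequence[float], hist_variance: float, max_k: int
) -> Tuple[int, List[int], float]:
    """Pick k whose pooled variance SS/(n - k) is closest to hist_variance.

    ASSUMPTION: this absolute gap stands in for the thesis's grouping error.
    """
    n = len(x)
    best = None
    for k in range(1, min(max_k, n - 1) + 1):
        total_ss, starts = optimal_ordered_partition(x, k)
        err = abs(total_ss / (n - k) - hist_variance)
        if best is None or err < best[0]:
            best = (err, k, starts)
    return best[1], best[2], best[0]
```

For example, `choose_group_count([1.0, 1.2, 0.8, 5.0, 5.2, 4.8, 9.0, 9.2, 8.8], 0.04, 5)` selects three groups starting at indices 0, 3 and 6, because the pooled within-group variance for that split matches the reference variance. Fewer groups overshoot the reference and more groups undershoot it, which is how a criterion of this kind can fix the number of groups.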
This paper was conceived against the background of dividing mainland China into economic zones. Using GDP and per capita GDP data, the last part of the thesis applies some of the methods to the zoning problem. After processing and analyzing the data, the zones are determined and the results are presented.
Keywords/Search Tags: the number of groups, grouping error, AIC_g, AIC_AR, economic zone planning